Summary
Multi-Instance GPU (MIG) is an NVIDIA technology that slices one physical GPU into several isolated hardware partitions. It matters because it lets a single expensive accelerator run many small workloads at the same time without them interfering with one another.
Multi-instance GPU partitioning is a building block whose design affects model quality, infrastructure cost, latency, and reliability at scale.
Deep dive
Introduced with the NVIDIA A100 (Ampere) and carried forward in the H100 and newer GPUs, MIG splits a GPU into up to seven independent instances. MIG differs from time-slicing software because it provides true hardware isolation: each instance has its own streaming multiprocessors (SMs), its own slice of L2 cache, and its own slice of high-bandwidth memory. A 40GB A100 can be split into seven 5GB instances, or into fewer, larger ones. Each partition behaves like a smaller GPU, so a noisy or crashing workload in one instance cannot starve or corrupt another. This guaranteed quality of service is what makes MIG a good fit for inference serving, multi-tenant clusters, and development environments where many users share one card.
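The slice budget described above can be sketched as a simple feasibility check. This is a minimal sketch, assuming an A100 40GB exposes 7 compute slices and 8 memory slices of 5GB each, with the per-profile slice counts shown in the table. `PROFILES` and `fits_on_gpu` are illustrative names, not part of any NVIDIA API. The check is necessary but not sufficient, because real hardware also restricts where each instance may be placed.

```python
# Assumed slice table for an A100 40GB (illustrative, see the note above).
# profile name -> (compute slices, memory slices of 5GB each)
PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}

TOTAL_COMPUTE_SLICES = 7
TOTAL_MEMORY_SLICES = 8


def fits_on_gpu(requested):
    """Return True if the requested profiles stay within the slice budget.

    Only totals are checked; placement constraints are not modeled.
    """
    compute = sum(PROFILES[name][0] for name in requested)
    memory = sum(PROFILES[name][1] for name in requested)
    return compute <= TOTAL_COMPUTE_SLICES and memory <= TOTAL_MEMORY_SLICES


if __name__ == "__main__":
    print(fits_on_gpu(["1g.5gb"] * 7))          # seven small instances
    print(fits_on_gpu(["3g.20gb", "4g.20gb"]))  # two larger instances
    print(fits_on_gpu(["4g.20gb", "4g.20gb"]))  # needs 8 compute slices
```

A check like this is useful as a fast pre-filter in a scheduler or capacity planner before asking the driver to create the instances.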
Technical insight
MIG works by physically gating the crossbar inside the GPU so that each instance has its own path to its memory slice and SMs. NVIDIA names profiles as fractions of the device, from 1g.5gb (one compute slice, 5GB of memory) up to 7g.40gb. A GPU Instance owns memory and SMs; inside it, Compute Instances subdivide the SMs further. Because the partitions are enforced in hardware, faults and ECC errors stay contained within one instance, and one instance's memory traffic does not consume another's bandwidth.
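The profile naming scheme can be made concrete with a small parser. This is a minimal sketch, assuming names follow the plain `<N>g.<M>gb` pattern used in the text; `parse_mig_profile` is a hypothetical helper, and variant names with extra suffixes are deliberately rejected.

```python
import re

# Matches names such as "1g.5gb" or "7g.40gb":
# <compute slices>g.<memory in GB>gb
_PROFILE_RE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_mig_profile(name):
    """Split a MIG profile name into (compute_slices, memory_gb)."""
    match = _PROFILE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized MIG profile name: {name!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for profile in ("1g.5gb", "7g.40gb"):
        compute_slices, memory_gb = parse_mig_profile(profile)
        print(profile, "->", compute_slices, "compute slice(s),", memory_gb, "GB")
```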
Understanding multi-instance GPU partitioning
To build deep understanding, treat multi-instance GPU partitioning as a discipline rather than a single capability: define the outcomes you want, state your assumptions, and separate what the system can do reliably from what still needs expert judgment.
In practice, strong teams that use multi-instance GPU partitioning tune their architecture, data, and infrastructure choices for reliability and cost. They write clear success criteria, measure them against real data and real workloads, and iterate on observable failure modes rather than one-off benchmark wins. This is where theoretical knowledge turns into durable capability in products, policies, and operations.
Architectural decisions carry benefits and operating costs for years afterward. At the same time, optimizing for a single benchmark can hide broader weaknesses in the system. The most robust approach combines fast experimentation with governance discipline: run pilots, capture evidence, document decisions, and keep updating safeguards as usage, user expectations, and regulations evolve.
Strategic implications
Architectural decisions carry benefits and operating costs for years afterward.
Technical literacy helps teams choose what is best rather than settling for whatever is newest.
Better engineering choices reduce reliability problems in operation.
In high-quality deployments, these points translate into measurable operating rules, clear ownership, and frequent review checkpoints, so that teams build confidence instead of adding complexity.
Real-world applications
A cloud provider splits one A100 into seven instances so that seven customers each get a guaranteed, isolated GPU for inference.
A university research group gives each PhD student a 10GB MIG instance for prototyping instead of letting one user monopolize a whole card.
An inference service co-locates several small language and vision models on one H100, each in its own partition with predictable latency.
A Kubernetes cluster advertises MIG instances as schedulable resources, so pods request 'nvidia.com/mig-1g.5gb' like any other resource.
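The Kubernetes case can be illustrated by generating a pod manifest that requests one MIG instance. This is a minimal sketch, assuming the cluster already exposes the 'nvidia.com/mig-1g.5gb' resource named in the text; the helper `mig_pod_manifest`, the pod name, and the image name are placeholders invented for illustration.

```python
import json


def mig_pod_manifest(name, image, mig_resource="nvidia.com/mig-1g.5gb", count=1):
    """Build a Pod manifest that requests `count` MIG instances of one profile."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under `limits`.
                    "resources": {"limits": {mig_resource: count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # "example.com/inference:latest" is a placeholder image name.
    manifest = mig_pod_manifest("mig-demo", "example.com/inference:latest")
    print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied like any other manifest; whether the resource name is actually available depends on how the cluster's GPU device plugin is configured.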
Usage patterns
Multi-instance GPU partitioning in practice
The four scenarios above (cloud tenancy, shared research hardware, co-located inference models, and Kubernetes scheduling) follow the same operating pattern. Teams typically get better results when they set clear thresholds up front, keep a human escalation path for incidents, and track product outcomes and the cost of errors over time.
Risks and guardrails
Optimizing for a single benchmark can hide broader weaknesses in the system.
Infrastructure and maintenance costs are often underestimated.
As systems become harder to understand, security and observability problems can multiply.
Implementation roadmap
Define latency, quality, and cost targets before deployment.
Benchmark with realistic workloads and real data.
Instrument monitoring for errors, drift, and user impact.
Prepare rollback and incident-response paths before scaling.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
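The evidence-gate idea in the roadmap can be sketched as a small check that compares measurements against the targets set in the first step. This is a minimal sketch: the `Targets` and `Measured` classes, the field names, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Targets:
    max_p95_latency_ms: float
    min_quality_score: float
    max_cost_per_1k_requests: float


@dataclass
class Measured:
    p95_latency_ms: float
    quality_score: float
    cost_per_1k_requests: float


def gate_failures(targets, measured):
    """Return the names of criteria that are not met (empty means proceed)."""
    failures = []
    if measured.p95_latency_ms > targets.max_p95_latency_ms:
        failures.append("latency")
    if measured.quality_score < targets.min_quality_score:
        failures.append("quality")
    if measured.cost_per_1k_requests > targets.max_cost_per_1k_requests:
        failures.append("cost")
    return failures


if __name__ == "__main__":
    # Illustrative numbers only.
    targets = Targets(max_p95_latency_ms=50.0, min_quality_score=0.9,
                      max_cost_per_1k_requests=0.2)
    measured = Measured(p95_latency_ms=62.0, quality_score=0.93,
                        cost_per_1k_requests=0.15)
    failures = gate_failures(targets, measured)
    print("pause rollout" if failures else "expand rollout", failures)
```

Returning the list of failed criteria, rather than a bare boolean, keeps the reason for a paused rollout visible in logs and reviews.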