Technical Guide

Multi-Instance GPU Partitioning

Multi-Instance GPU (MIG) is an NVIDIA technology that slices a single physical GPU into multiple hardware-isolated partitions.

Overview

MIG matters because it lets one expensive accelerator run many small workloads at the same time without them interfering with one another.

Multi-Instance GPU Partitioning is a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Introduced with the NVIDIA A100 (Ampere) and continued on the H100 and newer data-center GPUs, MIG carves a GPU into up to seven independent instances. Unlike software time-slicing, MIG provides true hardware isolation: each instance gets its own dedicated streaming multiprocessors (SMs), L2 cache slices, memory controllers, and a fixed slice of high-bandwidth memory. A 40GB A100 can be split into seven 5GB instances, or into fewer, larger ones. Each partition behaves like a small independent GPU, so a noisy or crashing job in one instance cannot starve or corrupt another. This guaranteed quality of service makes MIG well suited to inference serving, multi-tenant clusters, and development environments where many users share one card.
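The slice arithmetic above can be made concrete with a short sketch. It assumes the A100-40GB profile table (seven compute slices and eight 5GB memory slices per GPU) and checks only whether a requested mix of profiles fits within those totals. Real MIG placement has additional position rules that this check ignores, and the names `A100_40GB_PROFILES` and `fits_on_gpu` are illustrative, not part of any NVIDIA API.

```python
# Simplified slice budget for one A100-40GB in MIG mode.
# Assumption: 7 compute slices and 8 memory slices (5GB each) per GPU.
COMPUTE_SLICES = 7
MEMORY_SLICES = 8

# Profile name -> (compute slices, memory slices).
A100_40GB_PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}


def fits_on_gpu(requested):
    """Return True if the requested profiles fit within the slice totals.

    This only sums slices; it does not model MIG placement rules.
    """
    compute = sum(A100_40GB_PROFILES[name][0] for name in requested)
    memory = sum(A100_40GB_PROFILES[name][1] for name in requested)
    return compute <= COMPUTE_SLICES and memory <= MEMORY_SLICES


if __name__ == "__main__":
    print(fits_on_gpu(["1g.5gb"] * 7))          # seven small instances
    print(fits_on_gpu(["3g.20gb", "3g.20gb"]))  # two medium instances
    print(fits_on_gpu(["4g.20gb", "4g.20gb"]))  # needs 8 compute slices
```

A check like this is only a first filter before asking the driver to create the instances; the driver remains the authority on which combinations are valid.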

Technical Insight

MIG works by gating the GPU's internal crossbar so that each instance has a direct path to its own slice of memory and SMs. NVIDIA names profiles as fractions of the GPU, from 1g.5gb (one compute slice, 5GB) up to 7g.40gb. A GPU Instance holds memory and SMs; within it, a Compute Instance further subdivides the SMs. Because the partitioning is enforced in hardware, faults, ECC errors, and memory-bandwidth contention stay confined to a single instance.
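To make the profile naming concrete, here is a minimal sketch that parses a profile name into its compute-slice count and memory size, and assembles (without running) the nvidia-smi command lines commonly used to enable MIG mode and create instances. The helper names `parse_profile` and `build_mig_commands` are hypothetical. Whether `-cgi` accepts profile names rather than numeric profile IDs depends on the driver version, so the generated commands should be checked against the output of `nvidia-smi mig -lgip` on the target machine.

```python
import re

# Matches basic names such as "1g.5gb" or "7g.40gb".
_PROFILE_RE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_profile(name):
    """Split a name such as '1g.5gb' into (compute_slices, memory_gb)."""
    match = _PROFILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a MIG profile name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def build_mig_commands(gpu_index, profiles):
    """Return argv lists for a typical MIG setup; nothing is executed."""
    for name in profiles:
        parse_profile(name)  # validate each name before building commands
    gpu = str(gpu_index)
    return [
        # Enable MIG mode on the selected GPU.
        ["nvidia-smi", "-i", gpu, "-mig", "1"],
        # Create GPU instances plus default compute instances (-C).
        ["nvidia-smi", "mig", "-i", gpu, "-cgi", ",".join(profiles), "-C"],
        # List the resulting GPU instances.
        ["nvidia-smi", "mig", "-i", gpu, "-lgi"],
    ]


if __name__ == "__main__":
    print(parse_profile("1g.5gb"))
    for argv in build_mig_commands(0, ["3g.20gb", "2g.10gb", "1g.5gb"]):
        print(" ".join(argv))
```

Returning argument lists instead of running them keeps the sketch safe to try on a machine without a GPU; in practice these commands need administrator privileges and an idle GPU.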

Mastering Multi-Instance GPU Partitioning

To build a deep understanding, treat Multi-Instance GPU Partitioning as an operating model rather than a single feature: define the outcomes you want, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams using Multi-Instance GPU Partitioning weigh architecture, data, and infrastructure choices against reliability and cost. They write explicit success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Architecture decisions drive performance and operating costs for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. The strongest approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and update guardrails continuously as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Architecture decisions drive performance and operating costs for years.

In high-quality deployments, this translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability incidents in production.

The Future of Multi-Instance GPU Partitioning

As GPUs grow to 80GB, 141GB, and beyond, partitioning becomes more attractive because individual models rarely need an entire card for inference. Expect tighter Kubernetes and cloud integration, dynamic repartitioning without draining the node, and finer-grained profiles. Competing vendors are pursuing comparable SR-IOV-style GPU virtualization, and serverless platforms increasingly rely on partitioning to pack many models densely and cut idle waste.

Real-World Applications

A cloud provider splits a single A100 into seven instances so that seven customers each get a guaranteed, isolated GPU slice for inference.

A university research cluster gives each PhD student a 10GB MIG instance for prototyping instead of tying up whole cards.

An inference service packs several small language and vision models onto a single H100, each in its own partition with predictable latency.

A Kubernetes cluster advertises MIG instances as schedulable resources, so pods request 'nvidia.com/mig-1g.5gb' like any other resource.
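As a minimal sketch of the Kubernetes case, the snippet below builds a pod manifest that requests one nvidia.com/mig-1g.5gb resource and prints it as JSON, which kubectl accepts alongside YAML. It assumes the cluster's NVIDIA device plugin is configured to expose each MIG profile under its own resource name. The function name `mig_pod_manifest`, the pod name, the container name, and the image are placeholders chosen for illustration.

```python
import json


def mig_pod_manifest(pod_name, image,
                     mig_resource="nvidia.com/mig-1g.5gb", count=1):
    """Build a pod manifest (as a dict) that requests MIG instances.

    Extended resources such as MIG slices are requested under `limits`.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "worker",  # placeholder container name
                    "image": image,
                    "resources": {"limits": {mig_resource: count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference; replace with a real one.
    manifest = mig_pod_manifest("mig-demo", "example.com/inference:latest")
    print(json.dumps(manifest, indent=2))
```

Because the MIG slice is just another named resource, the scheduler places the pod only on nodes that advertise a free instance of that profile.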

Implementation Patterns

Multi-Instance GPU Partitioning in practice

Across the deployments listed under Real-World Applications, teams typically get better results when they define quality thresholds up front, maintain a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems become more complex.

Getting Started Guide

1. Define latency, quality, and cost targets before rollout.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident procedures before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
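The evidence-gate idea can be sketched as a small check that compares measured values with the targets defined in step 1. The metric names, the thresholds, and the `Targets` and `gate` helpers are illustrative assumptions, not a prescribed set; a real rollout would draw these from its own monitoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    """Hypothetical rollout targets agreed before deployment."""
    max_p99_latency_ms: float
    max_error_rate: float
    max_cost_per_1k_requests: float


def gate(targets, measured):
    """Return the names of unmet criteria; an empty list means proceed."""
    failures = []
    if measured["p99_latency_ms"] > targets.max_p99_latency_ms:
        failures.append("p99_latency_ms")
    if measured["error_rate"] > targets.max_error_rate:
        failures.append("error_rate")
    if measured["cost_per_1k_requests"] > targets.max_cost_per_1k_requests:
        failures.append("cost_per_1k_requests")
    return failures


if __name__ == "__main__":
    targets = Targets(max_p99_latency_ms=200.0,
                      max_error_rate=0.01,
                      max_cost_per_1k_requests=0.50)
    # Example measurements from a benchmark run on one MIG instance.
    measured = {"p99_latency_ms": 240.0,
                "error_rate": 0.004,
                "cost_per_1k_requests": 0.42}
    unmet = gate(targets, measured)
    print("proceed" if not unmet else f"pause rollout, unmet: {unmet}")
```

Returning the list of unmet criteria, rather than a bare boolean, makes it clear which gap has to be closed before usage is expanded.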
