Technical Guide

Multi-Instance GPU Partitioning

Multi-Instance GPU (MIG) is an NVIDIA technology that slices a single physical GPU into multiple isolated hardware partitions.

Overview

MIG matters because it lets one expensive accelerator serve many small workloads at the same time without interference between them.

As an architectural choice, Multi-Instance GPU Partitioning shapes service quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Introduced with the NVIDIA A100 (Ampere) and carried forward on the H100 and newer data-center GPUs, MIG carves a GPU into up to seven independent instances. Unlike software time-slicing, MIG provides true hardware isolation: each instance gets its own dedicated streaming multiprocessors (SMs), L2 cache slices, memory controllers, and a fixed slice of high-bandwidth memory. An A100 with 40GB can be split into seven 5GB instances, or into fewer, larger ones. Each partition behaves like a small standalone GPU, so a noisy or crashing workload in one instance cannot starve or disrupt another. This guaranteed quality of service makes MIG a good fit for inference serving, multi-tenant clusters, and development environments where many users share one card.
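As a rough illustration of the seven-way split described above, the sketch below only builds the `nvidia-smi` command lines an operator might run; it does not execute them. The helper name `mig_setup_commands` is hypothetical, and the flags (`-mig 1`, `mig -cgi ... -C`, `mig -lgi`) reflect the nvidia-smi MIG interface as commonly documented, so they should be checked against `nvidia-smi mig -h` on the target driver.

```python
from typing import List


def mig_setup_commands(gpu_index: int, profiles: List[str]) -> List[List[str]]:
    """Return nvidia-smi invocations that would enable MIG mode on one GPU
    and create one GPU instance (with a default compute instance) per profile.

    Hypothetical helper: it only builds argument lists and runs nothing.
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    gpu = str(gpu_index)
    return [
        # Enable MIG mode on the selected GPU (may require a GPU reset).
        ["nvidia-smi", "-i", gpu, "-mig", "1"],
        # Create the GPU instances; -C also creates a compute instance in each.
        ["nvidia-smi", "mig", "-i", gpu, "-cgi", ",".join(profiles), "-C"],
        # List the resulting GPU instances so the layout can be verified.
        ["nvidia-smi", "mig", "-i", gpu, "-lgi"],
    ]


# Seven 5GB instances on GPU 0, as in the A100 40GB example.
commands = mig_setup_commands(0, ["1g.5gb"] * 7)
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as data rather than running them makes the layout easy to review or log before anything touches the GPU; these commands normally need administrator privileges.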

Technical Insight

MIG works by partitioning the GPU's internal crossbar so that each instance has a fixed path to its own memory slice and SMs. NVIDIA defines profiles as fractions of the card, ranging from 1g.5gb (one compute slice, 5GB) up to 7g.40gb. A GPU Instance reserves memory and SMs; inside it, a Compute Instance can subdivide the SMs further. Because the partitions are enforced in hardware, faults, ECC errors, and memory-bandwidth contention stay confined to a single instance.
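The profile naming scheme lends itself to a quick capacity check. The sketch below parses names such as `1g.5gb` and tests whether a requested mix stays within an assumed budget of 7 compute slices and 40GB (an A100 40GB). The function names and constants are illustrative, and the check is only a necessary condition: real MIG placement follows stricter rules, so a layout that passes here may still be rejected by the driver.

```python
import re
from typing import Iterable, Tuple

# Assumed budget for an A100 40GB in MIG mode (illustrative constants).
COMPUTE_SLICES = 7
MEMORY_GB = 40

_PROFILE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_profile(name: str) -> Tuple[int, int]:
    """Split a profile name such as '2g.10gb' into (compute_slices, memory_gb)."""
    match = _PROFILE.match(name)
    if match is None:
        raise ValueError(f"not a MIG profile name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def fits(profiles: Iterable[str]) -> bool:
    """Check that total compute slices and memory stay within the card budget.

    This is a necessary condition only; it ignores placement constraints.
    """
    compute = 0
    memory = 0
    for name in profiles:
        slices, gigabytes = parse_profile(name)
        compute += slices
        memory += gigabytes
    return compute <= COMPUTE_SLICES and memory <= MEMORY_GB


seven_small = fits(["1g.5gb"] * 7)                # seven 5GB instances
mixed = fits(["2g.10gb"] * 3 + ["1g.5gb"])        # three 10GB plus one 5GB
too_many = fits(["2g.10gb"] * 4)                  # would need 8 compute slices
print(seven_small, mixed, too_many)
```

Under these assumptions the first two layouts pass and the third fails on compute slices rather than memory, which illustrates why compute and memory have to be budgeted separately.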

Mastering Multi-Instance GPU Partitioning

To build a deeper understanding, treat Multi-Instance GPU Partitioning as a usage pattern rather than a single feature: define the requirements, make assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams use Multi-Instance GPU Partitioning to weigh infrastructure, data, and architecture choices against reliability and cost. They write down success criteria explicitly, test against real data and real workloads, and iterate on observed failure modes rather than on a one-time benchmark win. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Infrastructure decisions drive performance and operating cost for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. A sound approach combines rapid experimentation with governance: run pilots, collect evidence, publish decision logs, and keep improving safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Infrastructure decisions drive performance and operating cost for years.

Technical education helps teams choose the right stack, not just the newest one.

Better engineering decisions reduce reliability incidents in production.

In high-quality deployments, each of these points translates into measurable operating policies, clear ownership boundaries, and repeatable review practices, so that teams scale trust rather than ambiguity.

The Future of Multi-Instance GPU Partitioning

As GPUs grow to 80GB, 141GB, and beyond, partitioning becomes more attractive because individual models rarely need a whole card for inference. Expect tighter Kubernetes and cloud integration, dynamic repartitioning without draining nodes, and finer-grained profiles. Competing vendors are pursuing similar SR-IOV-style GPU virtualization, and serverless inference platforms increasingly rely on partitioning to pack many models onto a card and cut idle waste.

Real-World Implementation

A cloud provider splits one A100 into seven instances so that seven customers each get a guaranteed, isolated GPU slice for inference.

A university research lab gives each PhD student a 10GB MIG instance for prototyping instead of dedicating whole cards.

An inference service hosts several small language and vision models on one H100, each in its own partition with predictable latency.

A Kubernetes cluster advertises MIG instances as schedulable resources, so pods request 'nvidia.com/mig-1g.5gb' like any other resource.
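To make the Kubernetes case concrete, the sketch below builds a minimal Pod manifest that requests one 'nvidia.com/mig-1g.5gb' instance and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The function name, pod name, and image are placeholders, and the example assumes the cluster's NVIDIA device plugin is configured to expose per-profile MIG resources under that name.

```python
import json


def mig_pod_manifest(name: str, image: str,
                     mig_resource: str = "nvidia.com/mig-1g.5gb") -> dict:
    """Build a minimal Pod manifest whose container requests one MIG instance.

    Hypothetical helper; the name and image passed in are placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested in whole units under limits.
                    "resources": {"limits": {mig_resource: 1}},
                }
            ],
        },
    }


manifest = mig_pod_manifest("mig-demo", "example.com/inference:latest")
print(json.dumps(manifest, indent=2))
```

From the scheduler's point of view the MIG slice is just another countable resource, so the pod is only placed on a node that still has a free instance of that profile.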

Practice Patterns

Multi-Instance GPU Partitioning in practice

Across the four deployments listed above, teams typically get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both realized product gains and cost impact over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems become more complex.

Implementation Roadmap

1. Define latency, quality, and cost targets before implementation.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user behavior.

4. Prepare rollback and incident-response paths before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
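The evidence-gate idea can be sketched as a small check that compares measured values against predefined targets and reports which ones block the rollout. Everything here is illustrative: the `Target` class, the metric names, and the thresholds are assumptions, not values from this guide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Target:
    metric: str
    limit: float
    higher_is_better: bool = False  # default: the value must stay at or below the limit


def failed_gates(targets: List[Target], measured: Dict[str, float]) -> List[str]:
    """Return the metrics that miss their target.

    A metric with no measurement is treated as a failure, since the gate
    requires evidence rather than absence of bad news.
    """
    failures = []
    for target in targets:
        value = measured.get(target.metric)
        if value is None:
            failures.append(target.metric)
        elif target.higher_is_better and value < target.limit:
            failures.append(target.metric)
        elif not target.higher_is_better and value > target.limit:
            failures.append(target.metric)
    return failures


# Hypothetical targets and measurements for one MIG-backed inference service.
targets = [
    Target("p99_latency_ms", 50.0),
    Target("error_rate", 0.01),
    Target("gpu_utilization", 0.60, higher_is_better=True),
]
measured = {"p99_latency_ms": 42.0, "error_rate": 0.02, "gpu_utilization": 0.71}

blocked = failed_gates(targets, measured)
print("pause rollout:" if blocked else "expand usage", blocked)
```

With these sample numbers only the error rate misses its target, so the rollout would pause until that gap is closed.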
