Technical GUIDE

DeepSpeed and Megatron Training Stacks

DeepSpeed (Microsoft) and Megatron-LM (NVIDIA) are the software stacks that make it feasible to train models with billions of parameters across thousands of GPUs.

Overview

Without these two stacks, modern models would neither fit in memory nor finish training in a reasonable amount of time.

Choosing and configuring a DeepSpeed or Megatron training stack is an architectural decision that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Training a large model on a single GPU is not feasible because the weights, gradients, and optimizer state do not fit in its memory. These stacks split the work across many GPUs. Megatron-LM pioneered tensor parallelism, which slices the individual matrices inside each layer across GPUs, together with pipeline parallelism, which places different layers on different GPUs. DeepSpeed's signature contribution is ZeRO (Zero Redundancy Optimizer), which shards optimizer state, gradients, and parameters across GPUs instead of replicating them, cutting per-GPU memory dramatically. The two are often combined (Megatron-DeepSpeed) to train models such as BLOOM-176B and Megatron-Turing NLG. They also add mixed precision, activation checkpointing, and offloading to CPU or NVMe so that larger models can be trained on less hardware.
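To make the idea of slicing a layer's matrix across GPUs concrete, here is a minimal sketch of column-wise tensor parallelism. It is a toy under stated assumptions: plain Python lists stand in for GPU tensors, the "devices" are just list slices, and the helper names (`split_columns`, `tensor_parallel_matmul`) are hypothetical and not part of Megatron-LM.

```python
# Toy illustration of column-wise tensor parallelism.
# Plain Python lists stand in for GPU tensors; all names are hypothetical.

def matmul(x, w):
    """Multiply x (m x k) by w (k x n), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_shards):
    """Split w column-wise into num_shards equal blocks, one per simulated device."""
    n = len(w[0])
    assert n % num_shards == 0, "columns must divide evenly across shards"
    width = n // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]

def tensor_parallel_matmul(x, w, num_shards):
    """Each shard multiplies x by its own column block; outputs are concatenated."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenating along columns plays the role of an all-gather across devices.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1, 2], [3, 4]]              # activations: 2 tokens, hidden size 2
w = [[1, 0, 2, 1], [0, 1, 1, 3]]  # weight matrix: 2 x 4

full = matmul(x, w)                                   # single-device result
sharded = tensor_parallel_matmul(x, w, num_shards=2)  # two simulated devices
print(full == sharded)
```

Each simulated device only ever holds half of the weight matrix, yet the concatenated result matches the single-device product. In a real stack the concatenation step is a collective communication operation, which is where the GPU-to-GPU traffic comes from.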

Technical Insight

ZeRO has three stages of increasing memory savings: Stage 1 shards the optimizer state, Stage 2 also shards the gradients, and Stage 3 shards the parameters themselves, gathering them on demand during the forward and backward passes. Combined with tensor parallelism (intra-layer) and pipeline parallelism (inter-layer), this forms "3D parallelism." The key challenge is communication overhead: every additional sharding dimension adds GPU-to-GPU traffic, so engineers lay out the partitions to keep the fast NVLink and InfiniBand links fully utilized.
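The effect of the three stages can be estimated with simple arithmetic. The sketch below assumes mixed-precision training with Adam, using the common accounting of 2 bytes per parameter for half-precision weights, 2 bytes for half-precision gradients, and 12 bytes for optimizer state (a full-precision weight copy plus two Adam moments). It ignores activations, buffers, and fragmentation, so treat the numbers as rough lower bounds; the function name is hypothetical.

```python
# Rough per-GPU memory estimate for model state under ZeRO stages 0-3.
# Assumes mixed-precision Adam: 2 + 2 + 12 bytes per parameter.
# Activations and temporary buffers are deliberately ignored.

def zero_bytes_per_gpu(num_params, num_gpus, stage):
    """Return estimated model-state bytes per GPU for a given ZeRO stage."""
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    params = 2 * num_params   # half-precision weights
    grads = 2 * num_params    # half-precision gradients
    optim = 12 * num_params   # fp32 weights + Adam momentum + Adam variance
    if stage >= 1:
        optim /= num_gpus     # Stage 1: shard optimizer state
    if stage >= 2:
        grads /= num_gpus     # Stage 2: also shard gradients
    if stage >= 3:
        params /= num_gpus    # Stage 3: also shard parameters
    return params + grads + optim

NUM_PARAMS = 7_500_000_000  # example model size (assumption)
NUM_GPUS = 64               # example data-parallel group size (assumption)

for stage in range(4):
    gigabytes = zero_bytes_per_gpu(NUM_PARAMS, NUM_GPUS, stage) / 1e9
    print(f"ZeRO stage {stage}: {gigabytes:.1f} GB per GPU")
```

For this example configuration the estimate drops from roughly 120 GB per GPU with plain data parallelism to about 31.4 GB, 16.6 GB, and 1.9 GB at Stages 1, 2, and 3. The price of the later stages is extra communication, since sharded parameters must be gathered before they can be used.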

Mastering DeepSpeed and Megatron Training Stacks

To build a deep understanding, treat DeepSpeed and Megatron training stacks as a pattern of use rather than a single product: define the requirements, make the assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams using DeepSpeed and Megatron training stacks weigh architecture, data, and infrastructure choices against reliability and cost. They write down explicit success criteria, test against real data and real workflows, and iterate on observed failure modes rather than on one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Infrastructure choices drive performance and cloud spending. At the same time, optimizing for a single benchmark can hide broader system weaknesses. A sound approach pairs fast experimentation with governance: run pilots, collect evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Infrastructure choices drive performance and cloud spending.

Technical education helps teams pick the right stack, not just the newest one.

Better engineering decisions reduce reliability incidents in production.

In high-quality deployments, each of these translates into measurable operating policies, clear ownership boundaries, and repeatable review practices, so that teams scale trust rather than ambiguity.

The Future of DeepSpeed and Megatron Training Stacks

Expect tighter integration with PyTorch's native FSDP (Fully Sharded Data Parallel), which adopted many ZeRO ideas, blurring the line between research stacks and core frameworks. Compiler-driven approaches and automatic parallelism planning aim to remove manual tuning. As training clusters grow toward hundreds of thousands of accelerators, fault tolerance, elastic scaling, and overlapping communication with computation become the dominant engineering constraints, along with support for new hardware such as NVIDIA Blackwell and custom training chips.

Real-World Implementation

Training the open, multilingual BLOOM-176B model with the combined Megatron-DeepSpeed stack across hundreds of GPUs.

Microsoft and NVIDIA training the 530-billion-parameter Megatron-Turing NLG model with 3D parallelism.

ZeRO-Offload letting researchers fine-tune multi-billion-parameter models on a single workstation GPU by moving optimizer state to CPU RAM.

Using activation checkpointing in these stacks to fit longer context windows by recomputing activations instead of storing all of them.
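As an illustration of the ZeRO-Offload case above, the following sketch builds a DeepSpeed-style configuration that shards optimizer state and gradients (Stage 2) and places the optimizer state in CPU memory. The batch size and accumulation values are placeholders chosen for the example, bf16 assumes a GPU that supports it, and the helper name `build_zero_offload_config` is hypothetical. Check key names and defaults against the DeepSpeed configuration documentation for the version in use.

```python
import json

# Sketch of a DeepSpeed-style JSON config for optimizer offload to CPU.
# Values are illustrative placeholders, not tuned recommendations.

def build_zero_offload_config(micro_batch_size=1, grad_accum_steps=16):
    """Return a config dict enabling ZeRO Stage 2 with CPU optimizer offload."""
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        "bf16": {"enabled": True},  # assumes hardware with bf16 support
        "zero_optimization": {
            "stage": 2,  # shard optimizer state and gradients
            "offload_optimizer": {
                "device": "cpu",     # keep optimizer state in CPU RAM
                "pin_memory": True,  # pinned host memory for faster transfers
            },
        },
    }

config = build_zero_offload_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

A small micro-batch with many accumulation steps keeps activation memory low on a single GPU while preserving a reasonable effective batch size. The trade-off is throughput, because optimizer updates now depend on traffic between GPU and host memory.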

Patterns in Practice

DeepSpeed and Megatron Training Stacks in practice

In each of the four cases listed under Real-World Implementation (BLOOM-176B, Megatron-Turing NLG, ZeRO-Offload fine-tuning, and activation checkpointing for long contexts), teams usually get better results when they define quality targets up front, keep a human escalation path for edge cases, and track both productivity gains and the cost of errors over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and tuning costs are often underestimated.

Security and observability gaps can grow as systems become more complex.

Implementation Roadmap

1. Define latency, quality, and cost targets before rollout.

2. Benchmark under realistic load and data conditions.

3. Instrument the system to monitor errors, drift, and user behavior.

4. Prepare rollback and incident procedures before scaling up.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
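The evidence-gate idea can be sketched as a small check that compares measured results against targets agreed before rollout. Everything here is an assumption made for illustration: the metric names, the threshold values, and the `Targets` and `gate_failures` names are hypothetical and would be replaced by whatever a team actually measures.

```python
from dataclasses import dataclass

# Hypothetical evidence gate: compare measured metrics to pre-agreed targets.
# Metric names and thresholds are illustrative assumptions.

@dataclass
class Targets:
    max_p95_latency_ms: float
    min_quality_score: float
    max_cost_per_1k_requests: float

def gate_failures(measured, targets):
    """Return the names of the targets that the measured metrics fail to meet."""
    failures = []
    if measured["p95_latency_ms"] > targets.max_p95_latency_ms:
        failures.append("latency")
    if measured["quality_score"] < targets.min_quality_score:
        failures.append("quality")
    if measured["cost_per_1k_requests"] > targets.max_cost_per_1k_requests:
        failures.append("cost")
    return failures

targets = Targets(max_p95_latency_ms=800.0,
                  min_quality_score=0.85,
                  max_cost_per_1k_requests=2.0)
measured = {"p95_latency_ms": 950.0,
            "quality_score": 0.90,
            "cost_per_1k_requests": 1.5}

failures = gate_failures(measured, targets)
if failures:
    print("Pause rollout; unmet targets:", ", ".join(failures))
else:
    print("Gate passed; rollout may expand.")
```

Returning the list of unmet targets, rather than a single pass or fail flag, makes it clear which gap has to be closed before the rollout continues.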
