Technical Guide

Optimizer State Offloading to CPU and NVMe

A memory-saving technique that keeps the heavy bookkeeping of training (optimizer states, gradients, and sometimes weights) in CPU RAM or on NVMe SSDs instead of in scarce GPU memory.

Overview

Offloading lets practitioners train models far larger than their GPU memory alone would allow.

Optimizer state offloading to CPU and NVMe is an architectural technique that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

When you train a neural network with an optimizer such as Adam, each parameter carries extra baggage: two running statistics (momentum and variance), plus a full-precision copy of the weight, plus its gradient. In mixed-precision training this can add up to roughly sixteen bytes per parameter, dwarfing the two bytes of the weight itself. Offloading moves that burden off the GPU. CPU offloading streams optimizer states into ordinary system RAM over the PCIe bus, while NVMe offloading pushes them all the way down to fast solid-state disks. Popularized by DeepSpeed's ZeRO-Offload and ZeRO-Infinity, the approach trades speed for capacity, letting a single GPU or a small cluster fine-tune models with billions of parameters.
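To make the sixteen-bytes-per-parameter figure concrete, the sketch below adds up the per-parameter state for mixed-precision Adam. The split into half-precision weights and gradients plus three float32 tensors is an assumption chosen to match the total quoted above; activations and temporary buffers are deliberately left out, and the function name is illustrative.

```python
# Assumed per-parameter breakdown for mixed-precision Adam (sums to 16 bytes).
BYTES_PER_PARAM = {
    "fp16_weight": 2,         # half-precision copy used in forward/backward
    "fp16_gradient": 2,       # half-precision gradient (assumption)
    "fp32_master_weight": 4,  # full-precision copy of the weight
    "fp32_momentum": 4,       # Adam first moment
    "fp32_variance": 4,       # Adam second moment
}


def training_state_gib(num_params: int) -> dict:
    """Return the size of each training-state component in GiB."""
    gib = 1024 ** 3
    sizes = {name: num_params * nbytes / gib for name, nbytes in BYTES_PER_PARAM.items()}
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    # 13 billion parameters, matching the fine-tuning scenario in this guide.
    for name, size in training_state_gib(13_000_000_000).items():
        print(f"{name:>20}: {size:8.1f} GiB")
```

Only the first entry is the model weight itself; everything else is the bookkeeping that offloading tries to move out of GPU memory.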

Technical Insight

The key is to overlap data movement with computation. Optimizer states live in CPU RAM or on NVMe; during the backward pass, chunks are prefetched over PCIe before they are needed, and the optimizer step itself often runs on the CPU. ZeRO-Offload keeps the float32 master weights and Adam moments on the CPU, so only the forward and backward math stays on the GPU. NVMe adds a tiered cache, so terabyte-scale states spill to disk while hot chunks stay in RAM.
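The overlap idea can be illustrated with a toy double-buffering loop: while one chunk is being processed, the next is already being fetched in the background. This is only a sketch under simplifying assumptions. The `fetch_chunk` function is a hypothetical stand-in for a PCIe or NVMe read, and a Python thread is used purely for illustration; real frameworks rely on pinned memory, asynchronous copies, and asynchronous disk I/O instead.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(store, index):
    """Stand-in for a PCIe or NVMe read; here it only copies a list."""
    return list(store[index])


def run_with_prefetch(store, compute):
    """Apply `compute` to every chunk while the next chunk loads in the background."""
    results = []
    if not store:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_chunk, store, 0)  # start loading the first chunk
        for index in range(len(store)):
            chunk = pending.result()  # wait until the current chunk has arrived
            if index + 1 < len(store):
                # Kick off the next transfer before computing on this chunk.
                pending = pool.submit(fetch_chunk, store, index + 1)
            results.append(compute(chunk))
    return results


if __name__ == "__main__":
    offloaded_state = [[1.0, 2.0], [3.0, 4.0], [5.0]]  # hypothetical state chunks
    print(run_with_prefetch(offloaded_state, sum))
```

The ordering is the point: the request for chunk `index + 1` is issued before `compute` runs on chunk `index`, so transfer time can hide behind compute time whenever the two are comparable.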

Mastering Optimizer State Offloading to CPU and NVMe

To build a deeper understanding, treat optimizer state offloading as a pattern of use rather than a single feature: define the outcomes you need, make your assumptions explicit, and separate what can be automated reliably from what still requires expert judgment.

In practice, strong teams use optimizer state offloading to weigh infrastructure, data, and architecture decisions against reliability and cost. They write down explicit success criteria, test against real data and real workflows, and iterate on observed failure modes rather than on one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Architecture decisions drive performance and cloud cost. At the same time, optimizing for a single benchmark can hide broader system weaknesses. A stable approach combines fast experimentation with governance: run pilots, collect evidence, publish decision logs, and keep improving safeguards as model behavior, user expectations, and regulatory requirements evolve.

Strategic Impact

Architecture decisions drive performance and cloud cost.

Technical education helps teams choose the right stack, not merely the newest one.

Better engineering decisions reduce reliability incidents in production.

In production-grade deployments, these points translate into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale trust rather than ambiguity.

The Future of Optimizer State Offloading to CPU and NVMe

As models keep outgrowing GPU memory, tiered offloading is becoming standard practice rather than an exotic option. Expect tighter integration with fast interconnects such as NVLink-C2C and with CXL memory pools that blur the CPU-GPU boundary, along with smarter schedulers that predict which chunks to prefetch. Unified-memory architectures such as Grace Hopper reduce the PCIe penalty, and frameworks are moving toward making multi-tier offloading nearly transparent, so that hobbyists can fine-tune models on modest hardware.
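The interconnect penalty mentioned above can be estimated with back-of-the-envelope arithmetic. The sketch below assumes that each step moves about four bytes per parameter across the link (half-precision gradients to the CPU and updated half-precision weights back); that figure, the function name, and the bandwidth values in the usage lines are illustrative assumptions, not measurements. Look up the sustained bandwidth of your own hardware before drawing conclusions.

```python
def transfer_seconds_per_step(num_params, bandwidth_gb_per_s, bytes_moved_per_param=4):
    """Estimate the time spent moving data across the CPU-GPU link in one step.

    bytes_moved_per_param defaults to 4: 2 bytes of gradient down plus
    2 bytes of updated weight back up (an assumption, see the lead-in).
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth_gb_per_s must be positive")
    total_bytes = num_params * bytes_moved_per_param
    return total_bytes / (bandwidth_gb_per_s * 1e9)


if __name__ == "__main__":
    params = 13_000_000_000
    # Hypothetical sustained bandwidths in GB/s; replace with measured values.
    for label, bandwidth in [("slower link", 25), ("faster link", 400)]:
        seconds = transfer_seconds_per_step(params, bandwidth)
        print(f"{label}: {seconds:.2f} s of transfer per step")
```

If the estimate is small compared with the compute time of a step, prefetching can hide most of it; if it is larger, the link becomes the bottleneck and a faster interconnect pays off directly.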

Real-World Implementation

Fine-tuning a 13-billion-parameter LLM on a single 24 GB consumer GPU, using DeepSpeed ZeRO-Offload to push Adam states to CPU RAM.

A small research lab training a multi-billion-parameter model on a handful of GPUs by spilling optimizer states to NVMe drives with ZeRO-Infinity.

Hugging Face Accelerate configurations that enable CPU offloading, so users can run full fine-tuning jobs that would otherwise throw out-of-memory errors.

A cost-conscious startup renting cheaper, lower-memory cloud GPUs and offloading to attached NVMe instead of paying for top-tier 80 GB cards.
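For the first two scenarios above, the offload target is usually chosen in the DeepSpeed configuration. The sketch below builds a minimal config fragment as a Python dictionary. The key names (`zero_optimization`, `offload_optimizer`, `device`, `nvme_path`, `pin_memory`) follow DeepSpeed's documented JSON schema as understood here; the helper function, the choice of ZeRO stage 2 for CPU and stage 3 for NVMe, and the `/local_nvme` path are illustrative assumptions. Check the documentation for your DeepSpeed version, since a real run needs further settings.

```python
import json


def zero_offload_config(device="cpu", nvme_path=None, micro_batch_size=1):
    """Build a minimal DeepSpeed-style config fragment for optimizer offloading."""
    if device not in ("cpu", "nvme"):
        raise ValueError("device must be 'cpu' or 'nvme'")
    if device == "nvme" and not nvme_path:
        raise ValueError("nvme_path is required when device is 'nvme'")

    offload = {"device": device, "pin_memory": True}
    if device == "nvme":
        offload["nvme_path"] = nvme_path  # directory on a fast local SSD

    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "fp16": {"enabled": True},
        "zero_optimization": {
            # Assumption: stage 3 is used here whenever NVMe is the target.
            "stage": 3 if device == "nvme" else 2,
            "offload_optimizer": offload,
        },
    }


if __name__ == "__main__":
    print(json.dumps(zero_offload_config("cpu"), indent=2))
    print(json.dumps(zero_offload_config("nvme", nvme_path="/local_nvme"), indent=2))
```

The printed JSON can be saved to a file and passed to a training script as its DeepSpeed config, after adding the batch-size, optimizer, and scheduler settings the job requires.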

Implementation Practices

Across the scenarios listed under Real-World Implementation, teams usually get better results when they define quality criteria up front, keep a human escalation path for edge cases, and track both product gains and the cost of errors over time.


Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems grow more complex.

Implementation Roadmap

1. Define latency, quality, and cost targets before implementation.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user behavior.

4. Prepare rollback and incident-response paths before scaling up.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
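The evidence-gate idea in the roadmap can be written down as a small check that compares measured values against the limits agreed in step 1. This is a minimal sketch: the metric names and thresholds are hypothetical, and every metric is assumed to be "lower is better" with an upper limit.

```python
def evidence_gate(measured, limits):
    """Return a list of gate failures; an empty list means the gate passes."""
    failures = []
    for name, limit in limits.items():
        value = measured.get(name)
        if value is None:
            failures.append(f"{name}: no measurement recorded")
        elif value > limit:
            failures.append(f"{name}: {value} exceeds limit {limit}")
    return failures


if __name__ == "__main__":
    # Hypothetical targets and measurements for an offloaded fine-tuning run.
    limits = {"step_seconds": 3.0, "peak_gpu_gib": 22.0, "cost_per_hour_usd": 1.5}
    measured = {"step_seconds": 4.2, "peak_gpu_gib": 20.5}
    problems = evidence_gate(measured, limits)
    if problems:
        print("Pause the rollout:")
        for problem in problems:
            print(" -", problem)
    else:
        print("Gate passed; safe to expand usage.")
```

A missing measurement counts as a failure on purpose, so a gate cannot pass simply because a metric was never collected.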
