Technical Guide

Data Parallelism

Data parallelism trains a single model faster by replicating it across multiple GPUs, with each GPU processing a different slice of the data batch.

Overview

Data parallelism is the workhorse approach that lets teams scale training to dozens or even thousands of accelerators.

As a technical building block, it affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

In data parallelism, every GPU holds an identical copy of the model weights but processes a different mini-batch of training examples. Each device runs the forward and backward pass independently and produces its own set of gradients. Before the weights are updated, the gradients are averaged across all GPUs with an all-reduce communication operation, so every copy stays synchronized and behaves as if it had been trained on one large combined batch. This scales throughput effectively: 8 GPUs can chew through roughly 8 times as much data per step. The catch is that each GPU must fit the entire model, its gradients, and the optimizer state in memory, so plain data parallelism does not help when the model is too large for a single device.
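The claim that averaged per-GPU gradients behave like one large-batch gradient can be checked on a toy problem. The sketch below is framework-free and purely illustrative: the one-parameter model, the helper names `grad_mse` and `data_parallel_grad`, and the toy data are all assumptions made for this example, and the equivalence relies on every shard having the same size.

```python
# Minimal sketch of one synchronous data-parallel step, without any framework.
# Model: y_hat = w * x with mean-squared-error loss (chosen only for illustration).

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_grad(w, batch, world_size):
    """Split the batch into equal shards, compute one gradient per replica, average them."""
    assert len(batch) % world_size == 0, "equal shard sizes are assumed"
    shard = len(batch) // world_size
    local_grads = [
        grad_mse(w, batch[r * shard:(r + 1) * shard])  # replica r only sees its shard
        for r in range(world_size)
    ]
    # Stand-in for the all-reduce: every replica would end up with this same average.
    return sum(local_grads) / world_size

batch = [(float(i), 3.0 * i + 1.0) for i in range(1, 9)]  # 8 toy (x, y) examples
w = 0.5
full = grad_mse(w, batch)                              # one big batch on one device
averaged = data_parallel_grad(w, batch, world_size=4)  # 4 replicas, 2 examples each
print(full, averaged)  # the two values agree up to floating-point rounding
```

With unequal shard sizes a plain average of the local gradients would no longer match the full-batch gradient; a weighted average would be needed instead.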

Technical Insight

The core operation is all-reduce, which sums the gradients across all devices and redistributes the result to each of them. Ring all-reduce, used by libraries such as NCCL and Horovod, passes gradient chunks around a logical ring, so the amount of data each GPU sends stays nearly constant as the number of GPUs grows: about 2(N-1)/N times the gradient size for N GPUs. PyTorch's DistributedDataParallel overlaps this communication with the backward pass: it starts synchronizing the gradients of layers that have already finished their backward computation (those nearest the output) while layers nearer the input are still computing, which hides much of the network latency.
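To make the ring pattern concrete, here is a small single-process simulation of ring all-reduce. It is a teaching sketch, not how NCCL or Horovod are implemented: the function name `ring_all_reduce`, the list-based "workers", and the requirement that the vector length divide evenly by the number of workers are all assumptions of this example.

```python
def ring_all_reduce(vectors):
    """Simulate a ring all-reduce (sum) over equal-length lists, one list per worker.

    Returns (result_per_worker, elements_sent_per_worker).
    """
    n = len(vectors)
    length = len(vectors[0])
    assert length % n == 0, "vector length must split evenly into n chunks (assumed)"
    size = length // n
    # data[r][c] is worker r's current copy of chunk c.
    data = [[list(v[c * size:(c + 1) * size]) for c in range(n)] for v in vectors]
    sent = 0  # elements each worker has sent so far

    # Phase 1, reduce-scatter: after n-1 steps worker r holds the full sum of chunk (r+1) % n.
    for step in range(n - 1):
        # Build all messages first so that the sends are effectively simultaneous.
        msgs = [(r, (r - step) % n, list(data[r][(r - step) % n])) for r in range(n)]
        for r, c, payload in msgs:
            dst = (r + 1) % n
            data[dst][c] = [a + b for a, b in zip(data[dst][c], payload)]
        sent += size

    # Phase 2, all-gather: completed chunks travel around the ring and overwrite stale copies.
    for step in range(n - 1):
        msgs = [(r, (r + 1 - step) % n, list(data[r][(r + 1 - step) % n])) for r in range(n)]
        for r, c, payload in msgs:
            data[(r + 1) % n][c] = payload
        sent += size

    result = [[x for chunk in worker for x in chunk] for worker in data]
    return result, sent

# Four workers, each holding an 8-element integer "gradient".
grads = [[10 * r + i for i in range(8)] for r in range(4)]
reduced, sent_per_worker = ring_all_reduce(grads)
expected = [sum(col) for col in zip(*grads)]
```

Each worker sends 2(N-1) chunks of size length/N, which matches the 2(N-1)/N factor: with 4 workers and 8 elements that is 12 elements per worker. Dividing the summed result by the number of workers gives the averaged gradient.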

Mastering Data Parallelism

To build a deep understanding, treat data parallelism as an operating model rather than a single feature: define the desired outcomes, make your assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams that use data parallelism optimize their architecture, data, and infrastructure choices against reliability and cost. They write explicit success criteria, evaluate against realistic data and workflows, and iterate on observed failure patterns instead of one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Architecture decisions drive performance and operating costs for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. The most robust approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Architecture decisions drive performance and operating costs for years.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability incidents in production.

In high-quality deployments, each of these points translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so teams can scale confidence instead of scaling ambiguity.

The Future of Data Parallelism

Pure data parallelism is increasingly combined with sharding and model parallelism into hybrid 'nD parallelism' strategies for trillion-parameter models. Expect smarter gradient compression, asynchronous and overlapped communication, and topology-aware all-reduce that exploits fast NVLink within a node and slower InfiniBand across nodes. As clusters grow, lowering the communication-to-compute ratio remains the central engineering challenge in keeping thousands of GPUs busy.
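One way to picture a topology-aware all-reduce is a two-level scheme: reduce inside each node first, exchange only the per-node sums across nodes, then broadcast the result back inside each node. The sketch below simulates that idea with plain lists. The function name `hierarchical_all_reduce` and the nested-list layout are assumptions for illustration, and real libraries choose their algorithms in ways this toy does not capture.

```python
def hierarchical_all_reduce(grads_by_node):
    """Sum gradients over every GPU in a two-level (node, GPU) layout.

    grads_by_node[n][g] is the gradient list held by GPU g on node n.
    Returns the same nested layout, with every GPU holding the global sum.
    """
    # Step 1: reduce inside each node (this traffic would use the fast intra-node link).
    node_sums = [[sum(vals) for vals in zip(*gpus)] for gpus in grads_by_node]
    # Step 2: all-reduce between one leader per node (the slower inter-node link).
    total = [sum(vals) for vals in zip(*node_sums)]
    # Step 3: broadcast the global sum back to every GPU inside each node.
    return [[list(total) for _ in gpus] for gpus in grads_by_node]

grads = [
    [[1, 2], [3, 4]],  # node 0: two GPUs
    [[5, 6], [7, 8]],  # node 1: two GPUs
]
reduced = hierarchical_all_reduce(grads)
print(reduced[0][0])
```

In this layout only one gradient-sized buffer per node has to take part in the inter-node exchange in step 2, rather than one per GPU, which is the motivation for matching the algorithm to the link speeds.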

Real-World Applications

Training a ResNet image classifier across 8 GPUs on a single server with PyTorch DistributedDataParallel, where each GPU handles 32 images of a 256-image batch.

Scaling BERT pretraining to hundreds of GPUs with Horovod, using ring all-reduce to synchronize gradients at every step.

Fine-tuning a recommendation model on a multi-node cluster where each node processes different shards of user interactions.

Using TensorFlow's MirroredStrategy to distribute vision model training across multiple GPUs on a single workstation with minimal code changes.
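The batch arithmetic in the ResNet example (256 images over 8 GPUs, 32 each) comes down to giving every replica a disjoint slice of the sample indices. The sketch below shows one simple way to do that with a strided split, similar in spirit to the way distributed samplers assign indices by rank. The helper name `shard_indices` is hypothetical, and shuffling and padding for uneven dataset sizes are deliberately left out.

```python
def shard_indices(num_samples, world_size, rank):
    """Indices that replica `rank` processes: a strided slice, so shards never overlap."""
    return list(range(rank, num_samples, world_size))

global_batch, world_size = 256, 8
shards = [shard_indices(global_batch, world_size, r) for r in range(world_size)]
per_gpu = [len(s) for s in shards]
print(per_gpu)  # every replica receives global_batch // world_size indices
```

A strided split keeps the shards balanced whenever the number of samples divides evenly by the number of replicas; otherwise some replicas receive one more sample than others and the gradient average needs care.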

Implementation Patterns

Data parallelism in practice

Across the applications listed above, teams typically get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems become more complex.

Implementation Guide

1. Define latency, quality, and cost targets before rollout.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident paths before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
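For a data-parallel training job, one concrete evidence gate is scaling efficiency: measured multi-GPU throughput divided by the ideal of N times the single-GPU throughput. The sketch below shows the arithmetic. The function names, the 0.8 threshold, and the throughput figures are hypothetical placeholders, not recommendations or measurements.

```python
def scaling_efficiency(single_gpu_throughput, n_gpus, measured_throughput):
    """Fraction of ideal linear scaling actually achieved (1.0 means perfect scaling)."""
    return measured_throughput / (n_gpus * single_gpu_throughput)

def passes_gate(efficiency, min_efficiency=0.8):
    """Evidence gate: True only if efficiency meets the (hypothetical) threshold."""
    return efficiency >= min_efficiency

# Hypothetical numbers: 1000 samples/s on 1 GPU, 6800 samples/s measured on 8 GPUs.
eff = scaling_efficiency(1000.0, 8, 6800.0)
print(eff, passes_gate(eff))
```

If the gate fails, the guidance in the steps above applies: pause the rollout, find where communication or data loading is stalling the GPUs, and measure again before adding more devices.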
