Overview
Gradient accumulation lets you simulate a large batch size on limited GPU memory by summing gradients over several small micro-batches before updating the weights. It is a standard technique for training large models when memory is the bottleneck.
Gradient accumulation is a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.
Deep Dive
Normally, a training step processes one batch, computes the gradients, and immediately updates the parameters. With gradient accumulation, you run several forward and backward passes on smaller micro-batches, add the gradients together in a buffer, and call the optimizer step (and zero the gradients) only after N micro-batches. The effective batch size becomes N times the micro-batch size, even though peak memory only ever holds one micro-batch of activations. This matters because many training recipes assume large batches for stable gradient estimates, and because models such as large transformers cannot fit the full target batch on a single device. The catch: batch normalization statistics are computed per micro-batch, so layer norm or group norm behave better with accumulation, and you must scale the loss correctly to keep the effective learning rate unchanged.
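The loop described above can be sketched without any framework. This is a minimal illustration, not a reference implementation: the toy model (a single weight `w` fitted with mean squared error), the helper names `grad_mse` and `train_with_accumulation`, and the data are all assumptions made for the example. The comments mark where `optimizer.step()` and `zero_grad()` would sit in a real training loop.

```python
# Toy model: y_hat = w * x with a mean-squared-error loss (hypothetical setup).

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) over `batch` with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def train_with_accumulation(w, data, micro_batch_size, accum_steps, lr):
    """Plain SGD that updates w once every `accum_steps` micro-batches."""
    grad_buffer = 0.0  # plays the role of the accumulated .grad buffer
    updates = 0
    micro_batches = [data[i:i + micro_batch_size]
                     for i in range(0, len(data), micro_batch_size)]
    for step, mb in enumerate(micro_batches, start=1):
        # Dividing by accum_steps makes the buffer hold the mean gradient
        # over the effective batch instead of the sum.
        grad_buffer += grad_mse(w, mb) / accum_steps
        if step % accum_steps == 0:
            w -= lr * grad_buffer  # corresponds to optimizer.step()
            grad_buffer = 0.0      # corresponds to optimizer.zero_grad()
            updates += 1
    # Note: leftover micro-batches that do not fill a full accumulation
    # window are simply dropped in this sketch.
    return w, updates

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2),
        (5.0, 9.8), (6.0, 12.1), (7.0, 14.0), (8.0, 15.9)]

# 4 micro-batches of 2 examples, accumulated into one effective batch of 8.
w_accum, n_updates = train_with_accumulation(0.0, data, 2, 4, lr=0.01)

# Reference: one ordinary SGD step on the full batch of 8.
w_full = 0.0 - 0.01 * grad_mse(0.0, data)
```

In this toy setting `w_accum` and `w_full` agree up to floating-point rounding, because the model has no batch-dependent layers and every micro-batch has the same size.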
Technical Insight
Because the gradient of a summed loss is the sum of the individual gradients, accumulating gradients over N micro-batches is equivalent to processing one large batch, provided you scale correctly (and assuming equal-sized micro-batches and no batch-dependent layers). Implementations typically divide each micro-batch loss by N before the backward pass, so the accumulated gradient equals the mean over the full effective batch. You skip optimizer.step() and zero_grad() until the Nth micro-batch, trading extra compute time for lower peak memory.
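The equivalence can be written out explicitly. As a sketch, assume the effective batch B of size N·m is split into N micro-batches B_1, ..., B_N of equal size m, the loss is the mean of per-example losses ℓ_i, and the parameters θ stay fixed across the N passes. These symbols are introduced here for illustration only.

```latex
\nabla_\theta L_B(\theta)
  = \frac{1}{Nm} \sum_{k=1}^{N} \sum_{i \in B_k} \nabla_\theta \ell_i(\theta)
  = \sum_{k=1}^{N} \nabla_\theta \left( \frac{1}{N} L_{B_k}(\theta) \right),
\qquad
L_{B_k}(\theta) = \frac{1}{m} \sum_{i \in B_k} \ell_i(\theta).
```

The right-hand side is what the gradient buffer holds after N backward passes on losses scaled by 1/N. If the micro-batches differ in size, for example when the loss is averaged over a variable number of tokens, the uniform 1/N factor no longer reproduces the full-batch mean and each micro-batch has to be weighted by its share of the examples.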
Gradient Accumulation Guide
To build a deeper understanding, treat gradient accumulation as an operating design rather than a single feature: define the desired outcome, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.
In practice, strong teams that use gradient accumulation align their architecture, data, and infrastructure choices with reliability and cost. They write down explicit success criteria, test on real data and workflows, and iterate based on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across models, policies, and operations.
Architecture decisions drive performance and operating cost for years. At the same time, optimizing a single metric can hide major system weaknesses. The most practical approach is to combine experimental speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements evolve.
Strategic Impact
Architecture decisions drive performance and operating cost for years.
Technical knowledge helps teams choose the right stack, not just the newest one.
Better engineering choices reduce reliability risk in production.
In effective organizations, each of these translates into measurable operating principles, clear ownership boundaries, and a regular review cadence, so teams build confidence instead of accumulating doubt.
Real-World Applications
Fine-tuning a large language model on a single consumer GPU by accumulating over 8 or 16 micro-batches to reach an effective batch size in the hundreds.
Training high-resolution vision or segmentation models where a batch of 2 is all that fits, but the recipe calls for an effective batch of 32.
Hugging Face Trainer exposes a gradient_accumulation_steps setting (PyTorch Lightning calls the equivalent option accumulate_grad_batches), and both are used routinely in VRAM-constrained setups.
Reproducing a paper's large-batch result on smaller hardware by matching the effective batch size through accumulation.
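Choosing the accumulation count for cases like these is simple arithmetic. The helper below is a hypothetical sketch: the function name `accumulation_steps` and the `num_devices` parameter are assumptions added for illustration (data-parallel training multiplies the effective batch by the number of devices), not part of any library.

```python
import math

def accumulation_steps(target_batch, micro_batch, num_devices=1):
    """Smallest number of micro-batches per update that reaches target_batch.

    Assumes every device processes `micro_batch` examples per pass, so one
    forward/backward pass covers micro_batch * num_devices examples.
    """
    per_pass = micro_batch * num_devices
    return math.ceil(target_batch / per_pass)

# The vision example above: only a batch of 2 fits, the recipe wants 32.
steps_single_gpu = accumulation_steps(target_batch=32, micro_batch=2)
effective_batch = steps_single_gpu * 2

# Same target spread over 4 data-parallel devices (hypothetical setup).
steps_four_gpus = accumulation_steps(target_batch=32, micro_batch=2, num_devices=4)
```

The resulting count is the value you would pass to the accumulation setting of your training framework, for example `gradient_accumulation_steps` in Hugging Face `TrainingArguments`. When the target is not an exact multiple of the per-pass size, rounding up gives an effective batch slightly larger than the target.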
Implementation Patterns
Gradient Accumulation in Practice
Across the applications listed above, teams typically get the best results when they define quality metrics up front, keep a human escalation path for edge cases, and track throughput gains and error rates over time.
Risks & Considerations
Optimizing a single metric can hide major system weaknesses.
Infrastructure and maintenance costs are often underestimated.
Security and observability gaps can grow as systems become more complex.
Roadmap
Define latency, quality, and cost targets before implementation.
Benchmark under realistic load and data conditions.
Instrument monitoring for errors, drift, and user impact.
Plan rollback and fallback paths before scaling.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.