Technical Guide

Activation Recomputation Tradeoffs

Overview

Activation recomputation (also called gradient checkpointing or activation checkpointing) saves GPU memory during training by discarding intermediate activations in the forward pass and recomputing them during the backward pass. It trades extra compute for the ability to train larger models or longer sequences on the same hardware.

Activation recomputation is an architectural technique whose tradeoffs affect model performance, infrastructure cost, latency, and reliability at scale.

Deep Dive

Backpropagation needs the forward-pass activations to compute gradients, so by default the outputs of every layer are stored. This is a large memory cost that grows with model size, batch size, and sequence length. Activation recomputation stores only a few "checkpoint" tensors (often just the layer boundaries) and discards the rest. During the backward pass, it reruns the forward computation between checkpoints to regenerate the discarded activations on demand. The classic result is that with a checkpoint placed every sqrt(N) layers of an N-layer network, activation memory drops from O(N) to roughly O(sqrt(N)) while adding about one extra forward pass (about 33% more compute, since the backward pass costs roughly twice as much as the forward pass). Selective variants recompute only the operations that are cheap to compute but heavy on memory (such as attention or dropout) and keep the outputs of the expensive ones, which captures most of the memory savings for a small compute overhead.
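The mechanism described above can be sketched in plain Python on a toy chain of scalar layers. This is an illustration under stated assumptions, not how any framework implements it: the layer y = tanh(w * x) and the function names `backprop_store_all` and `backprop_checkpointed` are hypothetical, and "stored activations" here counts layer inputs only.

```python
import math


def layer_forward(w, x):
    # Toy layer: y = tanh(w * x).
    return math.tanh(w * x)


def layer_backward(w, x, grad_out):
    # Returns (grad wrt input x, grad wrt weight w) for y = tanh(w * x).
    y = math.tanh(w * x)
    dpre = grad_out * (1.0 - y * y)
    return dpre * w, dpre * x


def backprop_store_all(weights, x0):
    # Baseline: keep the input of every layer (N stored activations).
    inputs, x = [], x0
    for w in weights:
        inputs.append(x)
        x = layer_forward(w, x)
    grad, grads_w = 1.0, [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grad, grads_w[i] = layer_backward(weights[i], inputs[i], grad)
    return grads_w, len(inputs)


def backprop_checkpointed(weights, x0, segment):
    n = len(weights)
    # Forward pass: keep only the input of each segment (the checkpoints).
    checkpoints, x = {}, x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = layer_forward(w, x)

    grad, grads_w = 1.0, [0.0] * n
    peak_stored, extra_forward = 0, 0
    # Backward pass: walk the segments from last to first.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, n)
        # Regenerate the discarded layer inputs of this segment.
        inputs = [checkpoints[start]]
        for i in range(start, end - 1):
            inputs.append(layer_forward(weights[i], inputs[-1]))
            extra_forward += 1
        # inputs[0] is the checkpoint itself, so do not count it twice.
        peak_stored = max(peak_stored, len(checkpoints) + len(inputs) - 1)
        for i in reversed(range(start, end)):
            grad, grads_w[i] = layer_backward(weights[i], inputs[i - start], grad)
    return grads_w, peak_stored, extra_forward


if __name__ == "__main__":
    weights = [0.5 + 0.1 * i for i in range(16)]
    segment = math.isqrt(len(weights))  # checkpoint every sqrt(N) layers
    _, stored = backprop_store_all(weights, 0.3)
    _, peak, extra = backprop_checkpointed(weights, 0.3, segment)
    print(stored, peak, extra)
```

With 16 layers and a segment length of 4, the sketch holds at most 7 layer inputs at once instead of 16 and reruns 12 layer forwards, while producing the same weight gradients as the baseline.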

Technical Insight

The core tradeoff is memory versus FLOPs. Full recomputation adds roughly one extra forward pass per step (about 30-40% slower) but can cut activation memory by an order of magnitude. The smarter approach is selective checkpointing: identify the operations that are memory-heavy but cheap to compute (softmax, LayerNorm, GELU, attention scores) and recompute only those, while storing the outputs of the expensive GEMMs. This keeps the compute overhead to a minimum.
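The arithmetic behind these percentages can be written down as a small cost model. It is a rough sketch that assumes the backward pass costs about twice the forward pass (a common rule of thumb for GEMM-dominated models); `step_overhead` and the 5% figure in the example are hypothetical, and measured slowdowns can differ because memory-bound operations do not scale with FLOPs alone.

```python
def step_overhead(recomputed_forward_fraction, backward_to_forward_ratio=2.0):
    """Fractional extra compute per training step caused by recomputation.

    recomputed_forward_fraction: share of forward-pass FLOPs that is rerun
        during the backward pass (1.0 means full recomputation).
    backward_to_forward_ratio: assumed cost of backward relative to forward.
    """
    if not 0.0 <= recomputed_forward_fraction <= 1.0:
        raise ValueError("recomputed_forward_fraction must be in [0, 1]")
    # Baseline step = one forward pass plus one backward pass,
    # measured in units of one forward pass.
    baseline = 1.0 + backward_to_forward_ratio
    return recomputed_forward_fraction / baseline


if __name__ == "__main__":
    # Full recomputation: one extra forward pass on top of forward + backward.
    print(f"full: {step_overhead(1.0):.1%}")
    # Selective recomputation of ops that make up a hypothetical 5% of forward FLOPs.
    print(f"selective: {step_overhead(0.05):.1%}")
```

Under this model, full recomputation costs one third more compute per step, and the overhead of a selective policy shrinks in proportion to the share of forward FLOPs it reruns.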

Activation Recomputation Guide

To build a deep understanding, treat activation recomputation tradeoffs as an operating model rather than a single feature: define the desired outcomes, make the assumptions explicit, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams working with activation recomputation tradeoffs weigh architecture, data, and infrastructure choices against reliability and cost. They write down explicit success criteria, test on real data and workflows, and iterate based on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across products, policies, and operations.

Architectural decisions drive performance and operating cost for years. At the same time, optimizing a single metric can hide major system weaknesses. The most effective approach is to combine experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements evolve.

Strategic Impact

Architectural decisions drive performance and operating cost for years.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability risks in production.

In effective teams, each of these translates into measurable operating principles, clear ownership boundaries, and regular review cycles, so that teams build confidence rather than compounding doubt.

The Future of Activation Recomputation Tradeoffs

Recomputation is becoming more automated and more selective. Current systems profile the memory and FLOP cost of each op to choose good checkpoints, and they combine recomputation with activation offloading to CPU/NVMe and with parallelism strategies. As context lengths and model sizes keep growing, expect compiler-level policies (in PyTorch, JAX/XLA) that make per-op decisions automatically, along with more overlap between recomputation and communication so that the extra FLOPs are hidden.
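The idea of choosing what to recompute from per-op profiles can be illustrated with a simple greedy heuristic. This is only a sketch of the concept, not the policy used by PyTorch, XLA, or any other compiler: `OpProfile`, `choose_recompute`, and all numbers in the example are hypothetical, with memory and FLOPs in arbitrary units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpProfile:
    name: str
    activation_bytes: int  # memory freed if this op's output is not stored
    recompute_flops: int   # extra FLOPs needed to regenerate it in backward


def choose_recompute(ops, flop_budget):
    """Greedy pick: most memory saved per recomputed FLOP, within a FLOP budget."""
    ranked = sorted(
        ops,
        key=lambda op: op.activation_bytes / max(op.recompute_flops, 1),
        reverse=True,
    )
    chosen, spent, saved = [], 0, 0
    for op in ranked:
        if spent + op.recompute_flops <= flop_budget:
            chosen.append(op.name)
            spent += op.recompute_flops
            saved += op.activation_bytes
    return chosen, saved, spent


if __name__ == "__main__":
    # Hypothetical profile of one transformer layer (arbitrary units).
    profile = [
        OpProfile("qkv_gemm", 120, 3000),
        OpProfile("softmax", 400, 10),
        OpProfile("dropout", 200, 2),
        OpProfile("mlp_gemm", 160, 4000),
        OpProfile("gelu", 160, 8),
        OpProfile("layernorm", 80, 8),
    ]
    print(choose_recompute(profile, flop_budget=30))
```

With these made-up numbers the heuristic selects dropout, softmax, GELU, and LayerNorm for recomputation and leaves both GEMM outputs stored, which matches the selective strategy described earlier.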

Real-World Applications

Training a large transformer that would not otherwise fit in memory by checkpointing every layer block

Using PyTorch's torch.utils.checkpoint to wrap transformer blocks and cut activation memory

Selectively recomputing attention/softmax in Megatron-LM to save memory with minimal slowdown

Enabling longer sequence lengths on a fixed GPU budget by recomputing activations instead of storing them

Implementation Patterns

Activation recomputation tradeoffs in practice

Training a large transformer that would not otherwise fit in memory by checkpointing every layer block.

Using PyTorch's torch.utils.checkpoint to wrap transformer blocks and cut activation memory.
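A minimal sketch of this wrapping pattern, assuming PyTorch is installed. `Block` and `Stack` are hypothetical toy modules (an MLP-only stand-in for a transformer block) and are not part of any library. The call `torch.utils.checkpoint.checkpoint` is the real one, used here in its non-reentrant mode.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical stand-in for a transformer block (LayerNorm + MLP, residual)."""

    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        return x + self.mlp(self.norm(x))


class Stack(nn.Module):
    def __init__(self, dim, depth, use_checkpoint=True):
        super().__init__()
        self.blocks = nn.ModuleList([Block(dim) for _ in range(depth)])
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpoint and self.training:
                # Keep only the block input; the activations inside the block
                # are recomputed when backward reaches this block.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


if __name__ == "__main__":
    torch.manual_seed(0)
    model = Stack(dim=8, depth=3)
    out = model(torch.randn(2, 4, 8))
    out.sum().backward()  # triggers the recomputation inside each block
    print(out.shape)
```

Checkpointing is skipped in evaluation mode because no backward pass runs there, so nothing would be gained from discarding activations.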

Selectively recomputing attention/softmax in Megatron-LM to save memory with minimal slowdown.
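To see why recomputing only the attention core pays off, a back-of-the-envelope estimate helps. The sketch below uses the per-layer coefficients commonly quoted from the published Megatron-LM analysis of selective recomputation (16-bit activations, no tensor or sequence parallelism). Treat both the coefficients and the function name `activation_bytes_per_layer` as assumptions; real footprints depend on the implementation and its kernels.

```python
def activation_bytes_per_layer(seq_len, micro_batch, hidden, heads, mode="none"):
    """Approximate activation bytes stored for one transformer layer.

    mode="none":      store everything
    mode="selective": recompute the attention core (scores, softmax, dropout)
    mode="full":      store only the layer input and recompute the rest
    """
    s, b, h, a = seq_len, micro_batch, hidden, heads
    linear_part = 34 * s * b * h        # GEMM inputs, LayerNorm, GELU, dropout masks
    attention_part = 5 * a * s * s * b  # softmax output, dropout mask and output
    if mode == "none":
        return linear_part + attention_part
    if mode == "selective":
        return linear_part
    if mode == "full":
        return 2 * s * b * h
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Hypothetical configuration for illustration only.
    cfg = dict(seq_len=2048, micro_batch=1, hidden=4096, heads=32)
    for mode in ("none", "selective", "full"):
        mib = activation_bytes_per_layer(**cfg, mode=mode) / 2**20
        print(f"{mode:>9}: {mib:8.1f} MiB per layer")
```

The attention term grows with the square of the sequence length while the rest grows linearly, so the share of memory removed by selective recomputation increases as sequences get longer.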

Enabling longer sequence lengths on a fixed GPU budget by recomputing activations instead of storing them.

Across all of these patterns, teams typically get better results when they define quality gates up front, keep a human escalation path for edge cases, and track production wins and error costs over time.

Risks & Safeguards

Optimizing a single metric can hide major system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can grow as systems become more complex.

Roadmap

1. Define latency, quality, and cost targets before implementation.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident-response paths before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
