Overview
Sequence parallelism splits a single long input sequence across multiple GPUs along the token (time) dimension, and Ring Attention lets those GPUs compute exact attention by passing key/value blocks around a ring. Together they make million-token context windows feasible without any single GPU holding the whole sequence.
Sequence Parallelism and Ring Attention form a core technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.
Deep Dive
Standard attention requires every query to see every key/value, so activation memory grows with sequence length and the full K/V must be resident on the device. Sequence parallelism shards the sequence so that each GPU owns a contiguous chunk of tokens (and their queries, keys, and values). Ring Attention then arranges the GPUs in a logical ring: each device keeps its local queries while K/V blocks are passed hop by hop around the ring. As each block arrives, the GPU computes attention against it and accumulates the result using online softmax (the same running max/sum trick used by FlashAttention). After one full loop, every query has attended to every key exactly, without any GPU ever storing all of K/V. Crucially, K/V communication overlaps with computation, so it adds little wall-clock cost as long as each block's compute takes at least as long as its transfer.
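The loop described above can be sketched in plain Python. This is a minimal single-process simulation, not a distributed implementation: the function names `full_attention` and `ring_attention` are hypothetical, attention is assumed to be non-causal and single-head, and the "devices" run one after another, so the compute/communication overlap is not modeled.

```python
import math

def full_attention(Q, K, V):
    """Reference: softmax(Q K^T / sqrt(d)) V with all of K/V in one place."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) / z
                    for j in range(len(V[0]))])
    return out

def ring_attention(Q, K, V, n_dev):
    """Simulate n_dev devices, each owning one contiguous chunk of tokens."""
    n, d, dv = len(Q), len(Q[0]), len(V[0])
    assert n % n_dev == 0, "sequence length must divide evenly across devices"
    c = n // n_dev

    def shard(X):
        return [X[i * c:(i + 1) * c] for i in range(n_dev)]

    Qs = shard(Q)
    kv = list(zip(shard(K), shard(V)))  # kv[i] = K/V block currently on device i

    # Per-query running state: max m, normalizer l, unnormalized output acc.
    m = [[-math.inf] * c for _ in range(n_dev)]
    l = [[0.0] * c for _ in range(n_dev)]
    acc = [[[0.0] * dv for _ in range(c)] for _ in range(n_dev)]

    for _ in range(n_dev):  # n_dev steps: every block visits every device once
        for dev in range(n_dev):
            Kb, Vb = kv[dev]
            for i, q in enumerate(Qs[dev]):
                scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in Kb]
                m_new = max(m[dev][i], max(scores))
                scale = math.exp(m[dev][i] - m_new)  # rescales earlier partial sums
                w = [math.exp(s - m_new) for s in scores]
                l[dev][i] = l[dev][i] * scale + sum(w)
                acc[dev][i] = [a * scale + sum(wi * v[j] for wi, v in zip(w, Vb))
                               for j, a in enumerate(acc[dev][i])]
                m[dev][i] = m_new
        # One hop: device i hands its current block to device i + 1.
        kv = [kv[(dev - 1) % n_dev] for dev in range(n_dev)]

    # Normalize once at the end and return rows in the original token order.
    return [[a / l[dev][i] for a in acc[dev][i]]
            for dev in range(n_dev) for i in range(c)]
```

Calling `ring_attention(Q, K, V, 4)` on small random lists of lists should agree with `full_attention(Q, K, V)` up to floating-point rounding, which is the sense in which the ring result is exact. In this sketch each simulated device only ever touches one K/V block of `n / n_dev` tokens at a time.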
Technical Insight
Ring Attention relies on online softmax: attention can be computed block by block while maintaining a running maximum and a running normalizer, rescaling the earlier partial sums whenever a larger score appears. This makes the result mathematically identical to full attention. The ring passes only K/V tensors (whose size is proportional to the block, not to the full sequence), and because each hop's communication overlaps with the previous block's matmul, bandwidth rather than memory becomes the limiting factor.
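The rescaling step can be written as a short recurrence. The notation here is an assumption introduced for illustration: for one query q, s_i^{(j)} is its scaled score against key i of the j-th K/V block to arrive, v_i^{(j)} is the matching value vector, and N is the number of blocks (one per GPU in the ring).

```latex
\begin{aligned}
m^{(0)} &= -\infty, \qquad \ell^{(0)} = 0, \qquad o^{(0)} = 0 \\
m^{(j)} &= \max\!\Big(m^{(j-1)},\ \max_i s_i^{(j)}\Big) \\
\ell^{(j)} &= e^{\,m^{(j-1)} - m^{(j)}}\,\ell^{(j-1)} + \sum_i e^{\,s_i^{(j)} - m^{(j)}} \\
o^{(j)} &= e^{\,m^{(j-1)} - m^{(j)}}\,o^{(j-1)} + \sum_i e^{\,s_i^{(j)} - m^{(j)}}\,v_i^{(j)} \\
\text{output} &= o^{(N)} / \ell^{(N)}
\end{aligned}
```

Unrolling the recurrence gives \ell^{(N)} = \sum_{j,i} e^{s_i^{(j)} - m^{(N)}} and o^{(N)} = \sum_{j,i} e^{s_i^{(j)} - m^{(N)}} v_i^{(j)}, so the ratio equals the ordinary softmax-weighted sum over all keys, regardless of the order in which blocks arrive.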
Sequence Parallelism and Ring Attention Guide
To build a deeper understanding, treat Sequence Parallelism and Ring Attention as an operating model rather than a single feature: define the desired outcome, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.
In practice, strong teams that use Sequence Parallelism and Ring Attention develop their architecture, data, and infrastructure choices together, with reliability and cost in view. They write down explicit success criteria, test on real data and workflows, and iterate on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across products, policies, and operations.
Architecture decisions drive performance and operating cost for years. At the same time, optimizing a single metric can hide larger system weaknesses. The most practical approach is to combine experimentation speed with operational discipline: run pilots, capture evidence, publish decision logs, and keep safeguards up to date as model behavior, user expectations, and regulatory requirements evolve.
Strategic Impact
Architecture decisions drive performance and operating cost for years.
Technical literacy helps teams choose the right stack, not just the newest one.
Better engineering choices reduce reliability problems in production.
In high-performing organizations, these points translate into measurable operating standards, clear ownership boundaries, and regular review cycles, so that teams build confidence instead of accumulating doubt.
Real-World Applications
Training a 1M-token-context LLM by sharding each sequence across 8 GPUs with Ring Attention.
Using Megatron-LM's sequence parallelism to reduce activation memory in the LayerNorm and dropout regions.
Processing an entire book or a large code repository in a single forward pass without truncation.
Combining Ring Attention with tensor parallelism to fit very-long-context inference across multi-GPU nodes.
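To make the memory argument behind these use cases concrete, here is a rough back-of-the-envelope sketch of K/V size for long-context inference. The model shape (32 layers, 8 KV heads, head dimension 128, 2-byte elements) and the helper name `kv_bytes` are hypothetical values chosen only for illustration; only the 1M-token length and the 8-GPU split come from the text.

```python
def kv_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem=2):
    """Bytes needed to hold keys plus values for `tokens` tokens across all layers."""
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_elem

# Hypothetical model shape (assumption, not from the source).
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128
# Sequence length and GPU count from the first application listed above.
SEQ_LEN, N_GPUS = 1_000_000, 8

total = kv_bytes(SEQ_LEN, LAYERS, KV_HEADS, HEAD_DIM)
per_gpu = kv_bytes(SEQ_LEN // N_GPUS, LAYERS, KV_HEADS, HEAD_DIM)

print(f"full K/V on one device: {total / 2**30:.1f} GiB")
print(f"K/V shard per GPU:      {per_gpu / 2**30:.1f} GiB")
```

Under these assumptions the resident K/V shrinks by a factor of `N_GPUS`. The figure counts only the K/V shard a GPU owns; weights, other activations, and the buffer for the block currently in flight around the ring come on top.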
Implementation Patterns
Sequence Parallelism and Ring Attention in practice
Whether the goal is training a 1M-token-context LLM across 8 GPUs with Ring Attention, reducing activation memory with Megatron-LM's sequence parallelism, processing an entire book or code repository in a single forward pass, or combining Ring Attention with tensor parallelism for long-context inference, teams typically get better results when they define quality metrics up front, keep a human escalation path for edge cases, and track both production wins and error rates over time.
Risks & Considerations
Optimizing a single metric can hide larger system weaknesses.
Infrastructure and maintenance costs are often underestimated.
Security and observability gaps can widen as systems grow more complex.
Roadmap
Define latency, quality, and cost targets before implementation.
Benchmark under realistic load and data conditions.
Instrument monitoring for errors, drift, and user impact.
Plan rollback and incident-response paths before scaling.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.