Technical Guide

Model Quantization

Model quantization shrinks a neural network by storing its numbers in fewer bits, so the same model runs faster and on smaller hardware.

Overview

Storing weights in fewer bits is the main reason large models can fit on a single GPU, a laptop, or even a phone.

Model quantization is a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Trained models usually store each weight as a 32-bit or 16-bit floating-point number. Quantization replaces these with smaller formats such as 8-bit integers (INT8) or 4-bit values (INT4), cutting weight memory by roughly 4x to 8x relative to 32-bit (2x to 4x relative to 16-bit). A 70-billion-parameter model that needs about 140 GB in 16-bit drops to around 35 GB at 4-bit, small enough for a single 48 GB or 80 GB GPU instead of several; it still exceeds the 24 GB typical of consumer cards. The catch is accuracy: squeezing a wide range of values into 256 or 16 buckets loses detail. Modern methods such as GPTQ, AWQ, and the NF4 format used in QLoRA pick scaling factors carefully and protect the most sensitive weights, so the quality loss is often small. Quantization is why tools like llama.cpp and Ollama can run capable models locally without a data center.
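The memory figures above follow from simple arithmetic: parameters times bits per weight, divided by eight. The sketch below reproduces them. The function name and the optional per-group scale overhead (one 16-bit scale per group of weights) are illustrative assumptions, and the estimate covers weights only, not activations or the KV cache.

```python
def weight_memory_gb(num_params, bits_per_weight, group_size=None, scale_bits=16):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    If group_size is given, add the cost of one scale_bits-wide scale
    per group of weights (an assumed, simplified overhead model).
    """
    effective_bits = bits_per_weight
    if group_size:
        effective_bits += scale_bits / group_size
    return num_params * effective_bits / 8 / 1e9


# Weight memory for a 70-billion-parameter model at several precisions.
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gb(70e9, bits):.1f} GB")

# 4-bit weights with one 16-bit scale per 64 weights (illustrative grouping).
print(f"4-bit, group 64: {weight_memory_gb(70e9, 4, group_size=64):.1f} GB")
```

The 16-bit and 4-bit rows give the 140 GB and 35 GB quoted in the text. The last line shows that per-group scales add a small but non-zero overhead on top of the raw 4 bits.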

Technical Insight

Quantization maps real values onto a small integer grid using a scale and a zero-point: stored_int = round(value / scale) + zero_point. Dequantization approximately reverses the mapping: value ≈ (stored_int - zero_point) * scale. Choosing the scales well is the whole game. Per-channel or per-group scaling keeps a separate scale for each slice of the weight matrix, preserving accuracy where it matters. Post-training quantization (PTQ) simply converts a finished model, while quantization-aware training (QAT) simulates rounding during training so the network learns to tolerate it, which usually yields the smallest accuracy drop.
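A minimal sketch of the scale and zero-point mapping, and of why per-group scales help. It is written in plain Python for readability; all function names are hypothetical, and the min/max range selection is just one simple way to choose a scale, not the method used by GPTQ or AWQ.

```python
def affine_params(values, num_bits):
    """Choose scale and zero_point so the value range maps onto [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    lo = min(min(values), 0.0)  # include 0.0 so zero stays exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits):
    """stored_int = round(value / scale) + zero_point, clamped to the integer grid."""
    qmax = 2 ** num_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), qmax) for v in values]


def dequantize(stored, scale, zero_point):
    """Recover approximate real values from the stored integers."""
    return [(q - zero_point) * scale for q in stored]


def round_trip(values, num_bits, group_size=None):
    """Quantize then dequantize with one scale per group (per tensor if None)."""
    group_size = group_size or len(values)
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        scale, zp = affine_params(group, num_bits)
        out.extend(dequantize(quantize(group, scale, zp, num_bits), scale, zp))
    return out


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Four small weights followed by four large ones: a single shared scale
# is dominated by the large values and flattens the small ones to zero.
weights = [0.01, -0.02, 0.03, -0.01, 5.0, -4.0, 3.0, -5.0]
per_tensor = round_trip(weights, num_bits=4)
per_group = round_trip(weights, num_bits=4, group_size=4)
print("per-tensor error:", mean_abs_error(weights, per_tensor))
print("per-group error: ", mean_abs_error(weights, per_group))
```

With one scale for the whole list, the small weights all collapse onto the zero-point. Giving each group of four its own scale preserves them, at the cost of storing one extra scale and zero-point per group.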

Model Quantization

To build a deeper understanding, treat quantization as an operating model rather than a single feature: define the desired outcome, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams that use quantization weigh architecture, data, and infrastructure choices together with reliability and cost. They write explicit success criteria, test on real data and workflows, and iterate on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across products, policy, and operations.

Architecture decisions drive performance and operating cost for years. At the same time, optimizing a single metric can hide major system weaknesses. The most effective approach combines fast experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements evolve.

Strategic Impact

Architecture decisions drive performance and operating cost for years.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability problems in production.

In effective organizations, each of these points translates into measurable operating standards, clear ownership boundaries, and regular review cycles, so teams build confidence rather than lingering uncertainty.

The Future of Model Quantization

Expect ever-lower precision to become the norm. Research is pushing toward reliable 4-bit, 2-bit, and even binary weights, with mixed-precision schemes that keep sensitive layers at higher precision. Hardware is following: many GPUs and phone chips now include native INT8, INT4, and FP8 math units. Formats such as FP8 and MXFP4 aim to combine the range of floating point with the size of small integers. Combined with techniques like QLoRA, quantization will keep making frontier models cheaper to run and fine-tune on everyday devices.
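To illustrate how a 4-bit floating-point format trades uniform spacing for range, here is a rough sketch of block-scaled quantization in the spirit of MXFP4. It assumes a 4-bit E2M1 element (1 sign, 2 exponent, 1 mantissa bit) with magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6, and a power-of-two scale shared by each block. The rule used to pick the shared exponent and the function names are assumptions for illustration, not a reference implementation of any specification.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_block(values):
    """Quantize one block to E2M1 values with a shared power-of-two scale.

    Returns (shared_exp, codes), where each code is a signed grid value and
    the reconstructed number is code * 2**shared_exp.
    """
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0.0] * len(values)
    # Assumed convention: align the block maximum with E2M1's largest exponent (2).
    shared_exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** shared_exp
    codes = []
    for v in values:
        mag = min(abs(v) / scale, E2M1_GRID[-1])  # clamp to the largest magnitude
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        codes.append(math.copysign(nearest, v))
    return shared_exp, codes


def dequantize_block(shared_exp, codes):
    """Reconstruct approximate values from the shared exponent and codes."""
    return [c * 2.0 ** shared_exp for c in codes]


block = [0.1, -0.7, 3.0, 6.0]
exp, codes = quantize_block(block)
print("shared exponent:", exp)
print("codes:          ", codes)
print("reconstructed:  ", dequantize_block(exp, codes))
```

The grid is dense near zero and sparse near the maximum, which is the floating-point behavior an integer grid lacks. The shared exponent moves the whole grid to wherever the block's values actually sit.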

Real-World Applications

Running a 7B or 13B Llama model on a laptop with llama.cpp or Ollama using 4-bit GGUF files.

Fine-tuning a large model on a single GPU with QLoRA, which keeps the base weights frozen in 4-bit NF4.

Deploying INT8 models on phones with on-device runtimes so assistants work offline and privately.

Serving cheaper API endpoints where INT8/FP8 quantization can roughly double throughput and cut memory cost.
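The QLoRA item above rests on one structural idea: the large base weight matrix is stored quantized and never updated, while a small low-rank adapter carries all trainable parameters. The sketch below shows only that structure. It swaps in plain symmetric 4-bit integer quantization for NF4, uses constant adapter values instead of random initialization so the output is deterministic, and every class and function name is hypothetical.

```python
def quantize_symmetric(row, num_bits=4):
    """Symmetric per-row quantization: small signed integers plus one scale."""
    qmax = 2 ** (num_bits - 1) - 1
    amax = max(abs(v) for v in row)
    scale = amax / qmax if amax > 0 else 1.0
    return [round(v / scale) for v in row], scale


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class QuantizedLinearWithAdapter:
    """Frozen quantized base weights plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank):
        # Frozen base: kept only as integers and per-row scales.
        self.q_rows = [quantize_symmetric(row) for row in weight]
        out_dim, in_dim = len(weight), len(weight[0])
        # Trainable adapter; B starts at zero so the adapter is initially a no-op.
        self.A = [[0.01] * in_dim for _ in range(rank)]  # rank x in_dim
        self.B = [[0.0] * rank for _ in range(out_dim)]  # out_dim x rank

    def forward(self, x):
        # Dequantize on the fly: scale * (integer row . x).
        base = [scale * sum(q * xi for q, xi in zip(qs, x))
                for qs, scale in self.q_rows]
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]

    def trainable_parameter_count(self):
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


layer = QuantizedLinearWithAdapter([[1.0, -0.5], [0.25, 0.75]], rank=1)
print("output:", layer.forward([1.0, 2.0]))
print("trainable parameters:", layer.trainable_parameter_count())
```

In this toy case the adapter is no smaller than the base matrix. The saving appears at realistic sizes, where the rank is tiny compared with the layer dimensions and the base weights dominate memory.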

Implementation Patterns

Model Quantization in Practice

Across the four applications listed above, teams typically get the best results when they define quality metrics up front, keep a human escalation path for edge cases, and track production gains and error rates over time.

Risks & Safeguards

Optimizing a single metric can hide major system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems grow more complex.

Roadmap

1. Define latency, quality, and cost targets before implementation.

2. Benchmark under real load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident response paths before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
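One way to make the evidence gate concrete is to encode the targets from step 1 and check benchmark results against them before each expansion. The sketch below is illustrative only: the field names, metric keys, and threshold values are all assumptions, not recommended numbers.

```python
from dataclasses import dataclass


@dataclass
class Targets:
    max_latency_ms: float   # latency budget, e.g. at the 95th percentile
    min_quality: float      # e.g. task accuracy of the quantized model
    max_cost_per_1k: float  # cost per 1,000 requests


def unmet_criteria(measured, targets):
    """Return the names of criteria that fail; an empty list means the gate passes."""
    failures = []
    if measured["latency_ms"] > targets.max_latency_ms:
        failures.append("latency")
    if measured["quality"] < targets.min_quality:
        failures.append("quality")
    if measured["cost_per_1k"] > targets.max_cost_per_1k:
        failures.append("cost")
    return failures


# Hypothetical targets and benchmark results for a quantized deployment.
targets = Targets(max_latency_ms=200.0, min_quality=0.90, max_cost_per_1k=0.50)
benchmark = {"latency_ms": 150.0, "quality": 0.88, "cost_per_1k": 0.40}

failures = unmet_criteria(benchmark, targets)
print("expand rollout" if not failures else f"pause rollout, close gaps: {failures}")
```

Here the quantized model meets the latency and cost targets but misses the quality target, so the gate reports a pause instead of an expansion.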

Continue Exploring