Foundational GUIDE

Chinchilla Compute-Optimal Training

Chinchilla is a 2022 DeepMind result showing that most large language models were significantly undertrained: for a fixed compute budget, you should scale parameters and training data roughly equally rather than simply building a bigger model.

Overview

Chinchilla reset how the industry balances model size against training data.

Chinchilla Compute-Optimal Training sits in the core AI toolkit. Once you understand it, other AI topics become easier to evaluate and compare.

Deep Dive

DeepMind's Chinchilla paper revisited scaling laws and trained more than 400 models to find the right compute balance. The headline rule of thumb: model size and training tokens should grow in lockstep, at roughly 20 training tokens per parameter. To prove it, the authors trained Chinchilla, a 70-billion-parameter model, on 1.4 trillion tokens, using the same compute as the 280-billion-parameter Gopher, which was trained on far fewer tokens. Despite being four times smaller, Chinchilla outperformed Gopher, GPT-3, and other giants on nearly every benchmark. The study overturned the earlier OpenAI conclusion that favored size over data, showing that many frontier models were leaving performance on the table by being too large and too data-starved.
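As a rough check on the numbers above, the sketch below applies the common C ≈ 6·N·D approximation for transformer training FLOPs to both models. The function names are illustrative, and Gopher's token count (about 300 billion) comes from its own paper rather than from this guide, so treat it as an assumption.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute with the common C ~ 6 * N * D rule."""
    return 6.0 * n_params * n_tokens


def tokens_per_param(n_params: float, n_tokens: float) -> float:
    """Training tokens seen per model parameter."""
    return n_tokens / n_params


# Chinchilla figures are quoted in the text; Gopher's token count is an
# outside assumption (roughly 300 billion tokens).
models = {
    "Chinchilla": {"params": 70e9, "tokens": 1.4e12},
    "Gopher": {"params": 280e9, "tokens": 300e9},
}

for name, m in models.items():
    flops = training_flops(m["params"], m["tokens"])
    ratio = tokens_per_param(m["params"], m["tokens"])
    print(f"{name}: {flops:.2e} FLOPs, {ratio:.1f} tokens per parameter")
```

Under these assumptions both runs land in the same range of roughly 5 to 6 × 10^23 FLOPs, yet Chinchilla sees about 20 tokens per parameter while Gopher sees about one.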

Technical Insight

Chinchilla fits loss as L(N,D) = E + A·N^(-α) + B·D^(-β), where N is the parameter count and D is the number of training tokens. The fitted exponents are of similar size (α ≈ 0.34 and β ≈ 0.28 in the paper's fit), meaning parameters and data contribute roughly equally. Minimizing this loss under a fixed compute constraint (C ≈ 6·N·D FLOPs for transformers) yields the roughly equal scaling result. A smaller, data-rich model is also cheaper to run at inference, so its advantage extends to deployment, not just training.
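The constrained minimization has a closed form, which the sketch below implements. Substituting D = C/(6N) into the loss and setting the derivative with respect to N to zero gives N* = G·(C/6)^(β/(α+β)) and D* = (C/6)^(α/(α+β))/G, with G = (αA/(βB))^(1/(α+β)). The default constants (E = 1.69, A = 406.4, B = 410.7, α = 0.34, β = 0.28) are the values recalled from the paper's parametric fit and are not stated in this guide, so treat them, and the function names, as assumptions.

```python
def parametric_loss(n_params, n_tokens, E=1.69, A=406.4, B=410.7,
                    alpha=0.34, beta=0.28):
    """L(N, D) = E + A * N**(-alpha) + B * D**(-beta)."""
    return E + A * n_params ** (-alpha) + B * n_tokens ** (-beta)


def compute_optimal(compute_flops, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Minimise the loss subject to 6 * N * D = compute_flops (closed form)."""
    g = (alpha * A / (beta * B)) ** (1.0 / (alpha + beta))
    base = compute_flops / 6.0
    n_opt = g * base ** (beta / (alpha + beta))
    d_opt = base ** (alpha / (alpha + beta)) / g
    return n_opt, d_opt


budget = 5.88e23  # roughly 70B parameters on 1.4T tokens under C ~ 6 * N * D
n_opt, d_opt = compute_optimal(budget)
print(f"N* ~ {n_opt:.3e} parameters, D* ~ {d_opt:.3e} tokens")
print(f"loss at optimum: {parametric_loss(n_opt, d_opt):.4f}")

# Shifting the same compute toward size or toward data should not lower the loss.
for scale in (0.5, 2.0):
    n_alt = n_opt * scale
    d_alt = budget / (6.0 * n_alt)
    print(f"N x {scale}: loss {parametric_loss(n_alt, d_alt):.4f}")
```

The exact split between N and D is sensitive to the fitted constants, so this is best read as an illustration of the method rather than a planning tool. When α = β and A = B, the formula reduces to N* = D* = sqrt(C/6), which is the equal-scaling case.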

Mastering Chinchilla Compute-Optimal Training

To build a deeper understanding, treat Chinchilla Compute-Optimal Training as a working model, not a single feature: define the outcomes you want, make your assumptions explicit, and separate what the method can reliably do from what still requires expert judgment.

In practice, strong teams applying Chinchilla Compute-Optimal Training build solid mental models first, then map those models onto real production constraints. They write clear success criteria, evaluate against realistic data and workflows, and iterate based on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

It helps you separate clear technical claims from marketing language. At the same time, different teams may use the same term differently, so define scope early. The strongest approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and update safeguards continuously as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

It helps you separate clear technical claims from marketing language.

You can ask better implementation questions before committing money or time.

Teams with a shared understanding make better product, policy, and learning decisions.

In high-quality implementations, each of these translates into measurable operating rules, ownership boundaries, and repeatable review practices, so teams can scale confidence rather than scale confusion.

The Future of Chinchilla Compute-Optimal Training

Modern models such as Llama 3 deliberately go well beyond Chinchilla's ratio of 20 tokens per parameter, training smaller models on trillions of tokens to make inference cheaper and accepting training compute that is no longer Chinchilla-optimal in exchange. As high-quality data becomes scarce, interest is growing in repeated epochs, synthetic data, and quality filtering. Chinchilla remains the reference point, but the optimum increasingly depends on total lifetime cost, not just a one-time training budget.
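One way to see why over-training can pay off is to add inference compute to the training budget. The sketch below assumes roughly 2·N FLOPs per served token at inference (a common forward-pass approximation, not a figure from this guide). The model sizes and token counts are hypothetical, and the sketch simply assumes the two models reach comparable quality without checking it.

```python
def lifetime_flops(n_params, train_tokens, served_tokens):
    """Training (~6 * N * D) plus inference (~2 * N per served token)."""
    training = 6.0 * n_params * train_tokens
    inference = 2.0 * n_params * served_tokens
    return training + inference


def break_even_served_tokens(small, large):
    """Served-token volume above which the smaller model is cheaper overall.

    Each model is a (n_params, train_tokens) tuple. Returns 0.0 when the
    smaller model is already no more expensive to train.
    """
    (n_s, d_s), (n_l, d_l) = small, large
    if n_l <= n_s:
        raise ValueError("'large' must have more parameters than 'small'")
    extra_training = 6.0 * (n_s * d_s - n_l * d_l)
    saving_per_token = 2.0 * (n_l - n_s)
    return max(extra_training, 0.0) / saving_per_token


# Hypothetical pair: a heavily over-trained 8B model versus a 30B model
# trained at about 20 tokens per parameter.
small = (8e9, 15e12)
large = (30e9, 600e9)
threshold = break_even_served_tokens(small, large)
print(f"smaller model wins after about {threshold:.2e} served tokens")
```

Below the break-even volume the larger, compute-optimal model is cheaper in total; above it, the extra training spent on the smaller model is repaid by cheaper inference.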

Real-World Applications

Choosing to train a 7-billion-parameter model on 2 trillion tokens rather than a 30-billion-parameter model on far less data for the same budget.

Estimating that a 10-billion-parameter model needs roughly 200 billion tokens to reach the compute-optimal sweet spot.

Justifying a smaller deployed model to cut per-query inference cost while matching the quality of a larger rival.

Evaluating an existing model, concluding that it is undertrained, and planning longer training instead of a parameter increase.
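The arithmetic behind these examples can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes the C ≈ 6·N·D approximation together with the rule of thumb of 20 tokens per parameter; the undertraining check is a heuristic under those assumptions, not a substitute for evaluating the model.

```python
CHINCHILLA_TOKENS_PER_PARAM = 20.0  # rule of thumb quoted in the text


def tokens_for_budget(compute_flops, n_params):
    """Tokens affordable for a model of n_params under C ~ 6 * N * D."""
    return compute_flops / (6.0 * n_params)


def rule_of_thumb_tokens(n_params, ratio=CHINCHILLA_TOKENS_PER_PARAM):
    """Token target suggested by the tokens-per-parameter rule."""
    return ratio * n_params


def is_undertrained(n_params, n_tokens, ratio=CHINCHILLA_TOKENS_PER_PARAM):
    """True if the model saw fewer tokens than the rule of thumb suggests."""
    return n_tokens < ratio * n_params


budget = 6.0 * 7e9 * 2e12                     # 7B parameters on 2T tokens
tokens_30b = tokens_for_budget(budget, 30e9)  # same budget spent on 30B
print(f"30B model at the same budget: {tokens_30b:.3e} tokens")
print(f"10B rule-of-thumb target: {rule_of_thumb_tokens(10e9):.3e} tokens")
print("30B undertrained?", is_undertrained(30e9, tokens_30b))
print("7B undertrained?", is_undertrained(7e9, 2e12))
```

At the same budget, the 30-billion-parameter model can afford only about 470 billion tokens, which is below its rule-of-thumb target of 600 billion, while the 7-billion-parameter model sits far above its own target.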

Implementation Patterns

Chinchilla Compute-Optimal Training in practice

Whichever of these applications you pursue, teams generally get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Different teams may use the same term differently, so define scope early.

Benchmarks can look strong while real-world performance is uneven.

Ignoring data quality and evaluation design often produces brittle results.

Getting Started Guide

1. Start with a plain-language description of the outcome you need.

2. Pick one success metric and one failure condition before testing.

3. Run a small pilot on representative data, not a polished demo set.

4. Document where Chinchilla Compute-Optimal Training helps and where simpler approaches are better.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
