Language AI Guide

Chinchilla Scaling Laws

The Chinchilla scaling laws, published by DeepMind in 2022, showed that the large language models of the time were significantly undertrained: for a fixed compute budget, model size and training data should be scaled up in equal proportion.

Overview

The Chinchilla result matters because it redefined what an 'optimal' model means and reshaped how labs spend their compute.

Scaling laws sit underneath the language-AI stack: they inform how the models used to read, generate, summarize, and translate text and speech at scale are sized and trained.

Deep Dive

Before Chinchilla, the trend was to build ever-larger models (such as the 175B-parameter GPT-3) while training them on a comparatively modest amount of data. DeepMind trained more than 400 models across a range of sizes and data budgets, then fit curves to predict loss as a function of parameters and tokens under a fixed compute (FLOP) budget. Their finding: parameters and training tokens should scale together in roughly 1-to-1 proportion, so each doubling of model size is matched by a doubling of training data. In practice this works out to about 20 training tokens per parameter. To demonstrate it, they trained Chinchilla, a 70B-parameter model, on 1.4 trillion tokens. It substantially outperformed the much larger 280B-parameter Gopher despite using the same compute, because it was trained on far more data.
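The comparison above can be checked with back-of-the-envelope arithmetic. The sketch below assumes the common rule of thumb that training compute is about 6 FLOPs per parameter per token; that constant, the function names, and Gopher's token count (300 billion, as reported in the Chinchilla paper rather than in this guide) are assumptions added for illustration.

```python
# Rough comparison of Chinchilla and Gopher training runs.
# Assumption: training compute C ~ 6 * N * D (a rule of thumb, not exact).

def approx_train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training FLOPs for N parameters and D tokens."""
    return 6.0 * n_params * n_tokens


def tokens_per_param(n_params: float, n_tokens: float) -> float:
    """Training tokens seen per model parameter."""
    return n_tokens / n_params


# (parameters, training tokens)
MODELS = {
    "Chinchilla": (70e9, 1.4e12),  # figures from the guide
    "Gopher": (280e9, 300e9),      # token count is an assumption (Chinchilla paper)
}


def summarize(models: dict) -> dict:
    """Return tokens-per-parameter and approximate FLOPs for each model."""
    return {
        name: {
            "tokens_per_param": tokens_per_param(n, d),
            "train_flops": approx_train_flops(n, d),
        }
        for name, (n, d) in models.items()
    }


if __name__ == "__main__":
    for name, row in summarize(MODELS).items():
        print(
            f"{name}: {row['tokens_per_param']:.1f} tokens/param, "
            f"~{row['train_flops']:.2e} training FLOPs"
        )
```

Under these assumptions the two runs land within roughly 20% of each other in training compute, while Chinchilla sees about 20 tokens per parameter and Gopher close to 1.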

Technical Insight

The laws come from fitting a parametric loss function L(N, D), where N is the parameter count and D is the number of training tokens. The function is made up of an irreducible-loss term, a model-size term, and a data-size term. Minimizing that loss subject to a compute constraint (training compute is roughly proportional to N times D, commonly approximated as 6ND FLOPs) gives the result that the optimal N and D both grow as power laws in compute with similar exponents, so the compute-optimal ratio stays close to 20 tokens per parameter.
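Written out, the argument looks roughly as follows. This is a sketch: the three-term form with constants E, A, B and exponents α, β follows the parametric fit in the Chinchilla paper, the factor of 6 in the compute constraint is the usual approximation rather than an exact count, and G is a shorthand introduced here.

```latex
\begin{align}
L(N, D) &= E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \\
C &\approx 6\,N\,D, \\
N_{\mathrm{opt}}(C) &= G \left(\frac{C}{6}\right)^{\frac{\beta}{\alpha+\beta}}, \qquad
D_{\mathrm{opt}}(C) = G^{-1} \left(\frac{C}{6}\right)^{\frac{\alpha}{\alpha+\beta}}, \\
G &= \left(\frac{\alpha A}{\beta B}\right)^{\frac{1}{\alpha+\beta}}.
\end{align}
```

The last two lines follow from substituting D = C/(6N) into the loss and setting the derivative with respect to N to zero. When α and β are close, both exponents are close to 1/2, so N and D grow at nearly the same rate and the ratio D/N changes only slowly with compute.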

Learning Chinchilla Scaling Laws

To build a deeper understanding, treat the Chinchilla scaling laws as a usage pattern rather than a single fact: define the desired outcome, make the assumptions explicit, and separate what reliable automation can handle from what still needs expert judgment.

In practice, strong teams that apply Chinchilla scaling laws design prompts, retrieval, and evaluation loops as one integrated system. They write down explicit success criteria, test against real data and workflows, and iterate on observed failure modes rather than on a one-time benchmark win. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Language workflows can move quickly without sacrificing consistency. At the same time, hallucinated facts can slip quietly into reports, support flows, or research outputs. A steady approach pairs fast experimentation with governance: run pilots, collect evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Language workflows can move quickly without sacrificing consistency.

Access broadens across languages and communication styles.

Teams can spend more time on judgment while automation handles repetition.

In production-grade deployments, these benefits translate into measurable operating rules, clear ownership boundaries, and repeatable review habits, so that teams build trust rather than ambiguity.

The Future of Chinchilla Scaling Laws

Chinchilla shifted the field from chasing parameter counts toward feeding models more high-quality data, and modern models often train well past the 'compute-optimal' point to make inference cheaper. As high-quality web text becomes scarce, attention is turning to data curation, synthetic data, multiple epochs, and multimodal data to keep scaling. The central lesson endures: data and parameters need to be balanced, and raw size alone is no longer the goal.
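The trade-off behind training past the compute-optimal point can be sketched with simple accounting. The example assumes about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token at inference; both are rules of thumb, and the 35B model below is a hypothetical configuration, not one described in this guide. The sketch compares cost only and says nothing about whether the two models reach the same quality.

```python
# Lifetime compute = training compute + inference compute.
# Assumptions: training ~ 6*N*D FLOPs, inference ~ 2*N FLOPs per token served.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training FLOPs."""
    return 6.0 * n_params * n_tokens


def inference_flops(n_params: float, tokens_served: float) -> float:
    """Approximate FLOPs to serve a given number of tokens."""
    return 2.0 * n_params * tokens_served


def lifetime_flops(n_params: float, n_tokens: float, tokens_served: float) -> float:
    """Training plus inference FLOPs over the model's deployment."""
    return train_flops(n_params, n_tokens) + inference_flops(n_params, tokens_served)


def breakeven_served_tokens(small: tuple, large: tuple) -> float:
    """Tokens served at which a smaller, longer-trained model repays its
    extra training cost through cheaper inference.

    Each argument is a (n_params, n_train_tokens) pair.
    """
    n_s, d_s = small
    n_l, d_l = large
    saving_per_token = 2.0 * (n_l - n_s)
    if saving_per_token <= 0:
        raise ValueError("'small' must have fewer parameters than 'large'")
    extra_training = train_flops(n_s, d_s) - train_flops(n_l, d_l)
    return max(0.0, extra_training / saving_per_token)


if __name__ == "__main__":
    large = (70e9, 1.4e12)  # Chinchilla-like, about 20 tokens per parameter
    small = (35e9, 4.2e12)  # hypothetical model trained far past that point
    print(f"break-even: {breakeven_served_tokens(small, large):.2e} tokens served")
```

The design point is that inference cost scales with parameter count for every token served, so a heavily used model can justify extra training compute spent on a smaller network.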

Real-World Implementation

DeepMind's 70B-parameter Chinchilla beating the 280B-parameter Gopher on benchmarks with the same compute, by training on more data.

Guiding teams to budget roughly 20 training tokens per parameter when planning a from-scratch model.

Justifying smaller, data-rich models such as LLaMA that are cheaper to run at inference time.

Estimating whether a planned model is 'undertrained' and would benefit more from additional data than from additional parameters.
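Two of the uses listed above, budgeting tokens for a from-scratch model and checking whether a plan is undertrained, reduce to a few lines of arithmetic. The sketch below assumes training compute of about 6ND FLOPs and the 20-tokens-per-parameter rule of thumb. The function names and the 25% tolerance band are hypothetical choices made for illustration, not thresholds from the Chinchilla work.

```python
import math

TOKENS_PER_PARAM = 20.0       # rule of thumb quoted in this guide
FLOPS_PER_PARAM_TOKEN = 6.0   # assumption: C ~ 6 * N * D


def compute_optimal_split(flop_budget: float,
                          tokens_per_param: float = TOKENS_PER_PARAM) -> tuple:
    """Split a FLOP budget into (n_params, n_tokens) with D = r * N.

    Solving C = 6 * N * (r * N) for N gives N = sqrt(C / (6 * r)).
    """
    if flop_budget <= 0:
        raise ValueError("flop_budget must be positive")
    n_params = math.sqrt(flop_budget / (FLOPS_PER_PARAM_TOKEN * tokens_per_param))
    return n_params, tokens_per_param * n_params


def training_regime(n_params: float, n_tokens: float,
                    tokens_per_param: float = TOKENS_PER_PARAM,
                    tolerance: float = 0.25) -> str:
    """Classify a planned run relative to the tokens-per-parameter target.

    The tolerance band is an arbitrary illustrative choice.
    """
    ratio = n_tokens / n_params
    if ratio < tokens_per_param * (1.0 - tolerance):
        return "undertrained"
    if ratio > tokens_per_param * (1.0 + tolerance):
        return "past compute-optimal"
    return "near compute-optimal"


if __name__ == "__main__":
    n, d = compute_optimal_split(5.88e23)
    print(f"params ~{n:.2e}, tokens ~{d:.2e}")
    print(training_regime(280e9, 300e9))
```

A plan classified as "undertrained" is one where, under these assumptions, extra data is likely a better use of compute than extra parameters; "past compute-optimal" is often a deliberate choice to reduce inference cost.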

Implementation Patterns

Chinchilla Scaling Laws in practice

Across the use cases listed under Real-World Implementation, teams usually get better results when they define quality criteria up front, keep a human escalation path for edge cases, and track both product gains and the cost of errors over time.

Risks & Guardrails

Hallucinated facts can slip quietly into reports, support flows, or research.

Prompt sensitivity can produce inconsistent results for similar requests.

Sensitive text data can be exposed if guardrails are weak.

Implementation Roadmap

1. Define the output format, tone, and quality criteria before launch.

2. Ground answers in trusted sources wherever it matters.

3. Keep human review in the loop for high-stakes outputs.

4. Track failure modes and revise prompts or workflows on a regular basis.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
