Technical Guide

Positional Interpolation for Long Context

Positional Interpolation (PI) is a simple, influential technique that extends a Transformer's context window by compressing new position indices into the range the model already knows.

Overview

Rather than extrapolating to unseen positions, PI interpolates between the positions the model was trained on, so it needs only brief fine-tuning.

Positional Interpolation for Long Context is a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Introduced by Meta researchers (Chen et al.) in 2023, Positional Interpolation addresses the fact that RoPE-based models fail catastrophically when they extrapolate to positions beyond those seen in training. The insight is counterintuitive: rather than asking the model to handle position values larger than any it has seen, PI divides the incoming position indices by a scale factor so that a target length of, say, 8K maps back into the original 2K range. Because the model was trained on that range, the rotations stay in-distribution. After only about 1,000 fine-tuning steps, LLaMA models extended this way handled contexts of up to 32K tokens. The paper showed that extrapolation can blow attention scores up to very large values, whereas interpolation keeps them bounded and stable, which is why interpolating works far better than extrapolating.
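The index compression described above can be illustrated with a few lines of Python. This is a minimal sketch, not the paper's code: the function name `interpolate_positions` and its default lengths (2048 trained, 8192 target) are assumptions chosen to match the 2K-to-8K example in the text.

```python
from typing import List

def interpolate_positions(seq_len: int,
                          trained_len: int = 2048,
                          target_len: int = 8192) -> List[float]:
    """Map integer positions 0..seq_len-1 into the trained range.

    Hypothetical helper: every index is divided by the same fixed
    factor, scale = target_len / trained_len.
    """
    scale = target_len / trained_len
    return [m / scale for m in range(seq_len)]

scaled = interpolate_positions(8192)
print(scaled[:5])   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(max(scaled))  # 2047.75, still inside the original 2K range
```

The positions become fractional, which is why a short fine-tuning run is still needed: the model has seen the range, but not these in-between values.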

Technical Insight

PI rescales a position m to m/s, where s is the extension factor (for example, the new length divided by the original length). With RoPE, this effectively shrinks the rotation step between adjacent positions, packing more positions into the angular range covered during training. The theoretical bound in the paper shows that interpolated attention scores stay well controlled, whereas naive extrapolation can produce scores orders of magnitude larger than anything seen in training, which destabilizes the softmax.
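To make the m to m/s rescaling concrete, the sketch below applies RoPE-style rotations to a toy query and key and shows that a score computed at long-range positions with s = 4 matches the score at positions a quarter as large without scaling. The helper names (`angle_step`, `rope_rotate`, `attention_score`), the 4-dimensional vectors, and the base of 10000 are illustrative assumptions, not taken from the source.

```python
import math

def angle_step(pair_index, dim, scale=1.0, base=10000.0):
    """Rotation per position for one 2-D pair; PI divides it by `scale`."""
    return base ** (-2.0 * pair_index / dim) / scale

def rope_rotate(vec, position, scale=1.0, base=10000.0):
    """Rotate each consecutive pair of `vec` by a position-dependent angle."""
    dim = len(vec)
    out = []
    for j in range(dim // 2):
        theta = position * angle_step(j, dim, scale, base)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * j], vec[2 * j + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def attention_score(q, k, m, n, scale=1.0):
    """Unnormalized score between q at position m and k at position n."""
    q_rot = rope_rotate(q, m, scale)
    k_rot = rope_rotate(k, n, scale)
    return sum(a * b for a, b in zip(q_rot, k_rot))

q = [0.3, -1.2, 0.7, 0.5]   # toy vectors, for illustration only
k = [1.0, 0.4, -0.6, 0.9]

# Positions 4096 and 4000 with s = 4 rotate exactly like 1024 and 1000 unscaled.
long_scaled = attention_score(q, k, 4096, 4000, scale=4.0)
short_plain = attention_score(q, k, 1024, 1000)
print(math.isclose(long_scaled, short_plain))  # True
```

The score still depends only on the relative offset between the two positions; PI changes how much rotation each unit of offset contributes.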

Mastering Positional Interpolation for Long Context

To build a deep understanding, treat Positional Interpolation for Long Context as an operating model rather than a single feature: define the desired outcomes, make assumptions explicit, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams using Positional Interpolation for Long Context tune their architecture, data, and infrastructure choices against reliability and cost. They write explicit success criteria, evaluate against realistic data and workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Architecture decisions drive performance and operating cost for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. The most robust approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Architecture decisions drive performance and operating cost for years.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability incidents in production.

In high-quality deployments, each of these translates into measurable operating rules, ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of Positional Interpolation for Long Context

Positional Interpolation became the foundation for a wave of follow-up methods, including NTK-aware scaling and YaRN, which interpolate more selectively in order to preserve local detail. The trajectory points toward methods that need little or no fine-tuning and toward building long-context handling into pretraining. PI remains an important baseline and is often combined with newer frequency-aware schemes to reach context windows of 128K tokens and beyond.
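The difference between uniform and selective interpolation can be seen by comparing per-pair rotation frequencies. The sketch below assumes the commonly cited NTK-aware formulation, in which the RoPE base is multiplied by s^(d/(d-2)); the function names, the head dimension of 8, and the base of 10000 are illustrative assumptions, and YaRN's more elaborate scheme is not reproduced here.

```python
def rope_frequencies(dim, base=10000.0):
    """Rotation per position for each 2-D pair, fastest first."""
    return [base ** (-i / dim) for i in range(0, dim, 2)]

def pi_frequencies(dim, scale, base=10000.0):
    """PI: every frequency is slowed by the same factor."""
    return [f / scale for f in rope_frequencies(dim, base)]

def ntk_aware_frequencies(dim, scale, base=10000.0):
    """NTK-aware scaling (assumed form): enlarge the base instead."""
    new_base = base * scale ** (dim / (dim - 2))
    return rope_frequencies(dim, new_base)

dim, s = 8, 4.0
plain = rope_frequencies(dim)
pi = pi_frequencies(dim, s)
ntk = ntk_aware_frequencies(dim, s)

for f_plain, f_pi, f_ntk in zip(plain, pi, ntk):
    print(f"{f_plain:.6f}  PI ratio {f_pi / f_plain:.3f}  NTK ratio {f_ntk / f_plain:.3f}")
```

Under this formulation the fastest pair, which encodes local order, is left untouched, while the slowest pair is stretched by the full factor s, as in PI. That is one way to read "interpolating more selectively."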

Real-World Applications

Extending a 2K-context LLaMA model to handle 8K-32K tokens with roughly 1,000 fine-tuning steps.

Adapting an existing chat model to summarize long documents without retraining from scratch.

Serving as the conceptual baseline that NTK-aware scaling and YaRN improve upon.

Enabling long-context code or legal-document analysis on models trained with short windows.

Usage Patterns

Positional Interpolation for Long Context in practice

For each of the applications listed above, teams generally get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems grow more complex.

Implementation Guide

1. Define latency, quality, and cost targets before rollout.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident procedures before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
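The evidence-gate idea from the steps above can be expressed as a small check that compares measurements against targets agreed before rollout. This is only a sketch: the class names, field names, and threshold values are hypothetical placeholders, and real targets depend on the workload.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Targets:
    max_p95_latency_ms: float      # hypothetical latency ceiling
    min_quality_score: float       # hypothetical long-context eval floor
    max_cost_per_1k_tokens: float  # hypothetical cost ceiling

@dataclass
class Measurement:
    p95_latency_ms: float
    quality_score: float
    cost_per_1k_tokens: float

def evidence_gate(measured: Measurement, targets: Targets) -> List[str]:
    """Return the unmet criteria; an empty list means the gate passes."""
    gaps = []
    if measured.p95_latency_ms > targets.max_p95_latency_ms:
        gaps.append("latency")
    if measured.quality_score < targets.min_quality_score:
        gaps.append("quality")
    if measured.cost_per_1k_tokens > targets.max_cost_per_1k_tokens:
        gaps.append("cost")
    return gaps

def next_action(gaps: List[str]) -> str:
    """Expand only when every criterion is met; otherwise pause."""
    if not gaps:
        return "expand rollout"
    return "pause rollout and close gaps: " + ", ".join(gaps)

targets = Targets(max_p95_latency_ms=1500.0, min_quality_score=0.80,
                  max_cost_per_1k_tokens=0.02)
measured = Measurement(p95_latency_ms=2100.0, quality_score=0.84,
                       cost_per_1k_tokens=0.015)
print(next_action(evidence_gate(measured, targets)))
# pause rollout and close gaps: latency
```

Returning the list of gaps rather than a bare pass/fail keeps a record of why a rollout was paused, which fits the decision-log practice described earlier.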
