TECHNICAL GUIDE

FP8 and Low-Precision Formats

FP8 is an 8-bit floating-point number format that lets AI models store weights and run computations using a quarter of the memory of standard 32-bit numbers.

Overview

FP8 is a key technique for making large models cheaper and faster to train and serve.
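The memory saving is simple arithmetic on bits per stored value. The sketch below works it through for a hypothetical 70-billion-parameter model; the model size is an assumption chosen for illustration, and the calculation covers weights only, ignoring scale factors, activations, and optimizer state.

```python
# Bytes needed to store model weights at different precisions.
BITS_PER_VALUE = {"FP32": 32, "BF16": 16, "FP8": 8}

def weight_memory_gb(num_params, bits):
    """Weight storage in gigabytes (10**9 bytes), weights only."""
    return num_params * bits / 8 / 1e9

num_params = 70e9  # hypothetical model size, for illustration only

for name, bits in BITS_PER_VALUE.items():
    print(f"{name}: {weight_memory_gb(num_params, bits):.0f} GB")
# FP32: 280 GB
# BF16: 140 GB
# FP8: 70 GB
```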

FP8 and low-precision formats are a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.

Deep Dive

Neural networks are made of billions of numbers. Traditionally each of those numbers used 32 bits (FP32) or 16 bits (FP16/BF16). FP8 shrinks them to just 8 bits, cutting memory and bandwidth roughly in half compared with 16-bit. There are two common FP8 layouts: E4M3 (4 exponent bits, 3 mantissa bits) offers more precision but a narrower range, and E5M2 (5 exponent bits, 2 mantissa bits) offers a wider range but coarser steps. The trade-off is fidelity: fewer bits mean larger rounding errors. To stay accurate, frameworks use per-tensor or per-block scaling that maps values into the usable FP8 range. NVIDIA's Hopper and Blackwell GPUs added hardware FP8 matrix engines, making the format practical for both training and inference. Newer formats such as MXFP8, MXFP4, and NVFP4 push even lower by sharing a scale factor across small blocks of values.
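The range-versus-precision trade-off between the two layouts can be derived from the bit counts alone. The sketch below is a minimal calculation, assuming the commonly used conventions in which E5M2 reserves its top exponent code for infinities and NaN (IEEE-style) while E4M3 reserves only the all-ones pattern for NaN; the function name and its `ieee_like` flag are illustrative, not part of any library.

```python
def fp8_properties(exp_bits, man_bits, ieee_like):
    """Return (max_finite, min_normal, min_subnormal, epsilon) for an 8-bit float layout.

    ieee_like=True  : top exponent code is reserved for inf/NaN (assumed for E5M2).
    ieee_like=False : only the all-ones pattern is NaN, so the top exponent
                      code still holds finite values (assumed for E4M3).
    """
    assert 1 + exp_bits + man_bits == 8  # sign + exponent + mantissa
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        max_exp = (2 ** exp_bits - 2) - bias
        max_mantissa = 2 - 2.0 ** -man_bits
    else:
        max_exp = (2 ** exp_bits - 1) - bias
        max_mantissa = 2 - 2 * 2.0 ** -man_bits  # largest mantissa code is NaN
    max_finite = max_mantissa * 2.0 ** max_exp
    min_normal = 2.0 ** (1 - bias)
    min_subnormal = 2.0 ** (1 - bias - man_bits)
    epsilon = 2.0 ** -man_bits  # spacing between 1.0 and the next value
    return max_finite, min_normal, min_subnormal, epsilon

for name, args in {"E4M3": (4, 3, False), "E5M2": (5, 2, True)}.items():
    max_finite, min_normal, min_sub, eps = fp8_properties(*args)
    print(f"{name}: max={max_finite:g} min_normal={min_normal:g} "
          f"min_subnormal={min_sub:g} step_near_1={eps:g}")
```

Under these assumptions E4M3 tops out at 448 with a step of 0.125 near 1.0, while E5M2 reaches 57344 with a step of 0.25, which is the "more precision, less range" versus "more range, coarser steps" contrast described above.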

Technical Insight

The challenge with FP8 is dynamic range. With only a few exponent bits, very large activations overflow and very small ones underflow to zero. The fix is scaling: multiply the tensor by a factor so that its values stay within FP8's representable window, perform the matrix multiplication in FP8, then divide the factor back out, usually accumulating the partial sums in higher precision (FP16/FP32). E4M3 is typically used for weights and activations, and E5M2 for gradients, where range matters more than precision.
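The scale, multiply, and unscale recipe can be simulated in NumPy without FP8 hardware. This is a minimal sketch, not a framework implementation: `round_to_e4m3`, `quantize`, and `fp8_matmul` are hypothetical helpers, the E4M3 limits (maximum 448, smallest subnormal 2**-9) are assumed from the common convention, and the input magnitudes are made up to provoke underflow.

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite E4M3 value under the assumed convention

def round_to_e4m3(x):
    """Simulate rounding onto the E4M3 grid (saturating, no NaN handling)."""
    x = np.asarray(x, dtype=np.float64)
    mag = np.minimum(np.abs(x), E4M3_MAX)      # saturate instead of overflowing
    _, e = np.frexp(mag)                       # mag = m * 2**e with m in [0.5, 1)
    exp = np.maximum(e - 1, -6)                # below 2**-6 the grid is subnormal
    step = 2.0 ** (exp - 3)                    # 3 mantissa bits: 8 steps per binade
    return np.sign(x) * np.round(mag / step) * step

def quantize(x):
    """Per-tensor scaling: map the largest magnitude onto E4M3_MAX, then round."""
    amax = float(np.max(np.abs(x)))
    scale = E4M3_MAX / amax if amax > 0 else 1.0
    return round_to_e4m3(x * scale), scale

def fp8_matmul(a, b):
    """Quantize both operands, multiply, and divide the scales back out."""
    qa, sa = quantize(a)
    qb, sb = quantize(b)
    acc = qa.astype(np.float32) @ qb.astype(np.float32)  # FP32 accumulation
    return acc / np.float32(sa * sb)

rng = np.random.default_rng(0)
a = rng.normal(scale=1e-4, size=(64, 128))   # tiny activations, illustrative
b = rng.normal(scale=1.0, size=(128, 32))

exact = a @ b
naive = round_to_e4m3(a) @ round_to_e4m3(b)  # no scaling applied
scaled = fp8_matmul(a, b)

rel_err = lambda y: np.linalg.norm(y - exact) / np.linalg.norm(exact)
print(f"unscaled FP8 relative error: {rel_err(naive):.3f}")
print(f"scaled FP8 relative error:   {rel_err(scaled):.3f}")
```

With these magnitudes the unscaled path should round the entries of `a` to zero, so the product is lost entirely, whereas the scaled path keeps the error at the level set by the 3-bit mantissa.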

Getting FP8 and Low-Precision Formats Right

To build a deep understanding, treat FP8 and low-precision formats as an operating model rather than a single feature: define the desired outcomes, make your assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams that use FP8 and low-precision formats tune their architecture, data, and infrastructure choices against reliability and cost. They write clear success criteria, evaluate against real data and workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Architecture decisions drive performance and operating costs for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. The most robust approach combines speed of experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Architecture decisions drive performance and operating costs for years.

In high-quality deployments, this translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

Technical literacy helps teams choose the right stack, not just the newest one.

Better engineering choices reduce reliability incidents in production.

The Future of FP8 and Low-Precision Formats

Precision keeps racing downward. After FP8 came 4-bit micro-scaling formats (MXFP4, NVFP4) that pack a small shared scale alongside each small block of values, and Blackwell hardware now accelerates FP4 directly. Expect mixed-precision recipes in which different layers use different bit widths, and better quantization-aware training so that 4-bit becomes the default for inference. The endgame is squeezing frontier-scale models onto fewer, cheaper chips with no measurable loss in quality.
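The idea of a small shared scale per block can be sketched in a few lines of NumPy. The example below is a simplified illustration, not a spec-exact implementation: it assumes a 4-bit E2M1 element grid, a block size of 32, and a power-of-two scale chosen so that no value clips, in the spirit of MX-style formats. NVFP4 uses different block and scale choices that this sketch does not model, and the function names are hypothetical.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 element (1 sign, 2 exponent, 1 mantissa bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_blocks_fp4(x, block_size=32):
    """Quantize a 1-D array to FP4 values with one shared power-of-two scale per block."""
    blocks = np.asarray(x, dtype=np.float64).reshape(-1, block_size)
    amax = np.max(np.abs(blocks), axis=1, keepdims=True)
    # Smallest power of two that brings the block maximum down to at most 6.0.
    # (Simplification: real recipes may pick the exponent differently and clip.)
    safe_amax = np.where(amax > 0, amax, 1.0)
    scale = 2.0 ** np.ceil(np.log2(safe_amax / FP4_GRID[-1]))
    scaled = np.abs(blocks) / scale
    # Nearest grid point; ties resolve to the smaller magnitude.
    idx = np.argmin(np.abs(scaled[..., None] - FP4_GRID), axis=-1)
    return np.sign(blocks) * FP4_GRID[idx], scale

def dequantize_blocks(q, scale):
    """Undo the per-block scaling and flatten back to 1-D."""
    return (q * scale).reshape(-1)

# Two blocks with very different magnitudes (illustrative data).
x = np.concatenate([np.linspace(-6.0, 6.0, 32), np.linspace(-0.006, 0.006, 32)])
q, scale = quantize_blocks_fp4(x)
x_hat = dequantize_blocks(q, scale)

bits_per_value = 4 + 8 / 32  # 4-bit element plus an 8-bit scale shared by 32 values
print("per-block scales:", scale.ravel())
print("max abs error per block:", np.abs(x - x_hat).reshape(2, -1).max(axis=1))
print("effective bits per value:", bits_per_value)
```

Because each block carries its own scale, the block of tiny values keeps roughly the same relative accuracy as the block of large ones, at an overhead of a quarter of a bit per value in this configuration.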

Real-World Applications

Training large language models on NVIDIA Hopper/Blackwell GPUs in FP8 to nearly double throughput compared with BF16.

Serving chatbot inference in FP8 so that the model fits on fewer GPUs and answers more requests per second.

Using E5M2 for gradient communication during distributed training to cut network bandwidth between nodes.

Deploying MXFP4/NVFP4-quantized models to fit a frontier-scale model on a single high-memory GPU for cheaper inference.

Implementation Patterns

FP8 and low-precision formats in action

Across the applications above, teams generally get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Optimizing for a single benchmark can hide broader system weaknesses.

Infrastructure and maintenance costs are often underestimated.

Security and observability gaps can widen as systems become more complex.

Implementation Guide

1. Define latency, quality, and cost targets before rollout.

2. Benchmark under realistic load and data conditions.

3. Instrument monitoring for errors, drift, and user impact.

4. Prepare rollback and incident procedures before scaling.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand deployment.
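The evidence-gate idea from the steps above can be made concrete as a small check that compares measurements against targets agreed in advance. This is a sketch only: the `Targets` fields, metric names, thresholds, and measurements are all hypothetical placeholders, not recommended values.

```python
from dataclasses import dataclass

@dataclass
class Targets:
    """Hypothetical rollout targets; the thresholds are placeholders."""
    max_p95_latency_ms: float
    max_quality_drop: float          # allowed drop versus the higher-precision baseline
    max_cost_per_1k_requests: float

def evidence_gate(measured, targets):
    """Return the list of failed criteria; an empty list means the gate passes."""
    failures = []
    if measured["p95_latency_ms"] > targets.max_p95_latency_ms:
        failures.append("latency")
    if measured["baseline_quality"] - measured["fp8_quality"] > targets.max_quality_drop:
        failures.append("quality")
    if measured["cost_per_1k_requests"] > targets.max_cost_per_1k_requests:
        failures.append("cost")
    return failures

targets = Targets(max_p95_latency_ms=250.0, max_quality_drop=0.005,
                  max_cost_per_1k_requests=0.40)
measured = {  # made-up measurements for illustration only
    "p95_latency_ms": 180.0,
    "baseline_quality": 0.812,
    "fp8_quality": 0.801,
    "cost_per_1k_requests": 0.31,
}

failed = evidence_gate(measured, targets)
print("expand rollout" if not failed else f"pause rollout, close gaps: {failed}")
# pause rollout, close gaps: ['quality']
```

In this made-up run latency and cost pass, but the quality drop exceeds the target, so the gate reports a pause rather than an expansion.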
