Overview
FP8 is an 8-bit floating-point number format that lets AI models store weights and run computations using a quarter of the memory of standard 32-bit numbers. It is a key technique for making large models cheaper and faster to train and serve.
FP8 and other low-precision formats are a technical building block that affects model quality, infrastructure cost, latency, and reliability at scale.
Deep Dive
Neural networks are made of billions of numbers. Traditionally each of those numbers used 32 bits (FP32) or 16 bits (FP16/BF16). FP8 compresses them to just 8 bits, cutting memory and bandwidth roughly in half compared with 16-bit formats. There are two common FP8 layouts: E4M3 (4 exponent bits, 3 mantissa bits) offers more precision but a narrower range, and E5M2 (5 exponent bits, 2 mantissa bits) offers a wider range but coarser steps. The tradeoff is fidelity: fewer bits mean larger rounding errors. To stay accurate, frameworks apply per-tensor or per-block scaling that maps values into the usable FP8 range. NVIDIA's Hopper and Blackwell GPUs added hardware FP8 matrix engines, making the format practical for both training and inference. Newer formats such as MXFP8, MXFP4, and NVFP4 push precision even lower by using small blocks of values that share a scale factor.
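The range and step-size tradeoff between the two layouts can be worked out directly from the bit counts. The sketch below is illustrative only: `fp8_format_stats` is a hypothetical helper, and it assumes the widely used E4M3 variant that has no infinities and reserves a single bit pattern for NaN, while E5M2 is assumed to follow IEEE-style conventions.

```python
def fp8_format_stats(exp_bits, man_bits, ieee_like):
    """Return key magnitudes for a small floating-point layout.

    ieee_like=True reserves the top exponent code for inf/NaN (assumed for E5M2).
    ieee_like=False keeps the top exponent code for finite values and gives up
    only the all-ones mantissa pattern for NaN (assumed for E4M3).
    """
    bias = 2 ** (exp_bits - 1) - 1
    top_code = 2 ** exp_bits - 1
    if ieee_like:
        max_exp = (top_code - 1) - bias
        max_mantissa = 2.0 - 2.0 ** -man_bits
    else:
        max_exp = top_code - bias
        max_mantissa = 2.0 - 2.0 * 2.0 ** -man_bits
    return {
        "max": max_mantissa * 2.0 ** max_exp,           # largest finite value
        "min_normal": 2.0 ** (1 - bias),                 # smallest normal value
        "min_subnormal": 2.0 ** (1 - bias - man_bits),   # smallest nonzero value
        "step_at_1": 2.0 ** -man_bits,                   # spacing just above 1.0
    }


E4M3 = fp8_format_stats(4, 3, ieee_like=False)
E5M2 = fp8_format_stats(5, 2, ieee_like=True)

for name, stats in (("E4M3", E4M3), ("E5M2", E5M2)):
    print(name, stats)
```

Under these assumptions E4M3 tops out at 448 with a step of 0.125 just above 1.0, while E5M2 reaches 57344 with a step of 0.25, which is the precision-versus-range tradeoff described above.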
Technical Insight
The challenge with FP8 is dynamic range. With only a few exponent bits, very large activations overflow and very small ones underflow to zero. The fix is scaling: multiply a tensor by a factor so its values stay inside FP8's representable window, perform the multiplication in FP8, then divide the scale back out of the result, usually while accumulating partial sums in higher precision (FP16/FP32). E4M3 is typically used for weights and activations, and E5M2 for gradients, where range matters more than precision.
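The scale, multiply, and unscale sequence can be sketched in plain Python. This is a software simulation under stated assumptions, not how any particular framework or GPU implements it: `round_to_e4m3`, `amax_scale`, and `fp8_dot` are hypothetical names, the scale is chosen from the tensor's current absolute maximum, overflow saturates at 448, and a Python float stands in for the higher-precision accumulator.

```python
import math

E4M3_MAX = 448.0   # largest finite E4M3 value (assumed variant without infinities)
MAN_BITS = 3
MIN_EXP = -6       # exponent of the smallest normal E4M3 value


def round_to_e4m3(x):
    """Round x to the nearest value representable in E4M3, saturating at the max."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)                  # saturate instead of overflowing
    exp = max(math.frexp(mag)[1] - 1, MIN_EXP)   # clamp so subnormals share one step
    step = 2.0 ** (exp - MAN_BITS)
    return sign * round(mag / step) * step       # tiny values round down to zero


def amax_scale(values):
    """Per-tensor scale that maps the largest magnitude onto E4M3_MAX."""
    amax = max(abs(v) for v in values)
    return E4M3_MAX / amax if amax > 0 else 1.0


def fp8_dot(a, b):
    """Dot product with FP8-rounded inputs and a higher-precision accumulator."""
    sa, sb = amax_scale(a), amax_scale(b)
    qa = [round_to_e4m3(v * sa) for v in a]
    qb = [round_to_e4m3(v * sb) for v in b]
    acc = 0.0                                    # stands in for FP16/FP32 accumulation
    for x, y in zip(qa, qb):
        acc += x * y
    return acc / (sa * sb)                       # divide the scales back out


# Illustrative inputs: one tensor of tiny values, one of large values.
a = [0.001, -0.002, 0.0005, 0.003]
b = [1000.0, 2500.0, -800.0, 4000.0]

exact = sum(x * y for x, y in zip(a, b))
scaled = fp8_dot(a, b)
naive = sum(round_to_e4m3(x) * round_to_e4m3(y) for x, y in zip(a, b))
print(f"exact={exact:.4f} scaled_fp8={scaled:.4f} unscaled_fp8={naive:.4f}")
```

Without scaling, the small inputs collapse toward zero and the large ones saturate, so the unscaled result lands far from the exact value. Choosing the scale from the current absolute maximum is the simplest option; production frameworks may instead derive it from a history of recent maxima.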
Optimizing FP8 and Low-Precision Formats
To build a deeper understanding, treat FP8 and low-precision formats as an operating model rather than a single feature: define the desired outcomes, state your assumptions explicitly, and separate what the system can do reliably from what still requires expert judgment.
In practice, strong teams adopting FP8 and low-precision formats optimize architecture, data, and infrastructure choices against reliability and cost. They write clear success criteria, evaluate against real data and workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.
Architecture decisions drive performance and operating costs for years. At the same time, optimizing for a single benchmark can hide broader system weaknesses. The most robust approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.
Strategic Impact
Architecture decisions drive performance and operating costs for years.
Technical literacy helps teams choose the right stack, not just the newest one.
Better engineering choices reduce reliability incidents in production.
In high-quality deployments, each of these points translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.
Real-World Applications
Training large language models on NVIDIA Hopper/Blackwell GPUs in FP8 to nearly double throughput compared with BF16.
Serving chatbot inference in FP8 so the model fits on fewer GPUs and answers more requests per second.
Using E5M2 for gradient communication during distributed training to cut network bandwidth between nodes.
Deploying MXFP4/NVFP4-quantized models to fit a frontier-scale model on a single high-memory GPU for cheaper inference.
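The memory side of these applications is simple arithmetic: bits per stored weight, plus the overhead of any shared block scale. The sketch below is a rough estimate under stated assumptions. The parameter count is hypothetical, the helper names are made up for illustration, and the block sizes assumed are one 8-bit scale per 32 elements for the MX formats and one 8-bit scale per 16 elements for NVFP4. It counts weights only, ignoring activations, KV cache, and per-tensor scale factors.

```python
# (element bits, shared scale bits, elements per scale block)
# Block layouts are assumptions based on the public format descriptions.
FORMATS = {
    "FP32": (32, 0, 1),
    "BF16": (16, 0, 1),
    "FP8": (8, 0, 1),      # per-tensor scale, overhead treated as negligible
    "MXFP8": (8, 8, 32),
    "MXFP4": (4, 8, 32),
    "NVFP4": (4, 8, 16),
}


def bits_per_weight(fmt):
    """Average storage per weight, including the amortized block scale."""
    element_bits, scale_bits, block_size = FORMATS[fmt]
    return element_bits + scale_bits / block_size


def weight_gigabytes(num_params, fmt):
    """Weight storage in GB (10**9 bytes) for a model with num_params weights."""
    return num_params * bits_per_weight(fmt) / 8 / 1e9


NUM_PARAMS = 70e9  # hypothetical model size, for illustration only

for fmt in FORMATS:
    print(f"{fmt:6s} {bits_per_weight(fmt):5.2f} bits/weight "
          f"{weight_gigabytes(NUM_PARAMS, fmt):7.1f} GB")
```

The same per-element arithmetic applies to gradient communication: sending gradients in an 8-bit format moves roughly half the bytes of a 16-bit format for the same number of values.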
Implementation Patterns
FP8 and low-precision formats in action
Across all four applications listed above (FP8 training, FP8 inference serving, E5M2 gradient communication, and MXFP4/NVFP4 deployment), teams typically get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.
Risks & Guardrails
Optimizing for a single benchmark can hide broader system weaknesses.
Infrastructure and maintenance costs are often underestimated.
Security and observability gaps can widen as systems become more complex.
Getting Started Guide
Define latency, quality, and cost targets before rollout.
Benchmark under real load and data conditions.
Instrument monitoring for errors, drift, and user impact.
Prepare rollback and incident procedures before scaling.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
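One way to make the evidence gate concrete is a small check that compares a low-precision candidate against a higher-precision baseline. Everything in this sketch is hypothetical: the function name, the metric names, and the threshold values are placeholders rather than recommendations, and a real gate would use metrics that matter for the specific workload.

```python
def evidence_gate(baseline, candidate, targets):
    """Return a list of failed checks; an empty list means the gate passes."""
    failures = []

    quality_drop = baseline["quality"] - candidate["quality"]
    if quality_drop > targets["max_quality_drop"]:
        failures.append(f"quality dropped by {quality_drop:.3f}")

    if candidate["p95_latency_ms"] > targets["max_p95_latency_ms"]:
        failures.append(f"p95 latency {candidate['p95_latency_ms']} ms over target")

    if candidate["cost_per_1k_requests"] > targets["max_cost_per_1k_requests"]:
        failures.append("cost per 1k requests over target")

    return failures


# Placeholder numbers for illustration only, not measured results.
targets = {
    "max_quality_drop": 0.01,
    "max_p95_latency_ms": 400,
    "max_cost_per_1k_requests": 0.50,
}
bf16_baseline = {"quality": 0.820, "p95_latency_ms": 520, "cost_per_1k_requests": 0.80}
fp8_candidate = {"quality": 0.815, "p95_latency_ms": 350, "cost_per_1k_requests": 0.45}

failed = evidence_gate(bf16_baseline, fp8_candidate, targets)
print("expand rollout" if not failed else f"pause rollout: {failed}")
```

Returning the list of failed checks, rather than a single boolean, keeps a record of which gap has to be closed before usage expands.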