AUDIO AI GUIDE

Jasper and QuartzNet ASR

Overview

Jasper and QuartzNet are NVIDIA's end-to-end convolutional speech recognition models; QuartzNet is a much smaller, more efficient redesign of Jasper. They matter because they show how to reach strong accuracy with far fewer parameters, which makes the models practical to deploy.

Jasper and QuartzNet ASR sit within audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.

Deep Dive

Jasper ("Just Another Speech Recognizer"), released by NVIDIA in 2019, is a deep 1D convolutional network of up to 54 layers that maps mel-spectrogram features to characters and is trained with CTC loss. It introduced dense residual connections so that gradients flow cleanly through very deep stacks. QuartzNet, released the same year, kept Jasper's block structure but replaced the standard convolutions with time-channel separable convolutions, splitting each filter into a depthwise temporal convolution and a pointwise channel-mixing step. This factorization cut the parameter count from roughly 333 million in Jasper to about 19 million while keeping accuracy on LibriSpeech comparable. Both ship with NVIDIA's NeMo toolkit and support fast GPU training and real-time inference, which has made them popular building blocks for production ASR.
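Because both models are trained with CTC, each output frame carries a score for every character plus a special blank symbol, and those frames still have to be turned into text at inference time. The following is a minimal sketch of greedy CTC decoding, which is one common way to do that and not necessarily what any particular NeMo pipeline uses. The function name, the toy alphabet, and the choice of index 0 for the blank are assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, alphabet, blank_index=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per symbol.
    alphabet:     list of symbols; alphabet[blank_index] is the CTC blank.
    (Both the layout and blank_index=0 are illustrative assumptions.)
    """
    # 1. Pick the highest-scoring symbol index in every frame.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    # 2. Merge consecutive repeats of the same index.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Remove blanks; a blank between two equal symbols keeps both.
    return "".join(alphabet[i] for i in collapsed if i != blank_index)


def one_hot_frames(indices, size):
    """Build toy score frames that put all the mass on the given indices."""
    return [[1.0 if i == index else 0.0 for i in range(size)]
            for index in indices]


alphabet = ["_", "c", "a", "t"]  # "_" stands in for the CTC blank
frames = one_hot_frames([1, 1, 0, 2, 2, 0, 3], len(alphabet))
print(ctc_greedy_decode(frames, alphabet))  # prints "cat"
```

The blank is what lets CTC emit doubled letters: two identical symbols separated by a blank survive the merge step, while two adjacent identical frames collapse into one.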

Technical Insight

QuartzNet's efficiency comes from time-channel separable convolutions, the same idea behind MobileNet. A standard 1D convolution mixes time and channels together, costing K × C_in × C_out weights, where K is the kernel width and C_in and C_out are the input and output channel counts. Splitting it into a depthwise convolution over time and a pointwise 1x1 convolution over channels reduces the parameter count to K × C_in + C_in × C_out. Stacked in residual blocks and trained with CTC, this gives near-Jasper accuracy at a fraction of the model size and compute.
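The two weight counts above are easy to compare numerically. The sketch below does only the arithmetic and ignores biases and normalization parameters; the function names and the example values (kernel width 33, 256 channels in and out) are illustrative assumptions, not figures taken from this guide.

```python
def standard_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 1D convolution: K * C_in * C_out (no bias)."""
    return kernel_size * in_channels * out_channels


def separable_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a time-channel separable convolution (no bias)."""
    depthwise = kernel_size * in_channels    # one K-wide filter per input channel
    pointwise = in_channels * out_channels   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer shape, chosen only for the example.
K, C_IN, C_OUT = 33, 256, 256

standard = standard_conv1d_params(K, C_IN, C_OUT)
separable = separable_conv1d_params(K, C_IN, C_OUT)

print(standard)                        # prints 2162688
print(separable)                       # prints 73984
print(f"{standard / separable:.1f}x")  # prints 29.2x
```

For a wide kernel the pointwise term dominates the separable count, so the saving per layer approaches a factor of K. Whole-model savings are smaller than the per-layer figure because not every parameter sits in such a layer.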

Mastering Jasper and QuartzNet ASR

To build a deep understanding, treat Jasper and QuartzNet ASR as an operating model rather than a single feature: define the outcomes you want, make your assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams using Jasper and QuartzNet ASR treat quality, latency, and consent as equally important parts of the deployment strategy. They write clear success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into lasting capability across product, policy, and operations.

ASR improves accessibility through transcription, narration, and voice interfaces. At the same time, the risks of voice misuse and impersonation rise when consent is missing. The strongest approach pairs experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

ASR improves accessibility through transcription, narration, and voice interfaces.

Media teams can ship polished audio faster and on smaller budgets.

Customer-facing systems can process spoken interactions at scale.

In high-quality deployments, each of these benefits translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of Jasper and QuartzNet ASR

QuartzNet's separable-convolution lineage led directly to NVIDIA's Citrinet, and related convolutional ideas appear in the widely used Conformer models, which add self-attention to capture global context alongside local convolutions. Expect continued movement toward hybrid convolution-plus-attention architectures and transducer (RNN-T) decoders for streaming. The core lesson, that parameter-efficient convolutions suit edge and real-time serving, remains central as ASR moves onto phones, cars, and embedded devices.

Real-World Applications

Real-time transcription and voice assistants deployed on NVIDIA GPUs with the NeMo toolkit

Edge and embedded ASR, where QuartzNet's small footprint fits memory-constrained devices

Fine-tuning pretrained QuartzNet checkpoints for domain-specific vocabulary such as medical or legal terms

Call-center analytics that transcribe large volumes of audio quickly and cheaply

Usage Patterns

The same pattern holds across real-time transcription and voice assistants, edge and embedded ASR, domain-specific fine-tuning, and call-center analytics: teams generally get better results when they define the quality bar up front, keep a human escalation path for edge cases, and track both productivity gains and the cost of errors over time.

Risks & Guardrails

Voice misuse and impersonation risks increase when consent is missing.

Accuracy can drop across accents, dialects, or noisy environments.

Synthetic audio can be mistaken for genuine speech without clear labeling.

Getting Started Guide

1. Obtain explicit consent for voice capture, synthesis, and reuse.

2. Test quality across speakers and background conditions.

3. Define when a human must review or approve the output.

4. Label synthetic audio and keep traceable records for accountability.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
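Step 2 in the list above, testing quality across speakers and background conditions, is usually made concrete by reporting word error rate (WER) separately for each condition instead of as one overall number. The sketch below assumes you already have reference transcripts and model output as plain strings; the function names, the condition labels, and the sample sentences are hypothetical.

```python
from collections import defaultdict


def word_edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    ref, hyp = reference.split(), hypothesis.split()
    previous = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1,     # deletion
                               current[j - 1] + 1,  # insertion
                               substitution))
        previous = current
    return previous[-1]


def wer_by_condition(samples):
    """Aggregate WER per condition.

    samples: iterable of (condition, reference, hypothesis) tuples.
    Returns {condition: total_errors / total_reference_words}.
    """
    errors, words = defaultdict(int), defaultdict(int)
    for condition, reference, hypothesis in samples:
        errors[condition] += word_edit_distance(reference, hypothesis)
        words[condition] += len(reference.split())
    return {c: errors[c] / words[c] for c in words if words[c] > 0}


# Hypothetical evaluation samples, grouped by recording condition.
samples = [
    ("quiet", "turn on the light", "turn on the light"),
    ("noisy", "turn on the light", "turn the night"),
]
print(wer_by_condition(samples))  # prints {'quiet': 0.0, 'noisy': 0.5}
```

Summing errors and reference words per condition before dividing weights long utterances more heavily than short ones, which is the usual convention for corpus-level WER. A large gap between conditions is the kind of evidence that should stop the rollout at this gate.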
