Overview
DeepSpeech is an end-to-end speech recognition model introduced by Baidu in 2014. It maps audio features directly to text using a recurrent neural network trained with CTC loss. It helped shift ASR away from complex, hand-engineered pipelines and toward learned, data-driven systems.
DeepSpeech Architecture sits within audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.
Deep Dive
Classical speech recognizers combined separate acoustic models, pronunciation dictionaries, and language models, each with hand-tuned components. DeepSpeech replaced most of that with a single neural network trained end to end. The architecture takes spectrogram or MFCC features computed over short audio frames and passes them through several fully connected layers, a bidirectional recurrent layer that captures context from both past and future frames, and an output layer that produces a probability distribution over characters at each time step. Crucially, it uses Connectionist Temporal Classification (CTC), which lets the network learn the alignment between audio and text without frame-level labels. Mozilla later released a popular open-source implementation (newer versions use an LSTM-based, streamable design), which made the approach widely accessible.
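The layer stack described above can be sketched in plain Python. This is a minimal, untrained illustration rather than the Baidu or Mozilla implementation: the layer sizes, the clipping value, the random initialization, the choice to concatenate the two recurrent directions, and all function names are assumptions made for the example. It only shows how per-frame features flow through fully connected layers, a bidirectional recurrent layer, and a softmax over characters.

```python
import math
import random

def dense(x, W, b):
    """Affine map: one output per row of W."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]

def clipped_relu(x, cap=20.0):
    """ReLU clipped at `cap` (the cap value is an assumption)."""
    return [min(max(v, 0.0), cap) for v in x]

def softmax(x):
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def recurrent_pass(frames, W_in, b, W_rec, reverse=False):
    """Simple recurrent layer run over time in one direction."""
    h = [0.0] * len(b)
    outputs = []
    for x in (reversed(frames) if reverse else frames):
        from_input = dense(x, W_in, b)
        from_state = dense(h, W_rec, [0.0] * len(b))
        h = clipped_relu([a + c for a, c in zip(from_input, from_state)])
        outputs.append(h)
    # Re-order the backward pass so that index t refers to frame t again.
    return outputs[::-1] if reverse else outputs

def build_params(n_features, n_hidden, n_chars, seed=0):
    rng = random.Random(seed)
    fc = [init_layer(n_hidden, n_features, rng)]
    fc += [init_layer(n_hidden, n_hidden, rng) for _ in range(2)]

    def rnn():
        W_in, b = init_layer(n_hidden, n_hidden, rng)
        W_rec, _ = init_layer(n_hidden, n_hidden, rng)
        return W_in, b, W_rec

    return {"fc": fc, "fwd": rnn(), "bwd": rnn(),
            "out": init_layer(n_chars, 2 * n_hidden, rng)}

def forward(frames, params):
    """Return one probability distribution over characters per frame."""
    hidden = frames
    for W, b in params["fc"]:  # frame-wise fully connected layers
        hidden = [clipped_relu(dense(x, W, b)) for x in hidden]
    fwd = recurrent_pass(hidden, *params["fwd"])                # past context
    bwd = recurrent_pass(hidden, *params["bwd"], reverse=True)  # future context
    W_out, b_out = params["out"]
    # f + g concatenates the two directions (an assumption of this sketch).
    return [softmax(dense(f + g, W_out, b_out)) for f, g in zip(fwd, bwd)]

# Hypothetical alphabet: "_" stands for the CTC blank symbol.
ALPHABET = "_ abcdefghijklmnopqrstuvwxyz'"
rng = random.Random(1)
frames = [[rng.random() for _ in range(13)] for _ in range(5)]  # 5 frames, 13 features
probs = forward(frames, build_params(13, 8, len(ALPHABET)))
```

Each row of `probs` is a distribution over the alphabet for one frame. A trained model would produce the same shape of output; only the weights would differ.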
Technical Insight
The key ingredient is the CTC loss. Speech and text do not line up frame by frame, so CTC introduces a "blank" symbol and sums over every possible alignment that collapses to the target transcription. This lets the model emit one character (or a blank) per time step and learn automatically where sounds map to characters. The bidirectional RNN gives each prediction access to the surrounding acoustic context, and an external n-gram language model is often added at decoding time to improve spelling and word choice.
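To make "sums over every possible alignment" concrete, the sketch below computes the probability of a target string in two ways: by brute-force enumeration of all alignments, and by the standard CTC forward recursion. The three-symbol alphabet, the probability table, and the function names are made up for illustration. The CTC loss for a training example is the negative logarithm of this probability.

```python
from itertools import product

BLANK = "_"

def collapse(path):
    """CTC collapsing rule: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return "".join(out)

def ctc_prob_brute_force(probs, alphabet, target):
    """Sum the probability of every alignment that collapses to `target`."""
    total = 0.0
    for path in product(range(len(alphabet)), repeat=len(probs)):
        if collapse(alphabet[i] for i in path) == target:
            p = 1.0
            for t, i in enumerate(path):
                p *= probs[t][i]
            total += p
    return total

def ctc_prob_forward(probs, alphabet, target):
    """Same quantity via the CTC forward (dynamic programming) recursion."""
    index = {c: i for i, c in enumerate(alphabet)}
    # Interleave blanks around the target: "ab" -> _ a _ b _
    ext = [BLANK]
    for c in target:
        ext += [c, BLANK]
    alpha = [0.0] * len(ext)
    alpha[0] = probs[0][index[BLANK]]
    if len(ext) > 1:
        alpha[1] = probs[0][index[ext[1]]]
    for t in range(1, len(probs)):
        new = [0.0] * len(ext)
        for s, sym in enumerate(ext):
            a = alpha[s]                      # stay on the same symbol
            if s >= 1:
                a += alpha[s - 1]             # advance by one position
            # Skipping the blank between two characters is allowed
            # only when the characters differ.
            if s >= 2 and sym != BLANK and sym != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][index[sym]]
        alpha = new
    return alpha[-1] + (alpha[-2] if len(ext) > 1 else 0.0)

# Made-up per-frame distributions over the alphabet "_ab" (3 frames).
alphabet = "_ab"
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]
p_slow = ctc_prob_brute_force(probs, alphabet, "ab")
p_fast = ctc_prob_forward(probs, alphabet, "ab")
```

The brute-force version grows exponentially with the number of frames and is only useful as a check. The forward recursion is what makes the loss practical to compute on real utterances.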
Mastering DeepSpeech Architecture
To build a deep understanding, treat DeepSpeech Architecture as an operating model rather than a single feature: define the desired outcomes, make assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.
In practice, strong teams that use DeepSpeech Architecture treat quality, latency, and consent as equally important parts of the deployment strategy. They write clear success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns instead of chasing one-time benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.
Speech technology of this kind improves accessibility through transcription, narration, and voice interfaces. At the same time, the risks of voice misuse and impersonation rise when consent is missing. The strongest approach pairs speed of experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.
Strategic Impact
Improves accessibility through transcription, narration, and voice interfaces.
Media teams can ship polished audio faster and on smaller budgets.
Customer-facing systems can process spoken interactions at scale.
In high-quality deployments, each of these translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.
Real-World Applications
Offline, on-device voice command recognition for privacy-focused applications, using Mozilla's open-source DeepSpeech
Generating draft transcripts of podcasts or lectures without relying on a cloud service
Teaching the fundamentals of end-to-end ASR and the CTC loss in university machine learning courses
Building custom voice interfaces for IoT or embedded devices where a lightweight, streamable recognizer is needed
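For the streaming use case in the last item, one property worth seeing in code is that greedy CTC decoding can run chunk by chunk as audio arrives, as long as the decoder remembers the last symbol it saw. The class below is a hypothetical sketch, not the API of Mozilla's DeepSpeech package. The alphabet, the one-hot helper, and the chunk boundary are assumptions chosen so that the split falls in the middle of a repeated character.

```python
BLANK = "_"

class StreamingGreedyDecoder:
    """Greedy CTC decoding that accepts frames in chunks (illustrative only)."""

    def __init__(self, alphabet):
        self.alphabet = alphabet
        self.prev = None   # best symbol of the last frame, carried across chunks
        self.text = []

    def feed(self, chunk):
        """Consume per-frame distributions and return the text decoded so far."""
        for frame in chunk:
            best_index = max(range(len(frame)), key=frame.__getitem__)
            best = self.alphabet[best_index]
            # Emit only when the symbol changes and is not the blank.
            if best != self.prev and best != BLANK:
                self.text.append(best)
            self.prev = best
        return "".join(self.text)

def one_hot_frames(path, alphabet):
    """Helper for the demo: turn a symbol path into one-hot distributions."""
    return [[1.0 if c == sym else 0.0 for c in alphabet] for sym in path]

alphabet = "_ehlo"
decoder = StreamingGreedyDecoder(alphabet)
# The full path is "hh_e_ll_lo"; the split lands between the two repeated l's.
partial = decoder.feed(one_hot_frames("hh_e_l", alphabet))
final = decoder.feed(one_hot_frames("l_lo", alphabet))
```

Because `prev` survives between calls, the repeated `l` that straddles the chunk boundary is merged exactly as it would be in a single pass. Producing the per-frame distributions in a streaming fashion additionally requires a model without a backward-in-time recurrent pass, which is the motivation for the unidirectional design mentioned earlier.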
Usage Patterns
DeepSpeech Architecture in practice
Whether the goal is offline voice commands, draft transcripts, teaching, or embedded voice interfaces, teams typically get better results when they define the quality bar up front, keep a human escalation path for edge cases, and track both productivity gains and the cost of errors over time.
Risks & Guardrails
The risks of voice misuse and impersonation rise when consent is missing.
Accuracy can drop across accents, dialects, or noisy environments.
Synthetic audio can be mistaken for genuine speech without clear labeling.
Implementation Guide
Obtain explicit consent for voice capture, synthesis, and reuse.
Test quality across speakers and background conditions.
Define when a human must review or approve outputs.
Label synthetic audio and keep traceable records for accountability.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
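The quality-testing step and the evidence-gate idea can be combined in a small check: compute word error rate (WER) separately for each speaker group or recording condition, then block the rollout if any group misses the bar. The sketch below is one way to do that. The group names, the sample transcripts, and the 0.25 threshold are hypothetical values chosen for illustration, not recommended settings.

```python
from collections import defaultdict

def word_edits(ref_words, hyp_words):
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        cur = [i]
        for j, h in enumerate(hyp_words, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]

def wer_by_group(samples):
    """samples: iterable of (group, reference_text, hypothesis_text)."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref_words = reference.split()
        errors[group] += word_edits(ref_words, hypothesis.split())
        words[group] += len(ref_words)
    return {g: errors[g] / max(words[g], 1) for g in errors}

def failing_groups(wer, max_wer):
    """Groups whose WER exceeds the bar; an empty list means the gate passes."""
    return sorted(g for g, value in wer.items() if value > max_wer)

# Hypothetical evaluation samples for two recording conditions.
samples = [
    ("quiet", "turn on the light", "turn on the light"),
    ("noisy", "turn on the light", "turn the night"),
]
wer = wer_by_group(samples)
blocked = failing_groups(wer, max_wer=0.25)  # 0.25 is an illustrative threshold
```

Reporting WER per group rather than as a single average keeps a weak result on one accent or noise condition from being hidden by strong results elsewhere.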