AUDIO AI GUIDE

WaveGlow Flow-based Vocoder

WaveGlow is a flow-based neural vocoder from NVIDIA that synthesizes speech waveforms from mel-spectrograms in a single pass, without autoregression.

Overview

WaveGlow matters because it delivers high-quality audio faster than real time while training with only a single, simple likelihood loss.

The WaveGlow flow-based vocoder sits inside audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.

Deep Dive

WaveGlow, published by Prenger, Valle, and Catanzaro at NVIDIA in 2018, combines ideas from Glow and WaveNet to build a vocoder that is fast and simple to train. Unlike GAN vocoders, it is a normalizing flow: it learns an invertible mapping between a simple Gaussian distribution and the audio waveform, conditioned on a mel-spectrogram. Training maximizes the log-likelihood of the data, so it needs no separate discriminator, no autoregression, and none of the two-network teacher-student distillation that earlier WaveNet-derived methods required. To generate audio, you sample Gaussian noise and run the invertible network in reverse. WaveGlow produces speech of quality comparable to WaveNet while synthesizing much faster than real time on a modern GPU.
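The maximum-likelihood objective described above can be written with the standard change-of-variables formula for normalizing flows. What follows is a sketch in notation chosen here for illustration; the symbols f_i, K, sigma, c, and h_i are assumptions and do not come from this guide. Here x is a waveform segment, c is its mel-spectrogram, and z is the latent reached by passing x through K invertible steps.

```latex
z = f_K \circ \cdots \circ f_1(x;\, c), \qquad z \sim \mathcal{N}(0, \sigma^2 I)

\log p_\theta(x \mid c)
  = \log \mathcal{N}\!\left(z;\, 0, \sigma^2 I\right)
  + \sum_{i=1}^{K} \log \left| \det \frac{\partial f_i(h_{i-1};\, c)}{\partial h_{i-1}} \right|,
  \qquad h_0 = x, \quad h_i = f_i(h_{i-1};\, c)
```

Under these assumptions, training minimizes the negative of this quantity over the dataset, and synthesis draws z from the Gaussian and applies the inverses in the opposite order, f_1^{-1} through f_K^{-1} with f_K^{-1} applied first.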

Technical Insight

WaveGlow stacks invertible flow steps, each combining an affine coupling layer with an invertible 1x1 convolution borrowed from Glow. Audio samples are grouped into vectors by a squeeze operation so that the coupling layers can transform them efficiently. Because every step is invertible, the forward pass computes the exact likelihood used for training, and the inverse pass maps noise to audio for synthesis. A single network and a single negative log-likelihood objective keep training stable and simple.
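To make the squeeze and affine coupling ideas concrete, here is a minimal pure-Python sketch. It is not NVIDIA's implementation: `toy_conditioner` is a hypothetical stand-in for the learned network that predicts scale and shift, the mel values are made up, and the invertible 1x1 convolution is left out for brevity. It only illustrates why the step can be inverted exactly and why its log-determinant is cheap to compute.

```python
import math


def squeeze(samples, group_size):
    """Group consecutive audio samples into vectors of length group_size."""
    usable = len(samples) - len(samples) % group_size  # drop a ragged tail
    return [samples[i:i + group_size] for i in range(0, usable, group_size)]


def toy_conditioner(x_a, mel_frame):
    """Hypothetical stand-in for the learned network. It may only look at
    x_a and the mel conditioning, never at the half being transformed."""
    h = math.tanh(sum(x_a) + sum(mel_frame))
    log_s = [0.5 * h + 0.1 * k for k in range(len(x_a))]
    t = [0.2 * h - 0.05 * k for k in range(len(x_a))]
    return log_s, t


def coupling_forward(x, mel_frame):
    """Audio-to-latent direction. Returns the output vector and log|det J|.
    Assumes len(x) is even so both halves have the same length."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = toy_conditioner(x_a, mel_frame)
    y_b = [math.exp(ls) * xb + tt for ls, xb, tt in zip(log_s, x_b, t)]
    # The Jacobian is triangular, so its log-determinant is just sum(log_s).
    return x_a + y_b, sum(log_s)


def coupling_inverse(y, mel_frame):
    """Latent-to-audio direction: undoes coupling_forward."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s, t = toy_conditioner(y_a, mel_frame)  # y_a equals x_a, so same scale/shift
    x_b = [(yb - tt) * math.exp(-ls) for ls, yb, tt in zip(log_s, y_b, t)]
    return y_a + x_b


audio = [0.10, -0.20, 0.05, 0.30, -0.15, 0.25, 0.00, -0.10, 0.40]
mel_frame = [0.3, -0.1]  # made-up conditioning values
groups = squeeze(audio, group_size=4)  # two vectors of 4; the 9th sample is dropped
latents = [coupling_forward(g, mel_frame) for g in groups]
restored = [coupling_inverse(y, mel_frame) for y, _ in latents]
max_error = max(abs(a - b) for g, r in zip(groups, restored) for a, b in zip(g, r))
```

Because the first half passes through unchanged, the inverse can recompute the same scale and shift without ever inverting the conditioner itself. That is the property that lets a coupling layer use an arbitrarily complex network while staying invertible.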

Mastering WaveGlow Flow-based Vocoder

To build a deep understanding, treat WaveGlow as an operating model rather than a single feature: define the outcomes you want, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams that use WaveGlow treat quality, latency, and consent as equally important parts of the deployment strategy. They write explicit success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns rather than on one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Audio AI improves accessibility through captioning, narration, and voice interfaces. At the same time, the risks of voice misuse and impersonation rise when consent is missing. The strongest approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and update safeguards continuously as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

It improves accessibility through captioning, narration, and voice interfaces.

Media teams can ship polished audio faster and on smaller budgets.

Customer-facing systems can process spoken interactions at scale.

In high-quality deployments, each of these translates into measurable operating rules, ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of WaveGlow Flow-based Vocoder

WaveGlow showed that flow-based vocoders can rival autoregressive quality, and it influenced later flow and flow-matching audio models. Its single-loss simplicity remains attractive, although GAN vocoders such as HiFi-GAN now usually win on model size and speed. Looking ahead, flow-based and flow-matching ideas are returning in modern, diffusion-adjacent TTS, and WaveGlow-style invertible designs continue to inform research on exact likelihood, controllability, and efficient waveform generation.

Real-World Applications

Paired with Tacotron 2 in NVIDIA's reference TTS pipeline to produce natural, studio-quality speech

Fast GPU speech synthesis for narration, dubbing, and content-creation workflows

Generating training and demo audio in research settings where stable, single-loss training is preferred

Real-time voice output in interactive systems running on NVIDIA hardware

Implementation Patterns

WaveGlow Flow-based Vocoder in action

The same guidance applies to each of the four applications listed above (the Tacotron 2 pairing, fast GPU synthesis, research audio generation, and real-time voice output): teams generally get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.
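For the real-time voice output case, one way to turn "define thresholds up front" into something measurable is the real-time factor: synthesis time divided by the duration of the audio produced. The sketch below is illustrative only. The function names, the `max_rtf` budget, and the example measurements are assumptions, not values from WaveGlow or from this guide.

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate_hz):
    """Synthesis time divided by the duration of the generated audio.
    A value below 1.0 means synthesis ran faster than real time."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("num_samples and sample_rate_hz must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds


def meets_latency_budget(rtf, max_rtf=0.5):
    """Hypothetical gate: max_rtf is a team-chosen threshold, not a WaveGlow constant."""
    return rtf <= max_rtf


# Made-up measurement: 0.25 s spent synthesizing 44,100 samples at 22,050 Hz.
rtf = real_time_factor(synthesis_seconds=0.25, num_samples=44_100, sample_rate_hz=22_050)
within_budget = meets_latency_budget(rtf)
```

Tracking this number per release, alongside a listening-quality check, gives the "productivity gains and error costs" comparison a concrete baseline.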

Risks & Guardrails

Voice misuse and impersonation risks rise when consent is missing.

Accuracy can drop across accents, dialects, and noisy environments.

Synthetic audio can be mistaken for genuine speech when it is not clearly labeled.

Getting Started Guide

1. Obtain explicit consent for voice capture, synthesis, and reuse.

2. Evaluate quality across speakers and background conditions.

3. Define when a human must review or approve outputs.

4. Label synthetic audio and keep traceable records for accountability.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
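As one possible shape for the labeling and record-keeping step, the sketch below builds a small provenance record for a piece of generated audio. Every field name, the `build_provenance_record` helper, and the example identifiers are hypothetical. This is not a standard schema, and a real deployment would need to align it with its own legal and audit requirements.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_provenance_record(audio_bytes, speaker_consent_id, model_name, reviewer=None):
    """Return a JSON-serializable record that labels audio as synthetic.
    Field names are illustrative only."""
    if not speaker_consent_id:
        # Step 1 acts as a gate: no consent reference, no record.
        raise ValueError("refusing to label audio without a consent reference")
    return {
        "synthetic": True,
        "model": model_name,
        "consent_id": speaker_consent_id,
        "reviewed_by": reviewer,  # None means no human has approved it yet
        "audio_sha256": hashlib.sha256(audio_bytes).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


# Placeholder bytes stand in for real PCM audio.
record = build_provenance_record(b"\x00\x01fake-pcm-bytes", "consent-0001", "waveglow-demo")
record_json = json.dumps(record, sort_keys=True)
```

Storing the hash rather than the audio keeps the record small while still letting an auditor confirm that a given file matches the logged output.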
