WaveGlow Flow-Based Vocoder

WaveGlow is a flow-based neural vocoder from NVIDIA that synthesizes speech waveforms from mel-spectrograms in a single pass, without autoregression.

Overview

WaveGlow generates audio in a single non-autoregressive pass. This matters because it produces high-quality audio faster than real time while being trained with only one simple loss.

The WaveGlow flow-based vocoder sits within audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.

Deep Dive

WaveGlow, published by Prenger, Valle, and Catanzaro at NVIDIA in 2018, combines ideas from Glow and WaveNet to build a vocoder that is fast and simple to train. Unlike GAN vocoders, it is a normalizing flow: it learns an invertible mapping between a simple Gaussian distribution and the audio waveform, conditioned on a mel-spectrogram. Training maximizes the exact log-likelihood of the data, so it needs no separate discriminator, no autoregression, and none of the two-network teacher-student distillation that earlier parallel WaveNet approaches required. To generate audio, you sample Gaussian noise and run the invertible network in reverse. WaveGlow produces speech of quality comparable to WaveNet while synthesizing faster than real time on a modern GPU.
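The exact log-likelihood described above follows from the change-of-variables formula. The expression below is a sketch in the form commonly given for WaveGlow; the symbols are notation introduced here for illustration (z(x) is the latent produced by passing audio x through the flow, σ is the standard deviation of the Gaussian prior, s_j are the affine coupling scales, W_k are the 1x1 convolution weight matrices), and an additive constant is omitted.

```latex
\log p_\theta(x \mid \mathrm{mel})
  = -\frac{z(x)^{\top} z(x)}{2\sigma^{2}}
    + \sum_{j} \log s_j(x, \mathrm{mel})
    + \sum_{k} \log \bigl\lvert \det W_k \bigr\rvert
```

Each sum is understood to run over all flow steps and all time positions. Training minimizes the negative of this quantity, which is why a single loss term is enough.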

Technical Insight

WaveGlow stacks invertible flow steps, each combining an affine coupling layer with an invertible 1x1 convolution borrowed from Glow. Audio samples are grouped into vectors by a squeeze operation so that the coupling layers can transform them efficiently. Because every step is invertible, the forward pass yields the likelihood used for training and the reverse pass turns noise into audio at inference. A single network and a single negative log-likelihood objective make training stable and simple.
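To make the invertibility argument concrete, here is a minimal NumPy sketch of one flow step: squeeze, an invertible 1x1 convolution (a matrix applied across the channel axis), and an affine coupling layer. It is a toy under stated assumptions, not NVIDIA's implementation: `toy_conditioner` is a hypothetical stand-in for the WaveNet-like network, the mel-spectrogram is assumed to be already aligned to one frame per audio group, and all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def squeeze(audio, n_group=8):
    """Group consecutive samples into vectors: (T,) -> (n_group, T // n_group)."""
    return audio.reshape(-1, n_group).T

def unsqueeze(x):
    """Inverse of squeeze: (n_group, frames) -> (T,)."""
    return x.T.reshape(-1)

def toy_conditioner(x_a, mel, params):
    """Hypothetical stand-in for the WaveNet-like network.
    Sees only the untouched half x_a and the mel frames."""
    h = np.tanh(params["w_x"] @ x_a + params["w_m"] @ mel)
    return params["w_s"] @ h, params["w_t"] @ h  # log-scale, shift

def flow_step_forward(x, mel, W, params):
    """Audio -> latent direction; also returns the log-determinant."""
    x = W @ x                                   # invertible 1x1 convolution
    half = x.shape[0] // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = toy_conditioner(x_a, mel, params)
    z_b = np.exp(log_s) * x_b + t               # affine coupling on second half
    log_det = log_s.sum() + x.shape[1] * np.linalg.slogdet(W)[1]
    return np.concatenate([x_a, z_b]), log_det

def flow_step_inverse(z, mel, W, params):
    """Latent -> audio direction: undo the coupling, then undo the 1x1 conv."""
    half = z.shape[0] // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = toy_conditioner(z_a, mel, params)  # same inputs as forward
    x_b = (z_b - t) * np.exp(-log_s)
    return np.linalg.solve(W, np.concatenate([z_a, x_b]))

# Illustrative sizes only.
n_group, n_mel, hidden, frames = 8, 4, 16, 32
audio = rng.standard_normal(n_group * frames)
mel = rng.standard_normal((n_mel, frames))
params = {
    "w_x": 0.1 * rng.standard_normal((hidden, n_group // 2)),
    "w_m": 0.1 * rng.standard_normal((hidden, n_mel)),
    "w_s": 0.1 * rng.standard_normal((n_group // 2, hidden)),
    "w_t": 0.1 * rng.standard_normal((n_group // 2, hidden)),
}
W = np.linalg.qr(rng.standard_normal((n_group, n_group)))[0]  # orthogonal init

z, log_det = flow_step_forward(squeeze(audio, n_group), mel, W, params)
recon = unsqueeze(flow_step_inverse(z, mel, W, params))
max_err = np.abs(recon - audio).max()
```

The coupling layer is invertible because the conditioner only ever sees the half of the vector that passes through unchanged, so the same scale and shift can be recomputed in the reverse direction. The 1x1 convolution mixes channels between steps so that every sample is eventually transformed.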

Mastering WaveGlow Flow-Based Vocoder

To build a deeper understanding, treat the WaveGlow flow-based vocoder as a deployment pattern rather than a single artifact: define the requirements, make the assumptions explicit, and separate what can be reliably automated from what still needs expert judgment.

In practice, strong teams deploying WaveGlow treat quality, latency, and consent as equally important parts of the deployment plan. They write down explicit success criteria, test against real data and real workflows, and iterate on observed failure modes rather than on one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Audio AI of this kind improves accessibility through transcription, narration, and voice interfaces. At the same time, the risks of voice misuse and impersonation grow when consent is missing. A sound approach pairs rapid experimentation with governance: run pilots, collect evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Improves accessibility through transcription, narration, and voice interfaces.

Media teams can ship polished audio faster and on smaller budgets.

Customer-facing systems can automate voice interactions at scale.

In high-stakes deployments, each of these translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale trust rather than ambiguity.

The Future of WaveGlow Flow-Based Vocoder

WaveGlow showed that pure flow vocoders can rival autoregressive quality, and it influenced later flow-based and flow-related audio models. Its single-loss simplicity remains appealing, even though GAN vocoders such as HiFi-GAN now usually win on model size and speed. Looking ahead, flow-based and flow-matching ideas are resurfacing in modern diffusion-oriented TTS, and WaveGlow-style invertible designs continue to inform research on exact-likelihood, controllable, and efficient waveform generation.

Real-World Implementation

Pairing with Tacotron 2 in NVIDIA's reference TTS pipeline to produce natural, studio-quality speech

Fast GPU speech synthesis for narration, dubbing, and content-creation workflows

Generating training and demo audio in research settings where stable, single-loss training is preferred

Real-time-capable voice output in interactive systems running on NVIDIA hardware
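Whether a deployment counts as "real-time capable" can be checked with a real-time factor: synthesis time divided by the duration of the audio produced. The sketch below shows the arithmetic only; the function name and the numbers are hypothetical and are not measurements of WaveGlow.

```python
def real_time_factor(num_samples, sample_rate_hz, synthesis_seconds):
    """Synthesis time divided by audio duration; below 1.0 is faster than real time."""
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds

# Hypothetical run: 10 s of 22.05 kHz audio synthesized in 0.5 s.
rtf = real_time_factor(num_samples=220_500, sample_rate_hz=22_050, synthesis_seconds=0.5)
is_real_time = rtf < 1.0
```

For interactive systems, the measured synthesis time should include mel-spectrogram generation and any data transfer to and from the GPU, not only the vocoder pass.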

Use Cases

WaveGlow Flow-Based Vocoder in practice

Whether the use case is the Tacotron 2 pairing in NVIDIA's reference TTS pipeline, fast GPU synthesis for narration and dubbing, research audio generation with stable single-loss training, or real-time voice output in interactive systems, teams usually get better results when they define quality criteria up front, keep a human escalation path for edge cases, and track both the product gains and the cost of errors over time.

Risks & Guardrails

Voice misuse and impersonation risks increase when consent is missing.

Quality can drop across languages, accents, or noisy environments.

Synthetic audio can be mistaken for genuine speech when it is not clearly labeled.

Implementation Roadmap

1. Obtain explicit consent for voice capture, synthesis, and reuse.

2. Test quality across different speakers and background conditions.

3. Define when a human must review or approve the output.

4. Label synthetic audio and keep records for accountability.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.

Keep Exploring