AUDIO AI GUIDE

AudioLM

AudioLM is a Google research framework that generates realistic audio, such as speech or piano music, by treating audio as a language and predicting it token by token.

Overview

AudioLM matters because it showed that coherent, natural-sounding audio can be generated without text transcripts or musical scores.

AudioLM sits within audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.

Deep Dive

Introduced by Google in 2022, AudioLM reframes audio generation as a language modeling problem: it converts raw waveforms into discrete tokens and then predicts the next token, just as a text model predicts the next word. Its key trick is a hierarchy of token types. "Semantic" tokens (from a model such as w2v-BERT) capture long-term structure (phonetics, syntax, melody), while "acoustic" tokens (from the SoundStream neural codec) capture fine details such as speaker identity, timbre, and recording conditions. By predicting the semantic tokens first and then conditioning the acoustic tokens on them, AudioLM produces continuations that stay coherent for many seconds while preserving the original voice or instrument. Given a few seconds of speech, it continues speaking in the same voice; given a piano prompt, it improvises in the same style.
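The two-stage idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the hand-written bigram table `SEMANTIC_LM` stands in for a trained Transformer, the three-token vocabulary and the `voice_id` encoding are invented, and the acoustic stage is simplified to one token per semantic token rather than being autoregressive itself. It only shows the ordering: continue the semantic tokens first, then derive acoustic tokens conditioned on them and on the prompt's voice.

```python
import random

# Hypothetical stand-in for the semantic-stage language model: a bigram table
# giving P(next semantic token | previous semantic token) over 3 token ids.
SEMANTIC_LM = {
    0: {0: 0.1, 1: 0.8, 2: 0.1},
    1: {0: 0.1, 1: 0.1, 2: 0.8},
    2: {0: 0.8, 1: 0.1, 2: 0.1},
}

def sample_next(prev, rng):
    """Sample the next semantic token given the previous one."""
    candidates, weights = zip(*SEMANTIC_LM[prev].items())
    return rng.choices(candidates, weights=weights, k=1)[0]

def continue_semantic(prompt, steps, rng):
    """Stage 1: autoregressively extend the semantic token sequence."""
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(sample_next(tokens[-1], rng))
    return tokens

def generate_acoustic(semantic, voice_id, rng):
    """Stage 2 (simplified): one acoustic token per semantic token.

    Each acoustic token encodes the voice id, the semantic token it is
    conditioned on, and a small random "fine detail" component.
    """
    return [voice_id * 100 + s * 10 + rng.randrange(4) for s in semantic]

rng = random.Random(0)
semantic = continue_semantic([0, 1], steps=5, rng=rng)
acoustic = generate_acoustic(semantic, voice_id=3, rng=rng)
print(semantic)
print(acoustic)
```

Because every acoustic token carries the same `voice_id`, the continuation keeps the prompt's "voice" while the semantic stage decides what comes next, which mirrors the division of labor described above.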

Technical Insight

AudioLM is trained on audio only, with no transcripts. SoundStream compresses audio into acoustic tokens using residual vector quantization, while w2v-BERT provides the coarse semantic tokens. A stack of Transformer language models predicts tokens in stages: semantic tokens first for structure, then coarse acoustic tokens, then fine acoustic tokens for high-fidelity reconstruction. The SoundStream decoder finally converts the predicted tokens into a waveform, producing audio that keeps the speaker's voice and prosody intact.
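Residual vector quantization can be sketched in a few lines. This is a minimal illustration, not SoundStream's implementation: the two hand-written `CODEBOOKS` and the 2-dimensional `frame` are assumptions (a real codec learns its codebooks and quantizes encoder outputs, not raw numbers). Each stage quantizes whatever the previous stage left over, so the first token is coarse and later tokens add fine detail.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )

def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the remaining residual."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens

def rvq_decode(tokens, codebooks):
    """Sum the selected entries; passing fewer tokens gives a coarser result."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

# Hypothetical codebooks: the first is coarse, the second refines the residual.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]],
]

frame = [1.2, 0.9]
tokens = rvq_encode(frame, CODEBOOKS)
coarse_only = rvq_decode(tokens[:1], CODEBOOKS)
full = rvq_decode(tokens, CODEBOOKS)
print(tokens, coarse_only, full)
```

Decoding with only the first token gives a rough approximation of the frame, and adding the second token moves the reconstruction closer. This is the same coarse-then-fine split that the staged acoustic models exploit.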

AudioLM Expertise

To build a deeper understanding, treat AudioLM as an operating model rather than a single feature: define the desired outcomes, make your assumptions explicit, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams using AudioLM treat quality, latency, and consent as core parts of the deployment strategy. They write clear success criteria, evaluate against real data and workflows, and iterate based on observed failure patterns rather than one-time benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Audio AI improves accessibility through transcription, narration, and voice interfaces. At the same time, voice misuse and impersonation risks rise when consent is absent. The strongest approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and update safeguards continuously as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Improves accessibility through transcription, narration, and voice interfaces.

Media teams can ship polished audio faster and on smaller budgets.

Customer-facing systems can process spoken interactions at scale.

In high-quality deployments, each of these translates into measurable operating rules, ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of AudioLM

AudioLM's token-based recipe became the foundation for later systems: Google built AudioLM's ideas into MusicLM for text-to-music generation and into SoundStorm for faster generation, while the broader field now combines semantic and acoustic tokens across speech, music, and sound effects. Expect faster, real-time generation, longer coherent outputs, and multimodal control in which text or other signals steer audio-trained models. The same techniques also sharpen concerns about voice cloning and audio deepfakes.

Real-World Applications

Continuing a short speech clip in the same speaker's voice and intonation, without transcripts

Improvising new piano music that matches the style of a short recorded prompt

Serving as the audio-generation backbone in text-to-music systems such as MusicLM

Research on speech synthesis that preserves prosody and recording acoustics from a sample

Usage Patterns

AudioLM in action

Across all four applications (speech continuation, piano improvisation, text-to-music backbones, and prosody-preserving synthesis research), teams typically get better results when they define quality thresholds up front, maintain a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Voice misuse and impersonation risks increase when consent is absent.

Accuracy may degrade across accents, dialects, or noisy environments.

Synthetic audio can be mistaken for genuine speech without clear labeling.
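One way to make labeling concrete is to store a small provenance record alongside every generated clip. The sketch below is an assumption-laden illustration, not a standard: the field names, the `consent_ref` identifier, and the `demo-model` name are all hypothetical. It marks the clip as synthetic and ties the record to the exact audio bytes with a SHA-256 hash, so that a later edit to the file is detectable.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_provenance_record(audio_bytes, model_name, consent_ref):
    """Build a label for a generated clip (hypothetical schema)."""
    return {
        "synthetic": True,              # explicit label: this is generated audio
        "model": model_name,            # which system produced the clip
        "consent_ref": consent_ref,     # pointer to the stored consent, if any
        "sha256": hashlib.sha256(audio_bytes).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }

def matches(audio_bytes, record):
    """True if the audio is byte-for-byte the clip the record describes."""
    return hashlib.sha256(audio_bytes).hexdigest() == record["sha256"]

clip = b"\x00\x01\x02\x03"  # placeholder bytes standing in for real audio
record = make_provenance_record(clip, "demo-model", "consent-001")
print(json.dumps(record, indent=2))
```

A record like this does not prevent misuse on its own, but it gives reviewers something to check a clip against and supports the traceability that the guardrails call for.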

Getting Started Guide

1. Obtain explicit consent for voice capture, synthesis, and reuse.

2. Test quality across speakers and background conditions.

3. Define when a human must review or approve outputs.

4. Label synthetic audio and keep traceable records for accountability.

Treat each step as an evidence gate: if its criteria are not met, pause the rollout, close the gap, and only then expand usage.
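The evidence-gate idea can be written down as a small check. This is a sketch under stated assumptions: the `Gate` class, the gate names, and the evidence strings are hypothetical, and a real rollout would draw `passed` from actual test results and sign-offs rather than hard-coded values.

```python
from dataclasses import dataclass

@dataclass
class Gate:
    name: str       # which step this gate belongs to
    passed: bool    # whether the step's criteria were met
    evidence: str   # where the supporting evidence is recorded

def next_action(gates):
    """Return the rollout decision: stop at the first gate that failed."""
    for gate in gates:
        if not gate.passed:
            return f"pause rollout: close gap at '{gate.name}'"
    return "expand usage"

gates = [
    Gate("explicit consent", True, "signed consent forms on file"),
    Gate("quality across speakers", False, "noisy-room tests below threshold"),
    Gate("human review policy", True, "review rules documented"),
    Gate("labeling and records", True, "provenance records enabled"),
]
print(next_action(gates))
```

Checking the gates in order keeps the decision simple: usage only expands once every step has recorded evidence behind it.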
