Overview
SoundStorm is a Google audio generation model that produces speech and other audio in parallel rather than one token at a time, which makes high-quality audio synthesis much faster. It matters because it cuts generation latency for long clips from minutes to seconds without sacrificing fidelity.
SoundStorm's parallel audio generation fits into AI audio workflows that transform speech, music, and sound for communication, accessibility, and media production.
Deep Dive
SoundStorm, introduced by Google in 2023, generates audio represented as discrete acoustic tokens produced by a neural codec called SoundStream. Earlier models such as AudioLM generated these tokens autoregressively, predicting each token in sequence, which is slow for long audio. SoundStorm instead uses a non-autoregressive, mask-based approach borrowed from image generation models such as MaskGIT. It starts from a mostly masked token sequence and fills it in iteratively over a small number of decoding steps, predicting many tokens at once in parallel. Conditioned on semantic tokens (from a model such as AudioLM or SPEAR-TTS), it can synthesize 30 seconds of natural dialogue in about half a second on a TPU, roughly 100 times faster than autoregressive baselines, while matching their quality and speaker consistency.
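To make the difference concrete, it helps to count sequential decoding steps. The sketch below is a back-of-the-envelope comparison under assumed, illustrative settings that this article does not state: a codec frame rate of 50 frames per second, 12 codebook levels per frame, and 27 parallel decoding passes. An autoregressive decoder needs one sequential step per token, while a parallel decoder needs one sequential step per pass, regardless of clip length.

```python
def autoregressive_steps(seconds: float, frames_per_second: int, levels: int) -> int:
    """One sequential prediction per token: frames x codebook levels."""
    return int(seconds * frames_per_second) * levels


def parallel_steps(passes: int) -> int:
    """Parallel decoding costs one sequential step per pass, independent of clip length."""
    return passes


# Assumed, illustrative settings (not taken from this article).
SECONDS = 30
FRAMES_PER_SECOND = 50
LEVELS = 12
PASSES = 27

ar = autoregressive_steps(SECONDS, FRAMES_PER_SECOND, LEVELS)
par = parallel_steps(PASSES)
print(f"autoregressive sequential steps: {ar}")
print(f"parallel sequential steps: {par}")
print(f"ratio: {ar / par:.0f}x fewer sequential steps")
```

Fewer sequential steps do not translate one-to-one into wall-clock speed-up, because each parallel pass processes the whole sequence at once. The count only illustrates why the sequential bottleneck shrinks.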
Technical Insight
SoundStorm models the hierarchy of residual vector quantization (RVQ) levels produced by SoundStream. During training, randomly chosen tokens are masked and the model learns to predict them. At inference time it uses confidence-based parallel decoding: in each iteration it predicts all masked tokens, keeps the most confident predictions, and re-masks the rest. It decodes the coarse RVQ levels first and the finer ones afterwards, reaching full audio in far fewer steps than token-by-token generation.
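The decoding loop described above can be sketched in a few lines. This is a toy illustration, not SoundStorm's implementation: `toy_predict` is a hypothetical stand-in that returns random tokens and confidences where a trained network would go, the cosine masking schedule is an assumption borrowed from MaskGIT-style decoders, and giving the finer levels a single pass each is likewise an assumption made for illustration.

```python
import math
import random

MASK = -1  # sentinel for a position that is still masked


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for the trained network.

    Returns {position: (token, confidence)} for every masked position.
    A real model would condition on unmasked and semantic tokens.
    """
    return {
        i: (rng.randrange(vocab_size), rng.random())
        for i, t in enumerate(tokens)
        if t == MASK
    }


def parallel_decode(length, vocab_size, num_iters, seed=0):
    """Confidence-based iterative decoding for one RVQ level (toy version)."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(1, num_iters + 1):
        proposals = toy_predict(tokens, vocab_size, rng)
        # Assumed cosine schedule: how many positions stay masked after this step.
        keep_masked = int(length * math.cos(math.pi / 2 * step / num_iters))
        if step == num_iters:
            keep_masked = 0  # the final pass must commit everything
        num_to_commit = max(1, len(proposals) - keep_masked)
        # Commit the most confident proposals; the rest remain masked.
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:num_to_commit]:
            tokens[pos] = tok
        if MASK not in tokens:
            break
    return tokens


def decode_coarse_to_fine(length, vocab_size, levels, first_level_iters, seed=0):
    """Decode RVQ levels from coarsest to finest.

    Assumption: only the coarsest level is refined over several iterations.
    A real model would also condition each level on the coarser ones.
    """
    return [
        parallel_decode(length, vocab_size, first_level_iters if q == 0 else 1, seed + q)
        for q in range(levels)
    ]


if __name__ == "__main__":
    grid = decode_coarse_to_fine(length=20, vocab_size=1024, levels=3, first_level_iters=4)
    for level, row in enumerate(grid):
        print(level, row[:8])
```

The key design point is that the number of model calls depends on the iteration count, not on the sequence length: every call proposes values for all masked positions at once, and only the confidence ranking decides which of them are committed.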
Mastering SoundStorm Parallel Audio Generation
To build a deeper understanding, treat SoundStorm parallel audio generation as an operating model rather than a single feature: define the desired outcomes, make the assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.
In practice, strong teams using SoundStorm parallel audio generation treat quality, latency, and consent as equally important parts of the deployment strategy. They write explicit success criteria, evaluate against real data and real workflows, and iterate based on observed failure patterns rather than a one-time benchmark win. That is where theoretical understanding turns into durable capability across product, policy, and operations.
The technology improves accessibility through captioning, narration, and voice interfaces. At the same time, the risks of voice misuse and impersonation rise when consent is absent. The strongest approach combines speed of experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and update safeguards continuously as model behavior, user expectations, and regulatory requirements change.
Strategic Impact
Improves accessibility through captioning, narration, and voice interfaces.
Media teams can ship polished audio faster and on smaller budgets.
Customer-facing systems can process spoken interactions at scale.
In high-quality deployments, each of these translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.
Real-World Applications
Generating 30-second spoken dialogues for AI voice assistants in under a second
Synthesizing multi-turn dialogues with consistent speaker voices for prototyping
Enabling low-latency text-to-speech in interactive agents where autoregressive models are too slow
Producing long-form narration quickly by filling in acoustic tokens in parallel
Usage Patterns
For each of the applications above (voice assistant dialogue, multi-turn dialogue prototyping, low-latency text-to-speech, and long-form narration), teams usually get better results when they define the quality bar up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.
Risks & Guardrails
Voice misuse and impersonation risks increase when consent is absent.
Accuracy can drop across accents, dialects, or noisy environments.
Synthetic audio can be mistaken for real speech without clear labeling.
Implementation Guide
Obtain explicit consent for voice capture, synthesis, and reuse.
Evaluate quality across speakers and background conditions.
Define when a human must review or approve the output.
Label synthetic audio and keep traceable records for accountability.
Treat each step as an evidence gate: if its criteria are not met, pause the rollout, close the gap, and only then expand usage.
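One way to make the evidence-gate idea operational is to encode each step as a check and to write a traceable record for every released clip. The sketch below is a minimal illustration under stated assumptions: the `GateEvidence` fields, the quality score scale, the `min_score` threshold, and the record layout are all hypothetical and would need to match a team's own policy.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class GateEvidence:
    consent_on_file: bool          # explicit consent for capture, synthesis, and reuse
    worst_group_score: float       # lowest quality score across speaker/background groups
    human_review_defined: bool     # a written rule says when a human reviews or approves
    synthetic_label_applied: bool  # the clip is labeled as synthetic


def failed_gates(ev: GateEvidence, min_score: float) -> List[str]:
    """Return the names of gates that are not met; an empty list means proceed."""
    failures = []
    if not ev.consent_on_file:
        failures.append("consent")
    if ev.worst_group_score < min_score:
        failures.append("quality")
    if not ev.human_review_defined:
        failures.append("human_review")
    if not ev.synthetic_label_applied:
        failures.append("labeling")
    return failures


def provenance_record(audio_bytes: bytes, ev: GateEvidence, model_name: str) -> str:
    """Build a JSON log line linking a clip's hash to the evidence behind its release."""
    record = {
        "sha256": hashlib.sha256(audio_bytes).hexdigest(),
        "model": model_name,
        "synthetic": True,
        "evidence": asdict(ev),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    evidence = GateEvidence(True, 3.6, True, False)
    problems = failed_gates(evidence, min_score=4.0)  # hypothetical threshold
    if problems:
        print("pause rollout, unmet gates:", problems)
    else:
        print(provenance_record(b"example audio bytes", evidence, "example-model"))
```

Checking the worst-scoring group rather than the average reflects the guide's point about evaluating across speakers and background conditions: a good overall score can hide a group for which quality is poor.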