Overview
SoundStorm is a Google audio generation model that produces speech and audio tokens in parallel rather than one token at a time, yielding high-quality audio quickly. It matters because it cuts generation latency for long clips from minutes to seconds without sacrificing fidelity.
SoundStorm Parallel Audio Generation sits within audio AI work that transforms speech, music, and sound for communication, accessibility, and media production.
Deep Dive
SoundStorm, introduced by Google in 2023, generates audio represented as discrete audio tokens from a neural codec called SoundStream. Earlier models such as AudioLM produced these tokens autoregressively, predicting each token in sequence, which makes long audio slow to generate. SoundStorm instead uses a non-autoregressive, mask-based scheme borrowed from image generation models such as MaskGIT. It starts with most tokens masked and fills them in over a small number of decoding steps, predicting many tokens at once in parallel. Conditioned on semantic tokens (from a model such as AudioLM or SPEAR-TTS), it can synthesize 30 seconds of natural dialogue in about half a second on a TPU, roughly 100 times faster than the autoregressive baseline while matching its quality and voice consistency.
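To make the speed claim concrete, the arithmetic can be sketched as follows. The 30 seconds of audio and the roughly half a second of generation time come from the text above; the frame rate, the number of RVQ levels, and the pass counts are illustrative assumptions chosen for this sketch, not figures stated in this article.

```python
# Figures taken from the text above.
AUDIO_SECONDS = 30.0
GENERATION_SECONDS = 0.5

# Illustrative assumptions (not stated in the text): codec frame rate,
# number of RVQ levels, and how many parallel decoding passes are used.
FRAMES_PER_SECOND = 50
RVQ_LEVELS = 12
PASSES_FIRST_LEVEL = 16
PASSES_PER_LATER_LEVEL = 1


def real_time_factor(audio_s: float, generation_s: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_s / generation_s


def sequential_steps(audio_s: float) -> int:
    """Token-by-token decoding: one model call per acoustic token."""
    return int(audio_s * FRAMES_PER_SECOND) * RVQ_LEVELS


def parallel_passes() -> int:
    """Parallel decoding: a fixed number of passes, independent of clip length."""
    return PASSES_FIRST_LEVEL + PASSES_PER_LATER_LEVEL * (RVQ_LEVELS - 1)


if __name__ == "__main__":
    print(real_time_factor(AUDIO_SECONDS, GENERATION_SECONDS))  # 60.0
    print(sequential_steps(AUDIO_SECONDS))                      # 18000
    print(parallel_passes())                                    # 27
```

The ratio of sequential steps to parallel passes is not the same as the wall-clock speedup, because each parallel pass processes the whole sequence at once and so costs more than a single autoregressive step.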
Technical Insight
SoundStorm models the hierarchy of residual vector quantization (RVQ) levels produced by SoundStream. During training, tokens are masked at random and the model learns to predict them. At inference it runs confidence-based parallel decoding: in each iteration it predicts all masked tokens, keeps the most confident predictions, and re-masks the rest. It commits the coarse RVQ levels first, then the finer ones, reaching complete audio in far fewer steps than token-by-token generation.
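The decoding loop described above can be sketched in a few lines. This is a toy under stated assumptions, not SoundStorm's implementation: `toy_predict` is a hypothetical stand-in for the network that returns random tokens and confidences, the cosine schedule is a MaskGIT-style choice assumed here, and the real model would also condition each level on semantic tokens and on the coarser levels already decoded.

```python
import math
import random

MASK = -1  # sentinel for "not decided yet" (an assumption of this sketch)


def toy_predict(tokens, rng, vocab_size=8):
    """Hypothetical stand-in for the network.

    Returns {position: (token, confidence)} for every masked position.
    """
    return {
        i: (rng.randrange(vocab_size), rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def decode_level(frames, steps, predict, rng):
    """Fill one RVQ level in `steps` passes, most confident positions first."""
    tokens = [MASK] * frames
    passes = 0
    for step in range(1, steps + 1):
        preds = predict(tokens, rng)
        if not preds:
            break
        passes += 1
        # Cosine schedule: how many positions stay masked after this pass.
        stay_masked = int(frames * math.cos(math.pi / 2 * step / steps))
        n_commit = max(1, len(preds) - stay_masked)
        ranked = sorted(preds, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[:n_commit]:
            tokens[i] = preds[i][0]
        # Positions that were not committed stay MASK, i.e. they are re-masked.
    return tokens, passes


def decode_rvq(frames, steps_per_level, predict=toy_predict, seed=0):
    """Decode coarse levels first, then finer ones; returns (grid, passes)."""
    rng = random.Random(seed)
    grid, total = [], 0
    for steps in steps_per_level:  # index 0 is the coarsest level
        level, used = decode_level(frames, steps, predict, rng)
        grid.append(level)
        total += used
    return grid, total


if __name__ == "__main__":
    grid, passes = decode_rvq(frames=16, steps_per_level=[4, 1, 1])
    print(passes)  # 6 forward passes for 3 levels x 16 frames = 48 tokens
```

Spending several passes on the coarsest level and a single pass on each finer level mirrors the coarse-to-fine order described above; the exact pass counts here are arbitrary.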
A Guide to SoundStorm Parallel Audio Generation
To build a deeper understanding, treat SoundStorm Parallel Audio Generation as an operating model rather than a single feature: define the desired outcome, make assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.
In practice, strong teams using SoundStorm Parallel Audio Generation treat quality, latency, and compliance as equal parts of the deployment strategy. They write down concrete success criteria, test on real data and workflows, and iterate based on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across products, policies, and operations.
It improves accessibility through transcription, narration, and voice interfaces. At the same time, voice misuse and impersonation risk rise when consent is missing. The most workable approach is to pair fast experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements evolve.
Strategic Impact
It improves accessibility through transcription, narration, and voice interfaces.
Media teams can ship polished audio faster on smaller budgets.
Customer-facing systems can handle spoken interaction at greater scale.
In mature deployments, each of these translates into measurable operating rules, clear ownership boundaries, and regular review cycles, so that teams can build confidence rather than rely on guesswork.
Real-World Applications
Generating 30-second spoken utterances for AI voice assistants in under a second
Synthesizing multi-turn dialogue with consistent speaker voices for prototyping
Deploying low-latency text-to-speech in interactive agents where autoregressive models fall short
Producing long-form audio faster by filling in audio tokens in parallel
Implementation Patterns
SoundStorm Parallel Audio Generation in practice
The same pattern applies to each of the four applications listed above: sub-second utterances for voice assistants, multi-turn dialogue for prototyping, low-latency text-to-speech in interactive agents, and long-form audio. Teams typically get better results when they define quality gates up front, keep a human escalation path for edge cases, and track production success and error rates over time.
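One way to read "quality gates, a human escalation path, and tracked error rates" is as a small check run over a batch of generations. The sketch below is purely illustrative: the `QualityGate` and `Generation` types, their field names, and the thresholds are all hypothetical and are not part of SoundStorm or any Google API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityGate:
    max_latency_s: float   # hypothetical latency budget per utterance
    max_error_rate: float  # hypothetical ceiling on failed generations


@dataclass(frozen=True)
class Generation:
    latency_s: float
    ok: bool                 # passed automated quality checks
    edge_case: bool = False  # flagged as needing human judgment


def evaluate(gate, generations):
    """Summarize a batch against the gate and list items to escalate."""
    total = len(generations)
    if total == 0:
        return {"error_rate": 0.0, "slow": 0, "escalate": [], "passed": False}
    errors = sum(1 for g in generations if not g.ok)
    slow = sum(1 for g in generations if g.latency_s > gate.max_latency_s)
    # Failed or flagged items go to a human reviewer, identified by index.
    escalate = [i for i, g in enumerate(generations) if g.edge_case or not g.ok]
    error_rate = errors / total
    passed = error_rate <= gate.max_error_rate and slow == 0
    return {
        "error_rate": error_rate,
        "slow": slow,
        "escalate": escalate,
        "passed": passed,
    }


if __name__ == "__main__":
    gate = QualityGate(max_latency_s=1.0, max_error_rate=0.25)
    batch = [
        Generation(0.4, True),
        Generation(0.6, True),
        Generation(0.5, False),
        Generation(0.7, True, edge_case=True),
    ]
    print(evaluate(gate, batch))
```

Running the same check on every batch and storing the results gives the error-rate history that the text recommends tracking over time.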
Risks & Safeguards
Voice misuse and impersonation risk rise when consent is missing.
Accuracy can drop across accents, dialects, or noisy environments.
Synthetic audio can be mistaken for authentic speech without clear labeling.
Roadmap
Obtain explicit consent for voice capture, cloning, and reuse.
Test quality across diverse speakers and background conditions.
Define when a human must review or approve outputs.
Label synthetic audio and keep provenance records for accountability.
Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand use.
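The consent and labeling steps can be combined into one small record written alongside each generated clip. The sketch below is an assumption-laden illustration: the function name, the record fields, the consent lookup, and the model label are all hypothetical, and a real deployment would follow whatever provenance or watermarking standard it has adopted.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(audio_bytes, speaker_id, consent_ids,
                      model_name="parallel-audio-model"):
    """Build a label for one synthetic clip, refusing if consent is missing.

    `consent_ids` maps speaker IDs to consent record IDs (hypothetical schema).
    """
    if speaker_id not in consent_ids:
        # Evidence gate: no consent on file means no output is released.
        raise PermissionError(f"no consent on file for speaker {speaker_id!r}")
    return {
        "synthetic": True,
        "speaker_id": speaker_id,
        "consent_id": consent_ids[speaker_id],
        "model": model_name,
        # The content hash ties the record to the exact audio bytes.
        "sha256": hashlib.sha256(audio_bytes).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    consents = {"spk-1": "consent-42"}
    record = provenance_record(b"fake audio bytes", "spk-1", consents)
    print(json.dumps(record, indent=2))
```

Keeping the hash in the record lets a reviewer later confirm that a given file is the clip that was labeled and approved.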