Overview
SoundStream is Google's end-to-end neural audio codec, which compresses speech and music to low bitrates while preserving quality. It matters because it outperforms traditional codecs such as Opus at the same bitrate and underpins modern generative audio models.
SoundStream Neural Codec sits within audio-AI workflows that transform speech, music, and sound for communication, accessibility, and media production.
Deep Dive
Introduced by Google in 2021, SoundStream is a fully neural codec built from three components that are trained jointly: a convolutional encoder that turns the raw waveform into a compact sequence of vectors, a residual vector quantizer (RVQ) that discretizes those vectors, and a convolutional decoder that reconstructs the waveform. It is trained with reconstruction losses plus a GAN-style discriminator, so the output sounds natural rather than merely being numerically close to the input. Its standout feature is scalable training through quantizer dropout: a single model can operate across bitrates from about 3 to 18 kbps simply by using more or fewer quantizer layers, with no retraining. At 3 kbps it was reported to outperform Opus at 12 kbps in subjective listening tests, and it handles speech, music, and general audio in one model that can run in real time on a smartphone CPU.
Technical Insight
The waveform passes through strided convolutions that downsample it heavily, producing one embedding per frame (for example, 75 frames per second). The RVQ then encodes each embedding as a stack of codebook indices: the first codebook quantizes the embedding, and each later codebook quantizes the residual left by the previous ones. The bitrate equals the frame rate times the number of active quantizers times the bits per codebook index; for example, 75 frames/s × 4 quantizers × 10 bits = 3,000 bits/s. Quantizer dropout randomly truncates the RVQ stack during training, which forces the early codebooks to carry the most important information, so the codec degrades gracefully at low rates.
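The residual quantization and bitrate arithmetic described above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not SoundStream's implementation: the codebooks are random rather than learned, the 16-dimensional embedding, 8 quantizers, and 1,024 entries per codebook are hypothetical choices, and all function names are made up for this example.

```python
import math
import numpy as np

def bitrate_bps(frame_rate, num_quantizers, codebook_size):
    """Bits/s = frames per second * active quantizers * bits per index."""
    return frame_rate * num_quantizers * math.log2(codebook_size)

def make_toy_codebooks(rng, num_quantizers, codebook_size, dim):
    """Random stand-ins for learned codebooks (assumption, not trained)."""
    codebooks = []
    for i in range(num_quantizers):
        # Later stages model smaller residuals, so shrink their scale.
        cb = rng.normal(scale=0.5 ** i, size=(codebook_size, dim))
        # Toy-only trick: a zero codeword means a stage can never make
        # the error worse. Trained codebooks do not need this.
        cb[0] = 0.0
        codebooks.append(cb)
    return codebooks

def rvq_encode(x, codebooks, num_active=None):
    """Greedy RVQ: each stage quantizes the residual of the previous ones."""
    residual = np.asarray(x, dtype=np.float64).copy()
    indices = []
    for cb in codebooks[:num_active]:  # None means use every stage
        k = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        indices.append(k)
        residual -= cb[k]
    return indices

def rvq_decode(indices, codebooks):
    """Sum the selected codewords; fewer indices give a coarser result."""
    out = np.zeros(codebooks[0].shape[1])
    for cb, k in zip(codebooks, indices):
        out += cb[k]
    return out

def sample_num_active(rng, num_quantizers):
    """Quantizer dropout: pick how many leading stages to keep this step."""
    return int(rng.integers(1, num_quantizers + 1))

rng = np.random.default_rng(0)
codebooks = make_toy_codebooks(rng, num_quantizers=8, codebook_size=1024, dim=16)
x = rng.normal(size=16)  # stands in for one encoder frame embedding

full = rvq_encode(x, codebooks)
# Reconstruction error when only the first n stages are decoded.
errors = [float(np.linalg.norm(x - rvq_decode(full[:n], codebooks)))
          for n in range(1, len(codebooks) + 1)]

for n, err in enumerate(errors, start=1):
    print(n, bitrate_bps(75, n, 1024), round(err, 4))
```

Because the encoding is greedy, the indices for a truncated stack are simply a prefix of the full index list, which is what lets a receiver drop trailing stages to lower the bitrate without re-encoding.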
SoundStream Neural Codec Guide
To build a deeper understanding, treat SoundStream Neural Codec as an operating model rather than a single feature: define the desired outcome, state your assumptions, and separate what the system can do reliably from what still requires expert judgment.
In practice, strong teams using SoundStream Neural Codec treat quality, latency, and compliance as equally important parts of the deployment strategy. They write explicit success criteria, test on realistic data and workflows, and iterate on observed failure patterns rather than one-off wins. This is where theoretical understanding turns into durable capability across products, policies, and operations.
Audio-AI workflows built on codecs like this can improve accessibility through transcription, narration, and voice interfaces. At the same time, the risk of voice misuse and impersonation rises when consent is missing. The most practical approach is to pair fast experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep safeguards up to date as model behavior, user expectations, and regulatory requirements evolve.
Strategic Impact
It improves accessibility through transcription, narration, and voice interfaces.
Media teams can ship polished audio faster on smaller budgets.
Customer-facing systems can handle spoken interactions at greater scale.
In effective deployments, each of these benefits is translated into measurable operating rules, clear ownership boundaries, and regular review cycles, so teams can build confidence rather than accumulate doubt.
Real-World Applications
Compressing voice calls to ~3 kbps while sounding clearer than legacy codecs at higher bitrates
Producing discrete audio tokens that feed Google's AudioLM and MusicLM generative models
Real-time, low-bandwidth audio streaming on mobile devices with on-CPU encoding and decoding
Storing or transmitting music and ambient sound efficiently with a single model that handles every content type
Implementation Patterns
SoundStream Neural Codec in practice
The same guidance applies to every application listed above, whether that is voice calls at ~3 kbps, discrete audio tokens for AudioLM and MusicLM, real-time streaming on mobile CPUs, or storage and transmission of music and ambient sound. Teams typically get the best results when they define quality metrics up front, keep a human escalation path for edge cases, and track production wins and error rates over time.
Risks & Safeguards
The risk of voice misuse and impersonation rises when consent is missing.
Accuracy can drop across accents, dialects, or noisy environments.
Synthetic audio can be mistaken for authentic speech without clear labeling.
Roadmap
Obtain explicit consent for voice capture, cloning, and reuse.
Test quality across diverse speakers and background conditions.
Define when a human must review or approve outputs.
Label synthetic audio and keep verifiable records for accountability.
Treat each step as an evidence gate: if its criteria are not met, pause the rollout, close the gap, and only then expand usage.