Overview
VQ-VAE compresses images, audio, or video into a small grid of discrete codes drawn from a learned codebook, rather than into continuous numbers. This discrete compression lets powerful sequence models such as Transformers treat media as "tokens", much like words.
VQ-VAE and discrete latents belong to the computer vision workflow that interprets or generates visual media for analysis, operations, and creative work.
Deep Dive
VQ-VAE (Vector Quantized Variational Autoencoder), introduced by van den Oord and colleagues at DeepMind in 2017, is an autoencoder whose latent space is discrete. The encoder turns an image into a grid of continuous vectors; each vector is then snapped to its nearest entry in a learned codebook of embeddings (vector quantization). The decoder reconstructs the image from these quantized codes. Because the latents are now indices into a finite vocabulary, a separate prior model can learn their distribution and generate new content. This two-stage recipe underlies Jukebox for music and VQGAN, which adds perceptual and adversarial losses for sharper reconstructions; DALL-E 1 follows the same recipe with a closely related discrete VAE tokenizer. VQ-VAE-2 stacked latents at multiple resolutions to generate high-fidelity images.
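The quantization step described above can be illustrated with a minimal sketch. It assumes squared Euclidean distance as the nearness measure and uses a hypothetical three-entry codebook with made-up encoder outputs; the function name `quantize` is illustrative and not taken from any library.

```python
def quantize(encoder_vectors, codebook):
    """Map each encoder vector to the index of its nearest codebook entry."""
    indices = []
    for z in encoder_vectors:
        # Squared Euclidean distance from z to every codebook entry.
        distances = [sum((zi - ei) ** 2 for zi, ei in zip(z, e)) for e in codebook]
        # argmin: position of the smallest distance.
        indices.append(distances.index(min(distances)))
    # The decoder would receive these looked-up embeddings, not the raw vectors.
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Hypothetical toy values: 3 codebook entries, 4 encoder outputs, 2 dimensions.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
encoder_vectors = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7], [0.6, 0.9]]

indices, quantized = quantize(encoder_vectors, codebook)
print(indices)  # [1, 0, 2, 1]
```

The list of indices is what a second-stage prior model would be trained on; a real encoder would emit a 2-D grid of such vectors, flattened here for brevity.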
Technical Insight
The quantization step (a nearest-neighbor lookup, i.e. an argmin) is not differentiable, so VQ-VAE uses a straight-through estimator: gradients are copied directly from the decoder input back to the encoder output, as if quantization were the identity function. Training combines a reconstruction loss, a codebook loss that pulls the embeddings toward the encoder outputs, and a commitment loss that keeps the encoder outputs close to their chosen codes. A common failure mode is codebook collapse, in which only a few codes are ever used.
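As a rough illustration of these terms, the sketch below computes forward values of the codebook and commitment losses and a perplexity score for code usage, which is one common way to spot collapse. It is forward-only: routing gradients requires an autodiff framework, where the straight-through step is typically written as `z_e + stop_gradient(z_q - z_e)`. The helper names, the toy data, and the weight `beta=0.25` are assumptions made for illustration.

```python
import math
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_losses(z_e, codebook, indices, beta=0.25):
    """Forward values of the codebook and commitment terms, averaged over positions.

    Both terms share the same numeric distance; they differ only in which side
    receives gradients during training (embeddings vs. encoder).
    """
    dist = sum(sq_dist(z, codebook[i]) for z, i in zip(z_e, indices)) / len(z_e)
    codebook_loss = dist            # would update the embeddings only
    commitment_loss = beta * dist   # would update the encoder only
    return codebook_loss, commitment_loss


def code_perplexity(indices):
    """exp(entropy) of code usage: roughly the number of codes in effective use."""
    total = len(indices)
    counts = Counter(indices)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


# Hypothetical toy values.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
z_e = [[0.9, 1.2], [0.1, -0.2]]
print(vq_losses(z_e, codebook, indices=[1, 0]))

print(code_perplexity([1, 0, 2, 1]))  # about 2.83: three codes in use, unevenly
print(code_perplexity([1, 1, 1, 1]))  # 1.0: every position uses the same code
```

A perplexity that stays far below the codebook size during training is a hint, not proof, that many codes are going unused.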
A Guide to VQ-VAE and Discrete Latents
To build a deeper understanding, treat VQ-VAE and discrete latents as a working model rather than a single feature: define the desired outcome, state the assumptions, and separate what the system can do reliably from what still requires expert judgment.
In practice, strong teams using VQ-VAE and discrete latents balance accuracy against operational realities such as data quality, lighting variation, and label consistency. They write explicit success criteria, test on real data and real workflows, and iterate on observed failure patterns rather than on one-off wins. This is where theoretical understanding turns into durable capability across products, policies, and operations.
AI tools can automate search, detection, and tagging tasks at scale. At the same time, image rights and consent can become a legal risk if they are not clarified. The most effective approach pairs fast experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements evolve.
Strategic Impact
AI tools can automate search, detection, and tagging tasks at scale. In effective deployments, this translates into measurable operating standards, clear ownership boundaries, and regular review cycles, so that teams build confidence rather than accumulate doubt.
Creative teams can generate ideas faster with fewer manual revisions.
Operations can use image and video signals that were previously hard to process.
Real-World Applications
DALL-E 1 used a discrete VAE tokenizer in the spirit of VQ-VAE, so that a Transformer could generate images as sequences of codebook indices.
VQGAN combined VQ-VAE with adversarial and perceptual losses to produce crisp, high-quality image tokens for art generation.
OpenAI's Jukebox applied VQ-VAE to raw audio, compressing music into discrete codes for generative modeling.
VQ-VAE-2 stacked hierarchical latents to synthesize diverse, high-fidelity images that rivaled the GANs of its era.
Implementation Patterns
VQ-VAE and discrete latents in practice
Whether the discrete codes feed image generation as in DALL-E 1, art generation as in VQGAN, music modeling as in Jukebox, or hierarchical image synthesis as in VQ-VAE-2, teams typically get better results when they define quality metrics up front, keep a human escalation path for edge cases, and track production outcomes and error rates over time.
Risks & Safeguards
Image rights and consent can become a legal risk if they are not clarified.
Model performance can vary widely across lighting, demographics, and environments.
False positives may go unnoticed unless confidence thresholds are monitored.
Roadmap
Define acceptance criteria for precision, recall, and the cost of errors. Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
Test with data that matches real production conditions.
Add human review for low-confidence or high-impact predictions.
Track model drift and revalidate after camera or dataset changes.
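The acceptance-criteria step of the roadmap can be made concrete with a small sketch of an evidence gate. The counts and thresholds below are hypothetical pilot numbers, and the function names are illustrative; real criteria would also weigh the cost of each error type.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def passes_gate(tp, fp, fn, min_precision, min_recall):
    """True only if both metrics meet their thresholds; otherwise the rollout pauses."""
    precision, recall = precision_recall(tp, fp, fn)
    return precision >= min_precision and recall >= min_recall


# Hypothetical pilot results: precision is 0.9, recall is 0.75.
print(precision_recall(tp=90, fp=10, fn=30))  # (0.9, 0.75)
print(passes_gate(tp=90, fp=10, fn=30, min_precision=0.85, min_recall=0.80))  # False
```

Here the gate fails on recall alone, which is the signal to pause and close that gap before expanding usage.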