Visual AI GUIDE

Imagen Video Cascades

Imagen Video is Google's 2022 text-to-video system that builds a clip through a cascade of seven diffusion models, each adding more frames or more resolution.

Overview

Imagen Video matters because it showed that chaining specialized stages can produce high-definition, temporally smooth video from a single prompt.

Imagen Video Cascades belong to the family of computer-vision workflows that interpret or generate visual media for analysis, operations, and creative work.

Deep Dive

Imagen Video, introduced by Google Research in October 2022, extends the Imagen text-to-image approach to motion. A frozen T5 text encoder turns the prompt into rich language embeddings that condition every stage. A base diffusion model first produces a small, low-resolution video; six further diffusion models then interleave temporal super-resolution (adding frames between the existing ones) and spatial super-resolution (increasing pixel resolution). The full pipeline outputs video at up to 1280x768 and 24 frames per second, several seconds long. Because the deep language understanding lives in the text encoder, Imagen Video can render legible text, a range of artistic styles, and 3D-aware object motion, showing that careful staging beats trying to do everything in one giant model.
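The staging described above can be traced as a simple shape schedule. The following is a minimal sketch, not the published implementation: the base shape and the per-stage factors are illustrative assumptions chosen so that the cascade ends at the 1280x768, 24 fps output mentioned above, and the names `VideoShape`, `STAGES`, and `trace_cascade` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    frames: int
    width: int
    height: int
    fps: int


# Assumed schedule: one base model followed by six super-resolution stages
# (three temporal, three spatial). Factors are illustrative assumptions.
BASE = VideoShape(frames=16, width=40, height=24, fps=3)
STAGES = [
    ("temporal", 2),
    ("spatial", 2),
    ("spatial", 4),
    ("temporal", 2),
    ("temporal", 2),
    ("spatial", 4),
]


def apply_stage(shape: VideoShape, kind: str, factor: int) -> VideoShape:
    """Return the output shape of one super-resolution stage."""
    if kind == "temporal":
        # Temporal SR inserts frames, so frame count and frame rate both grow.
        return VideoShape(shape.frames * factor, shape.width, shape.height, shape.fps * factor)
    if kind == "spatial":
        # Spatial SR keeps the frames and enlarges each one.
        return VideoShape(shape.frames, shape.width * factor, shape.height * factor, shape.fps)
    raise ValueError(f"unknown stage kind: {kind}")


def trace_cascade(base: VideoShape, stages) -> list:
    """List the shape after the base model and after every later stage."""
    shapes = [base]
    for kind, factor in stages:
        shapes.append(apply_stage(shapes[-1], kind, factor))
    return shapes


if __name__ == "__main__":
    for index, shape in enumerate(trace_cascade(BASE, STAGES)):
        print(index, shape, f"{shape.frames / shape.fps:.2f}s")
```

Under these assumed factors the clip duration stays constant through the cascade: temporal stages raise the frame count and the frame rate together, and spatial stages leave both untouched.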

Technical Insight

The cascade splits a generation problem that would be intractable in a single shot into smaller problems. Seven diffusion models run in sequence: one base generator plus three spatial and three temporal super-resolution models. Each is conditioned on the prompt embedding and on the output of the previous stage. Two techniques make sampling practical: v-prediction parameterization, which keeps high-resolution sampling numerically stable, and progressive distillation, which cuts the number of sampling steps. Classifier-free guidance strengthens prompt adherence at every stage of the chain.
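Two of the techniques named above reduce to short formulas. The sketch below assumes a variance-preserving noise schedule (alpha squared plus sigma squared equals 1) and uses plain Python lists in place of tensors. The function names are hypothetical, and the guidance rule is the common `uncond + scale * (cond - uncond)` form, not necessarily the exact weighting used in Imagen Video.

```python
import math


def noisy_latent(x, eps, alpha, sigma):
    """Forward process: z = alpha * x + sigma * eps."""
    return [alpha * xi + sigma * ei for xi, ei in zip(x, eps)]


def v_target(x, eps, alpha, sigma):
    """v-prediction training target: v = alpha * eps - sigma * x."""
    return [alpha * ei - sigma * xi for xi, ei in zip(x, eps)]


def x_from_v(z, v, alpha, sigma):
    """Recover the clean sample: x = alpha * z - sigma * v."""
    return [alpha * zi - sigma * vi for zi, vi in zip(z, v)]


def classifier_free_guidance(cond, uncond, scale):
    """Push the conditional prediction away from the unconditional one.

    scale = 1 returns the conditional prediction unchanged;
    scale > 1 strengthens prompt adherence.
    """
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# A point on an assumed cosine-style schedule, so alpha**2 + sigma**2 == 1.
phi = 0.3 * math.pi / 2
alpha, sigma = math.cos(phi), math.sin(phi)

x = [0.5, -1.0, 0.25]   # toy "clean" sample
eps = [0.1, 0.7, -0.4]  # toy noise draw

z = noisy_latent(x, eps, alpha, sigma)
v = v_target(x, eps, alpha, sigma)
x_rec = x_from_v(z, v, alpha, sigma)  # matches x up to floating-point error
```

In a cascade, the same guidance rule would be applied per stage, with each stage's conditional prediction also depending on the previous stage's output.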

Mastering Imagen Video Cascades

To build a deeper understanding, treat Imagen Video Cascades as a usage pattern rather than a single artifact: define the requirements, make the assumptions explicit, and separate what a reliable system can handle from what still needs expert judgment.

In practice, strong teams weigh Imagen Video Cascades against operational constraints such as data quality, lighting variation, and label consistency. They write down explicit success criteria, test against real data and real workflows, and iterate on observed failure modes rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Visual AI can automate inspection, detection, and labeling tasks at scale. At the same time, image rights and consent can become legal risks when provenance is unclear. A balanced approach pairs rapid experimentation with governance: run pilots, collect evidence, publish decision logs, and keep improving safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Visual AI can automate inspection, detection, and labeling tasks at scale.

Creative teams can prototype concepts quickly with fewer manual iterations.

Operations can use image and video signals that were previously hard to process.

In high-quality deployments, each of these translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale trust rather than ambiguity.

The Future of Imagen Video Cascades

Cascaded pixel-space pipelines proved the concept, but they are compute-heavy and slow. The field has largely shifted toward latent diffusion and transformer backbones that generate in a compressed space, cutting cost while preserving quality. Still, the lesson of Imagen Video, separating "what," "how it moves," and "how sharp," continues to inform multi-stage designs and upscaling architectures, and its T5-conditioning approach influenced later high-fidelity, text-faithful generators.
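A rough count of the values a model must generate gives a feel for why working in a compressed space is cheaper. The arithmetic below is a back-of-the-envelope sketch: the 128-frame clip length, the 8x spatial downsampling, and the 4 latent channels are hypothetical values typical of latent diffusion setups, not figures taken from any specific system.

```python
def element_count(frames: int, width: int, height: int, channels: int) -> int:
    """Number of scalar values in one video tensor."""
    return frames * width * height * channels


FRAMES = 128          # assumption: a clip of a few seconds at 24 fps
DOWNSAMPLE = 8        # hypothetical spatial compression factor of the autoencoder
LATENT_CHANNELS = 4   # hypothetical latent channel count

# Full-resolution RGB video, as produced by the last pixel-space stage.
pixel_elements = element_count(FRAMES, 1280, 768, 3)

# The same clip represented in the assumed compressed latent space.
latent_elements = element_count(
    FRAMES, 1280 // DOWNSAMPLE, 768 // DOWNSAMPLE, LATENT_CHANNELS
)

ratio = pixel_elements / latent_elements
print(f"pixel: {pixel_elements:,}  latent: {latent_elements:,}  ratio: {ratio:.0f}x")
```

Element count is only a proxy for cost, since attention and convolution layers scale differently with resolution, but it shows the order of magnitude involved.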

Real-World Implementation

Generating a high-definition clip with legible on-screen text from a prompt

Rendering the same described scene in multiple artistic styles, from watercolor to claymation

Producing short 3D-aware object animations, such as a rotating, moving sculpture

Creating smooth 24 fps advertising or concept clips directly from a written description

Implementation Patterns

For each of the use cases listed under Real-World Implementation, teams usually get better results when they define quality criteria up front, keep a human escalation path for edge cases, and track both product gains and the cost of errors over time.

Risks & Guardrails

Image rights and consent can become legal risks when provenance is unclear.

Model performance can vary across lighting, demographics, and environments.

False positives can go unnoticed unless confidence thresholds are monitored.
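The last risk above can be made measurable by tracking the false positive rate at the confidence threshold actually in use. The sketch below assumes each evaluation record is a `(score, is_positive)` pair from a labeled holdout set; the function name and the sample data are hypothetical.

```python
def false_positive_rate(scored, threshold: float) -> float:
    """Share of truly negative items that score at or above the threshold.

    `scored` is an iterable of (score, is_positive) pairs.
    """
    negative_scores = [score for score, is_positive in scored if not is_positive]
    if not negative_scores:
        return 0.0  # no negatives in the sample, so nothing can be a false positive
    false_positives = sum(1 for score in negative_scores if score >= threshold)
    return false_positives / len(negative_scores)


# Toy labeled sample (assumption): four negatives and two positives.
sample = [
    (0.95, True),
    (0.90, False),
    (0.70, True),
    (0.60, False),
    (0.40, False),
    (0.20, False),
]

for threshold in (0.5, 0.8, 0.95):
    print(threshold, false_positive_rate(sample, threshold))
```

Recomputing this figure on fresh labeled samples at a regular cadence is one way to notice when a threshold that used to be safe no longer is.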

Implementation Roadmap

1. Define acceptance criteria for accuracy, recall, and the cost of errors.

2. Test with data that matches real production conditions.

3. Add human review for low-confidence or high-impact predictions.

4. Monitor model drift and revalidate after camera or dataset changes.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
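The evidence-gate idea and the human-review step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class names, field names, and the 0.8 confidence floor are hypothetical, and real criteria would come from the acceptance criteria a team has agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    min_accuracy: float
    min_recall: float
    max_error_cost: float  # e.g. expected cost per 1,000 predictions


@dataclass(frozen=True)
class EvalResult:
    accuracy: float
    recall: float
    error_cost: float


def unmet_criteria(result: EvalResult, criteria: AcceptanceCriteria) -> list[str]:
    """Return the names of the criteria the evaluation failed to meet."""
    failures = []
    if result.accuracy < criteria.min_accuracy:
        failures.append("accuracy")
    if result.recall < criteria.min_recall:
        failures.append("recall")
    if result.error_cost > criteria.max_error_cost:
        failures.append("error_cost")
    return failures


def gate_passes(result: EvalResult, criteria: AcceptanceCriteria) -> bool:
    """Evidence gate: expand usage only when every criterion is met."""
    return not unmet_criteria(result, criteria)


def needs_human_review(confidence: float, high_impact: bool, min_confidence: float = 0.8) -> bool:
    """Route low-confidence or high-impact predictions to a person."""
    return high_impact or confidence < min_confidence


criteria = AcceptanceCriteria(min_accuracy=0.95, min_recall=0.90, max_error_cost=50.0)
pilot = EvalResult(accuracy=0.96, recall=0.87, error_cost=42.0)

if not gate_passes(pilot, criteria):
    print("Pause rollout; unmet:", unmet_criteria(pilot, criteria))
```

Returning the list of unmet criteria, rather than a bare boolean, keeps the "close the gap" step concrete: the team knows which measurement to improve before re-running the gate.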
