VISUAL AI GUIDE

VQGAN and Codebook Image Synthesis

VQGAN compresses images into a grid of discrete tokens drawn from a learned codebook, letting a transformer generate images the same way language models generate text.

Overview

VQGAN and Codebook Image Synthesis belongs to the family of computer vision workflows that interpret or generate visual media for analysis, automation, and creative work.

Deep Dive

VQGAN, introduced in the 2021 paper 'Taming Transformers for High-Resolution Image Synthesis,' combines a vector-quantized autoencoder (VQ-VAE) with adversarial and perceptual training. The encoder maps an image to a small grid of feature vectors; each vector is snapped to its nearest entry in a learned codebook of, say, 1024 discrete codes, turning the image into a sequence of integer tokens. The decoder reconstructs the image from those tokens and is trained with a GAN discriminator and a perceptual loss so that reconstructions look sharp rather than blurry. Because images are now sequences of discrete tokens, an autoregressive transformer can model them like language, predicting tokens one at a time. Paired with CLIP guidance, VQGAN powered early text-to-image art tools.
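The nearest-entry lookup described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the random arrays stand in for a trained encoder output and a learned codebook, the feature dimension of 8 is arbitrary, and the function name `quantize` is hypothetical.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    features: (H, W, D) grid of encoder outputs
    codebook: (K, D) code vectors
    returns:  (H, W) integer token grid and (H, W, D) quantized features
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d)                    # (H*W, D)
    # Squared Euclidean distance between every feature and every code,
    # expanded as |a|^2 - 2ab + |b|^2 to avoid a large 3-D intermediate.
    dists = (
        (flat ** 2).sum(axis=1, keepdims=True)        # (H*W, 1)
        - 2.0 * flat @ codebook.T                     # (H*W, K)
        + (codebook ** 2).sum(axis=1)                 # (K,)
    )
    tokens = dists.argmin(axis=1)                     # nearest code per position
    quantized = codebook[tokens].reshape(h, w, d)     # lookup replaces the features
    return tokens.reshape(h, w), quantized

rng = np.random.default_rng(0)
codebook = rng.normal(size=(1024, 8))    # 1024 codes as in the text; D=8 is an assumption
features = rng.normal(size=(16, 16, 8))  # stand-in for a real encoder output
tokens, quantized = quantize(features, codebook)
sequence = tokens.reshape(-1)            # raster-order token sequence for a transformer
print(tokens.shape, sequence.shape)
```

Flattening the grid in raster order is one simple way to obtain the token sequence that an autoregressive transformer would then model.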

Technical Insight

The core operation is vector quantization: the encoder's continuous outputs are replaced by their nearest codebook vectors, with a 'straight-through' gradient estimator so the encoder can still learn despite the non-differentiable lookup. Adding a patch-based GAN discriminator on top of the autoencoder is what lets VQGAN use a much smaller token grid (e.g.
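To make the straight-through idea concrete, here is a small NumPy sketch with the gradient written out by hand. It assumes a toy squared-error loss on the quantized vectors and a hypothetical `target` array; real training uses an autograd framework, where the same trick is commonly written as `z_e + stop_gradient(z_q - z_e)`.

```python
import numpy as np

def nearest_code(z_e, codebook):
    """Forward pass: replace each row of z_e with its nearest codebook row."""
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = dists.argmin(axis=1)
    return codebook[idx], idx

def straight_through_grad(grad_wrt_z_q):
    """Backward pass: argmin has no useful gradient, so the straight-through
    estimator treats quantization as the identity and copies the gradient
    from the quantized vectors back to the encoder outputs."""
    return grad_wrt_z_q.copy()

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 4))   # toy codebook: 32 codes of dimension 4
z_e = rng.normal(size=(6, 4))         # toy encoder outputs
target = rng.normal(size=(6, 4))      # hypothetical downstream target

z_q, idx = nearest_code(z_e, codebook)

# Toy loss: sum of squared errors between z_q and the target.
grad_z_q = 2.0 * (z_q - target)               # dLoss/dz_q, derived by hand
grad_z_e = straight_through_grad(grad_z_q)    # copied unchanged to the encoder side

# The encoder outputs receive a usable gradient even though the lookup
# itself is non-differentiable.
z_e_updated = z_e - 0.1 * grad_z_e
```

The gradient is only an approximation: it is evaluated at the quantized vector but applied to the continuous encoder output, which is why the estimator is described as "straight-through" rather than exact.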

Mastering VQGAN and Codebook Image Synthesis

To build a deeper understanding, treat VQGAN and Codebook Image Synthesis as an operating model rather than a single feature: define the outcomes you want, state your assumptions explicitly, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams ground VQGAN and Codebook Image Synthesis in accuracy targets and operational realities such as data quality, lighting variation, and label consistency. They write explicit success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns rather than one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Visual AI can automate inspection, detection, and tagging tasks at scale. At the same time, image rights and consent can become a legal risk when provenance is unclear. The strongest approach pairs experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and continuously update safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Visual AI can automate inspection, detection, and tagging tasks at scale.

Creative teams can prototype concepts quickly with fewer manual revisions.

Operations can act on image and video signals that were previously hard to use.

In high-quality deployments, each of these translates into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of VQGAN and Codebook Image Synthesis

VQGAN's discrete token recipe became the foundation for token-based image and video models, from MaskGIT to multimodal systems that mix image and text tokens in a single transformer. Research is now pushing toward larger, scalar, or lookup-free codebooks that avoid codebook collapse, and toward unified models in which a single vocabulary spans images, audio, and language, enabling any-to-any generation.
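Codebook collapse, where only a small fraction of the codes is ever selected, can be checked with simple usage statistics. The sketch below shows one common diagnostic (fraction of codes used and perplexity of the usage distribution); the function name `codebook_usage` and the synthetic token batches are illustrative assumptions, not output from a real model.

```python
import numpy as np

def codebook_usage(tokens, num_codes):
    """Return (fraction of codes used, perplexity) for a batch of token ids."""
    counts = np.bincount(tokens.reshape(-1), minlength=num_codes)
    probs = counts / counts.sum()
    used_fraction = float((counts > 0).mean())
    nonzero = probs[probs > 0]                      # skip unused codes to avoid log(0)
    entropy = -(nonzero * np.log(nonzero)).sum()
    perplexity = float(np.exp(entropy))             # "effective" number of codes in use
    return used_fraction, perplexity

rng = np.random.default_rng(0)
# Synthetic stand-ins for token grids from 64 images of 16x16 tokens each.
healthy = rng.integers(0, 1024, size=(64, 16, 16))    # codes chosen roughly uniformly
collapsed = rng.integers(0, 8, size=(64, 16, 16))     # only 8 of 1024 codes ever chosen

healthy_stats = codebook_usage(healthy, 1024)
collapsed_stats = codebook_usage(collapsed, 1024)
print(healthy_stats)
print(collapsed_stats)
```

A perplexity far below the codebook size suggests that most of the vocabulary is going unused, which is the symptom the newer codebook designs aim to avoid.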

Real-World Applications

Encoding an image into a 16x16 grid of codebook tokens so a transformer can model and regenerate it

Pairing VQGAN with CLIP guidance to create the surreal 'VQGAN+CLIP' AI art that went viral in 2021

Compressing images into compact codes for efficient storage or downstream generative training

Serving as the image tokenizer inside larger token-based generators such as MaskGIT and multimodal transformers
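The token counts behind these applications follow from simple arithmetic. The sketch below works through one case, assuming a 256x256 RGB input and a downsampling factor of 16 (both assumptions chosen to yield the 16x16 grid mentioned above) together with the 1024-code codebook used as an example earlier. The helper name `token_budget` is hypothetical.

```python
import math

def token_budget(image_size, downsample, num_codes, channels=3, bits_per_channel=8):
    """Compare the size of a token grid with the raw pixel data it replaces."""
    grid = image_size // downsample                       # tokens per side
    num_tokens = grid * grid                              # sequence length
    bits_per_token = math.ceil(math.log2(num_codes))      # bits to index one code
    token_bytes = num_tokens * bits_per_token / 8
    raw_bytes = image_size * image_size * channels * bits_per_channel / 8
    return {
        "grid": (grid, grid),
        "num_tokens": num_tokens,
        "bits_per_token": bits_per_token,
        "token_bytes": token_bytes,
        "raw_bytes": raw_bytes,
        "ratio": raw_bytes / token_bytes,
    }

budget = token_budget(image_size=256, downsample=16, num_codes=1024)
print(budget)
```

Under these assumptions the image becomes 256 tokens of 10 bits each, or 320 bytes against 196,608 bytes of raw pixels. The comparison ignores the size of the decoder and codebook, and the reconstruction is lossy, so the ratio is a sequence-length intuition rather than a codec benchmark.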

Usage Patterns

Across each of the applications listed above, teams typically get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Image rights and consent can become a legal risk when provenance is unclear.

Model performance can vary across lighting conditions, demographics, and environments.

False positives can go unnoticed unless confidence thresholds are monitored.

Implementation Guide

1. Define acceptance criteria for precision, recall, and error costs.

2. Evaluate on data that resembles real production conditions.

3. Add human review for low-confidence or high-impact predictions.

4. Monitor model drift and revalidate after camera or dataset changes.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
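An evidence gate of this kind can be expressed as a small check on evaluation counts. The sketch below is illustrative only: the function name `evidence_gate`, the example counts, and the thresholds are all assumptions, and real acceptance criteria would also weigh error costs.

```python
def evidence_gate(tp, fp, fn, min_precision, min_recall):
    """Decide whether a rollout step may proceed, given evaluation counts.

    tp, fp, fn: true positives, false positives, false negatives
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    passed = precision >= min_precision and recall >= min_recall
    return {"precision": precision, "recall": recall, "passed": passed}

# Hypothetical evaluation on production-like data.
result = evidence_gate(tp=90, fp=10, fn=30, min_precision=0.85, min_recall=0.80)
if result["passed"]:
    print("Criteria met: expand usage.")
else:
    print("Criteria not met: pause the rollout and close the gap.", result)
```

In this example precision clears its threshold but recall does not, so the gate holds the rollout until the gap is closed.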
