Visual AI GUIDE

VQGAN and Codebook Image Synthesis

VQGAN compresses pixels into a grid of discrete tokens drawn from a learned codebook, letting a transformer generate images the same way a language model generates text.

Summary

VQGAN and codebook image synthesis is a practical area of computer vision that interprets or generates visual media for analysis, operations, and creative work.

Deep Dive

VQGAN, introduced in the 2021 paper 'Taming Transformers for High-Resolution Image Synthesis,' combines a vector-quantized autoencoder (VQ-VAE) with adversarial and perceptual training. An encoder maps the image to a small grid of feature vectors. Each vector is snapped to its nearest entry in a learned codebook of, say, 1024 distinct codes, which turns the image into a sequence of integer tokens. A decoder reconstructs the image from those tokens and is trained with a GAN discriminator and a perceptual loss so that reconstructions come out sharp rather than blurry. Because an image is now a sequence of discrete tokens, an autoregressive transformer can model it like language, predicting one token at a time. Paired with CLIP guidance, VQGAN powered early text-to-image art tools.
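The quantization step described above can be illustrated with a minimal sketch. It assumes random stand-ins for the encoder output and the learned codebook, a feature dimension of 4, and a 16 x 16 grid; the function names are hypothetical and are not taken from the VQGAN codebase.

```python
import random

def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def tokenize(feature_grid, codebook):
    """Map an H x W grid of feature vectors to an H x W grid of integer tokens."""
    return [[nearest_code(vec, codebook) for vec in row] for row in feature_grid]

def detokenize(token_grid, codebook):
    """Look each token up again to recover the quantized vectors for a decoder."""
    return [[codebook[token] for token in row] for row in token_grid]

# Assumed sizes: 1024 codes (as in the text), 4-dim features, 16 x 16 grid.
rng = random.Random(0)
DIM, NUM_CODES, GRID = 4, 1024, 16
codebook = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_CODES)]
features = [[[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(GRID)]
            for _ in range(GRID)]

tokens = tokenize(features, codebook)
# Flatten in raster-scan order: this is the sequence a transformer would model.
sequence = [token for row in tokens for token in row]
print(len(sequence), "tokens, first five:", sequence[:5])
```

In a real model the features come from a trained convolutional encoder and the codebook is learned jointly with it; only the nearest-neighbour lookup is shown here.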

Technical Insight

The core operation is vector quantization: the encoder's continuous outputs are replaced by their nearest codebook vectors, and a 'straight-through' gradient estimator lets the encoder keep learning even though the lookup itself is not differentiable. Adding a patch-based GAN discriminator on top of the autoencoder is what lets VQGAN use a much smaller token grid (for example 16 × 16) than VQ-VAE while still preserving crisp textures, which in turn makes transformer modeling tractable.
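A small sketch may help make the straight-through idea concrete. It works out the gradients by hand for a single feature vector, under the assumption of a squared-error reconstruction loss and a two-entry toy codebook; the commitment weight of 0.25 and all function names are illustrative assumptions, and a real implementation would rely on an autodiff framework's stop-gradient operation instead.

```python
def nearest(z_e, codebook):
    """Return the codebook vector closest to the encoder output z_e."""
    return min(codebook, key=lambda code: sum((a - b) ** 2 for a, b in zip(z_e, code)))

def vq_forward(z_e, codebook):
    """Forward pass: the decoder sees the nearest code, not z_e itself."""
    return list(nearest(z_e, codebook))

def vq_backward(grad_z_q):
    """Straight-through backward pass: treat the lookup as the identity,
    so the gradient w.r.t. z_q is copied unchanged to z_e."""
    return list(grad_z_q)

def vq_losses(z_e, z_q, beta=0.25):
    """Codebook loss pulls the code toward z_e; the commitment loss (weighted
    by beta, an assumed value) pulls z_e toward the code."""
    sq = sum((a - b) ** 2 for a, b in zip(z_e, z_q))
    return sq, beta * sq

codebook = [[0.0, 0.0], [1.0, 1.0]]   # toy codebook (assumption)
z_e = [0.9, 0.7]                      # encoder output
target = [1.0, 0.5]                   # what the decoder input "should" be

z_q = vq_forward(z_e, codebook)
# Gradient of sum((z_q - target)^2) with respect to z_q.
grad_z_q = [2 * (q - t) for q, t in zip(z_q, target)]
grad_z_e = vq_backward(grad_z_q)

lr = 0.1
z_e_updated = [z - lr * g for z, g in zip(z_e, grad_z_e)]
print("z_q:", z_q, "grad to encoder:", grad_z_e, "updated z_e:", z_e_updated)
```

The true derivative of the nearest-code lookup is zero almost everywhere, so without the copy in `vq_backward` the encoder would receive no reconstruction signal at all.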

Mastering VQGAN and Codebook Image Synthesis

To build a deep understanding, treat VQGAN and codebook image synthesis as a working model rather than a single feature: define the desired outcome, make assumptions explicit, and separate what the system does well from what still needs expert judgment.

In practice, strong teams weigh accuracy against operational realities such as data quality, lighting variation, and labeling. They write down clear success criteria, test against realistic data and workflows, and iterate on observed failure patterns rather than one-off wins. This is where conceptual understanding turns into durable capability across product, policy, and operations.

Visual AI can automate inspection, detection, and labeling tasks at scale. At the same time, image rights and consent can become legal risks if they are left unclear. The most resilient approach combines experimentation speed with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and legal requirements change.

Strategic Impact

Visual AI can automate inspection, detection, and labeling tasks at scale.

Creative teams can prototype concepts quickly with fewer manual revisions.

Operations can use image and video signals that are otherwise hard to structure.

At an advanced level, each of these translates into measurable operating rules, clear ownership boundaries, and recurring review rituals, so that teams scale with confidence rather than escalating uncertainty.

The Future of VQGAN and Codebook Image Synthesis

VQGAN's discrete-token recipe became the foundation for token-based image and video models, from MaskGIT to multimodal systems that mix image and text tokens in a single transformer. Research is now pushing toward larger codebooks, toward finite scalar or lookup-free quantization, which avoids codebook collapse, and toward unified models in which one vocabulary spans images, audio, and language, enabling any-to-any generation.
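As a rough illustration of why such schemes sidestep codebook collapse, here is a simplified sketch in the spirit of finite scalar quantization. Each channel is bounded and rounded to a few fixed levels, so the codebook is implicit and has no learned entries that could go unused. The level counts are hypothetical, only odd level counts are handled, and the straight-through gradient needed for training is omitted.

```python
import math

LEVELS = [7, 5, 5, 5]  # hypothetical number of levels per channel (odd only)

def fsq_quantize(z, levels=LEVELS):
    """Bound each channel with tanh, round it to the nearest integer level,
    and shift it so every digit lies in 0 .. num_levels - 1."""
    digits = []
    for value, num_levels in zip(z, levels):
        half = (num_levels - 1) // 2
        bounded = math.tanh(value) * half      # in the range [-half, half]
        digits.append(round(bounded) + half)   # in the range 0 .. num_levels - 1
    return digits

def digits_to_token(digits, levels=LEVELS):
    """Combine per-channel digits into one integer token (mixed radix)."""
    token, basis = 0, 1
    for digit, num_levels in zip(digits, levels):
        token += digit * basis
        basis *= num_levels
    return token

def implicit_codebook_size(levels=LEVELS):
    """Number of distinct tokens: the product of the per-channel level counts."""
    return math.prod(levels)

z = [0.3, -1.2, 2.0, 0.0]  # stand-in for one encoder output vector
digits = fsq_quantize(z)
print("digits:", digits, "token:", digits_to_token(digits),
      "of", implicit_codebook_size(), "possible tokens")
```

Because every token is reachable by construction, there is no nearest-neighbour search and no separate codebook or commitment loss in this simplified form.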

Real-World Applications

Encoding a photo into a 16 × 16 grid of codebook tokens so that a transformer can model and regenerate it.

Combining VQGAN with CLIP guidance to produce the 'VQGAN+CLIP' AI art that went viral in 2021.

Compressing images into compact discrete codes for efficient storage or downstream generative training.

Serving as the image tokenizer inside large token-based generators such as MaskGIT and multimodal transformers.
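The compactness of these token grids is easy to check with a little arithmetic. The sketch below assumes a 256 × 256 RGB input at 8 bits per channel and a downsampling factor of 16, which yields the 16 × 16 grid mentioned above; the factor-4 grid is a hypothetical comparison point, and the helper name is illustrative.

```python
def token_budget(image_size, downsample, num_codes):
    """Return (grid side, token count, total bits) for a square image."""
    side = image_size // downsample
    tokens = side * side
    bits_per_token = (num_codes - 1).bit_length()  # 1024 codes -> 10 bits
    return side, tokens, tokens * bits_per_token

IMAGE_SIZE, NUM_CODES = 256, 1024           # assumed input size and codebook size
raw_bits = IMAGE_SIZE * IMAGE_SIZE * 3 * 8  # uncompressed 8-bit RGB

side, tokens, bits = token_budget(IMAGE_SIZE, 16, NUM_CODES)
print(f"{side}x{side} grid: {tokens} tokens, {bits // 8} bytes, "
      f"{raw_bits / bits:.1f}x smaller than raw pixels")

# Hypothetical finer grid for comparison: self-attention cost grows with the
# square of the sequence length, so longer sequences get expensive quickly.
_, fine_tokens, _ = token_budget(IMAGE_SIZE, 4, NUM_CODES)
print(f"factor-4 grid: {fine_tokens} tokens, "
      f"{(fine_tokens // tokens) ** 2}x more attention pairs")
```

These figures describe the size of the token representation only; reconstruction quality depends on the trained encoder, decoder, and codebook.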

Implementation Patterns

VQGAN and Codebook Image Synthesis in practice

Whichever of the applications listed above a team pursues, results are usually best when the team defines quality thresholds up front, keeps a human escalation path for edge cases, and tracks both productivity gains and error costs over time.

Risks & Guardrails

Image rights and consent can become legal risks if they are left unclear.

Model performance can vary across lighting, demographics, and environments.

Positives can be missed if confidence thresholds are not audited.

Implementation Roadmap

1. Define acceptance criteria for precision, recall, and error cost.

2. Test on data that matches real production conditions.

3. Add human review for low-confidence or high-impact predictions.

4. Monitor for model drift and retrain after camera or dataset changes.

Treat each step as an evidence gate: if the standard is not met, pause the rollout, close the gap, and only then expand usage.
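The evidence-gate idea in the roadmap can be sketched as a small check. Everything here is a hypothetical illustration: the class and function names, the threshold values, and the per-error costs are assumptions, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class Criteria:
    """Acceptance criteria agreed before rollout (values are assumptions)."""
    min_precision: float
    min_recall: float
    max_error_cost: float

def evaluate(tp, fp, fn, cost_fp, cost_fn):
    """Compute precision, recall, and total error cost from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    error_cost = fp * cost_fp + fn * cost_fn
    return precision, recall, error_cost

def gate_passes(metrics, criteria):
    """Evidence gate: every criterion must be met before usage expands."""
    precision, recall, error_cost = metrics
    return (precision >= criteria.min_precision
            and recall >= criteria.min_recall
            and error_cost <= criteria.max_error_cost)

def needs_human_review(confidence, high_impact, threshold=0.8):
    """Route low-confidence or high-impact predictions to a person."""
    return high_impact or confidence < threshold

criteria = Criteria(min_precision=0.85, min_recall=0.80, max_error_cost=200.0)
metrics = evaluate(tp=90, fp=10, fn=30, cost_fp=1.0, cost_fn=5.0)

if gate_passes(metrics, criteria):
    print("Gate passed: expand usage.")
else:
    print("Gate failed: pause the rollout and close the gap.", metrics)
```

Rerunning the same check after a camera or dataset change gives a simple drift signal for the last step of the roadmap.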
