VISUAL AI GUIDE

CogVideo and CogVideoX

CogVideo (2022) was the first large open-source text-to-video model, and CogVideoX (2024) is its far more capable open successor from Tsinghua/Zhipu AI.

Overview

Both models matter because they put high-quality video generation in the hands of the open community, not just large corporate labs.

CogVideo and CogVideoX belong to the family of computer vision workflows that interpret or generate visual media for analysis, automation, and creative work.

Deep Dive

CogVideo, released in 2022, was built on the CogView2 text-to-image transformer and used a multi-frame-rate, autoregressive approach to generate short clips. It was the first large openly released text-to-video model and supported Chinese and English prompts. Its 2024 successor, CogVideoX, was redesigned from the ground up: it uses a 3D causal variational autoencoder to compress video in both space and time, followed by a diffusion Expert Transformer that attends jointly over concatenated text and video tokens. CogVideoX models (in sizes such as 2B and 5B parameters) generate several seconds of coherent, moving video at resolutions such as 720x480, and they support image-to-video and video continuation. Crucially, the weights and code are public, which has fueled a wave of community fine-tunes, tools, and research.

Technical Insight

CogVideoX's 3D causal VAE compresses raw video into a compact latent volume, cutting the number of tokens so that the transformer can model long sequences affordably. The Expert Transformer uses adaptive layer normalization and concatenates text and visual tokens so that the two modalities attend to each other directly, which improves text-video alignment. Progressive training over increasing resolutions and durations, together with careful data captioning, yields smooth, consistent motion.
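
To make the compression claim concrete, the token arithmetic can be sketched as follows. The factors used here (4x temporal and 8x8 spatial compression in the VAE, a 2x2 spatial patch in the transformer, and a 49-frame clip) are assumptions based on the publicly described CogVideoX configuration, not figures stated in this guide, so treat the numbers as illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionConfig:
    """Assumed compression factors; illustrative, not taken from this guide."""
    temporal_factor: int = 4  # assumed: one latent frame per 4 input frames
    spatial_factor: int = 8   # assumed: VAE shrinks height and width by 8
    patch_size: int = 2       # assumed: 2x2 latent positions form one token


def latent_frames(num_frames: int, cfg: CompressionConfig) -> int:
    # A causal VAE encodes the first frame on its own, then compresses the
    # remaining frames in groups of `temporal_factor`.
    return 1 + (num_frames - 1) // cfg.temporal_factor


def video_token_count(num_frames: int, height: int, width: int,
                      cfg: CompressionConfig = CompressionConfig()) -> int:
    # Each token covers spatial_factor * patch_size pixels along each side.
    pixels_per_token_side = cfg.spatial_factor * cfg.patch_size
    rows = height // pixels_per_token_side
    cols = width // pixels_per_token_side
    return latent_frames(num_frames, cfg) * rows * cols


if __name__ == "__main__":
    frames, height, width = 49, 480, 720
    raw_positions = frames * height * width
    tokens = video_token_count(frames, height, width)
    print(f"raw pixel positions: {raw_positions:,}")
    print(f"video tokens:        {tokens:,}")
    print(f"reduction factor:    {raw_positions // tokens}x")
```

Under these assumptions, the transformer attends over tens of thousands of video tokens instead of millions of pixel positions, which is what makes joint attention over text and video practical.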

Mastering CogVideo and CogVideoX

To build a deep understanding, treat CogVideo and CogVideoX as an operating model rather than a single feature: define the desired outcomes, make your assumptions explicit, and separate what the system can do reliably from what still needs expert judgment.

In practice, strong teams balance the accuracy of CogVideo and CogVideoX against operational realities such as data quality, lighting variation, and label consistency. They write clear success criteria, evaluate against real-world data and workflows, and iterate on observed failure patterns rather than on one-time benchmark wins. That is where theoretical understanding turns into durable capability across product, policy, and operations.

Visual AI can automate inspection, detection, and tagging tasks at scale. At the same time, image rights and consent can become a legal risk when provenance is unclear. The strongest approach pairs the speed of experimentation with governance discipline: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Visual AI can automate inspection, detection, and tagging tasks at scale.

Creative teams can iterate on concepts quickly with fewer manual revisions.

Operations can act on image and video signals that were previously hard to process.

In high-quality deployments, these benefits translate into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of CogVideo and CogVideoX

As one of the strongest open video models, CogVideoX supports a fast-growing ecosystem of fine-tunes, control adapters, and long-duration extensions. Expect continued gains in clip length, resolution, motion realism, and controllability, along with tighter integration with image-to-video and editing workflows. Its open weights mean that nonprofits, researchers, and small studios can build on frontier-class video generation without proprietary gatekeeping, which accelerates both creative experimentation and safety-focused research.

Real-World Applications

Generating a short narrative clip from a Chinese or English prompt using fully open weights

Turning a single uploaded still image into a moving video with CogVideoX image-to-video

Fine-tuning the open model on a custom style or character for indie animation

Benchmarking new video generation methods against a reproducible open baseline

Implementation Patterns

CogVideo and CogVideoX in practice

Generating a short narrative clip from a Chinese or English prompt using fully open weights.

Turning a single uploaded still image into a moving video with CogVideoX image-to-video.

Fine-tuning the open model on a custom style or character for indie animation.

Benchmarking new video generation methods against a reproducible open baseline.

Across all four patterns, teams generally get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.
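
The first pattern, a short clip from a text prompt, can be sketched with the Hugging Face `diffusers` library. This is a minimal sketch rather than a recommended production setup. It assumes that `torch`, `diffusers`, and `accelerate` are installed, that a CUDA GPU with enough memory is available, and that the `THUDM/CogVideoX-2b` checkpoint is the one you want. The prompt, seed, and sampling settings are illustrative assumptions, and `ClipSettings`, `pipeline_kwargs`, and `generate_clip` are hypothetical helper names.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClipSettings:
    """Illustrative sampling settings; tune them for your own use case."""
    prompt: str
    num_frames: int = 49           # assumed clip length (about 6 s at 8 fps)
    num_inference_steps: int = 50  # assumed number of denoising steps
    guidance_scale: float = 6.0    # assumed classifier-free guidance strength
    seed: int = 42                 # fixed seed so reruns are comparable


def pipeline_kwargs(settings: ClipSettings) -> dict:
    # Everything except the seed is passed straight to the pipeline call.
    kwargs = asdict(settings)
    kwargs.pop("seed")
    return kwargs


def generate_clip(settings: ClipSettings, out_path: str = "clip.mp4",
                  model_id: str = "THUDM/CogVideoX-2b") -> str:
    # Heavy imports live here so the helpers above work without a GPU stack.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe.enable_model_cpu_offload()  # trades speed for lower GPU memory use

    generator = torch.Generator(device="cpu").manual_seed(settings.seed)
    result = pipe(generator=generator, **pipeline_kwargs(settings))
    export_to_video(result.frames[0], out_path, fps=8)
    return out_path


if __name__ == "__main__":
    settings = ClipSettings(prompt="A paper boat drifting down a rainy street")
    print(pipeline_kwargs(settings))
    # generate_clip(settings)  # uncomment on a machine with a suitable GPU
```

Keeping the settings in one frozen object and fixing the seed also serves the benchmarking pattern, since a baseline is only reproducible if the sampling configuration is recorded alongside the output.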

Risks & Guardrails

Image rights and consent can become a legal risk when provenance is unclear.

Model performance can vary across lighting conditions, demographics, and environments.

False positives can go unnoticed unless confidence levels are monitored.
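
One way to surface the last two risks is to break results down by condition instead of relying on a single overall score. The sketch below assumes a hypothetical set of review records in which an automated check (for example, a filter run over generated clips) flagged an item and a human reviewer confirmed or rejected the flag. The record layout, slice names, and the 0.6 precision floor are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical review records: (slice label, model flagged it, reviewer confirmed it).
RECORDS = [
    ("daylight", True, True),
    ("daylight", True, True),
    ("daylight", True, False),
    ("low_light", True, False),
    ("low_light", True, False),
    ("low_light", True, True),
    ("low_light", False, False),
]


def precision_by_slice(records):
    """Return {slice: precision} computed over the items the model flagged."""
    flagged = defaultdict(int)
    correct = defaultdict(int)
    for slice_name, predicted, actual in records:
        if predicted:
            flagged[slice_name] += 1
            if actual:
                correct[slice_name] += 1
    return {name: correct[name] / flagged[name] for name in flagged}


def weak_slices(records, min_precision=0.6):
    """List slices whose precision falls below the assumed floor."""
    scores = precision_by_slice(records)
    return sorted(name for name, score in scores.items() if score < min_precision)


if __name__ == "__main__":
    print(precision_by_slice(RECORDS))
    print(weak_slices(RECORDS))
```

A slice that falls below the floor is a prompt to collect more data for that condition or to route it to human review, not proof that the model is unusable there.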

Implementation Guide

1. Define acceptance thresholds for precision, recall, and error cost.

2. Validate with data that resembles real production conditions.

3. Add human review for low-confidence or high-impact predictions.

4. Track model drift and revalidate after camera or dataset changes.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
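
The first three steps can be sketched as a small evidence gate. The thresholds, class name, and function names below are hypothetical and chosen only for illustration. In practice the thresholds would come from your own error-cost analysis, and the counts would come from an evaluation on production-like data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Assumed thresholds; set these from your own error-cost analysis."""
    min_precision: float = 0.90
    min_recall: float = 0.80
    review_below_confidence: float = 0.70


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    # Guard against empty denominators so the gate fails closed.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def gate_passes(tp: int, fp: int, fn: int, criteria: AcceptanceCriteria) -> bool:
    # Steps 1 and 2: compare quality measured on production-like data
    # against the agreed thresholds before expanding the rollout.
    precision, recall = precision_recall(tp, fp, fn)
    return precision >= criteria.min_precision and recall >= criteria.min_recall


def needs_human_review(confidence: float, high_impact: bool,
                       criteria: AcceptanceCriteria) -> bool:
    # Step 3: low-confidence or high-impact predictions go to a person.
    return high_impact or confidence < criteria.review_below_confidence


if __name__ == "__main__":
    criteria = AcceptanceCriteria()
    print("expand rollout:", gate_passes(tp=90, fp=5, fn=15, criteria=criteria))
    print("review needed:", needs_human_review(0.65, False, criteria))
```

Step 4 fits the same shape: rerun the gate on fresh data after any camera or dataset change, and pause the rollout if it no longer passes.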

Keep Exploring