Visual AI Guide

Masked Autoencoders

Overview

A Masked Autoencoder (MAE) is a self-supervised method that teaches a vision model to reconstruct images after most of each image has been hidden. By learning to fill in the gaps, the model builds a rich visual understanding without human labels.

Masked Autoencoders belong to the family of computer vision workflows that interpret or generate visual media for analysis, operations, and creative work.

Deep Dive

Masked Autoencoders, introduced by Kaiming He and colleagues at Meta AI in 2021, take an image, split it into small patches, and randomly hide a very large share of them, typically 75%. A Vision Transformer encoder processes only the visible patches, while a lightweight decoder tries to reconstruct the original pixels of the missing ones. Because so much is hidden, the model cannot simply copy neighboring pixels and has to learn meaningful structure, such as shapes and object parts. Letting the encoder skip the masked patches makes training faster and more memory-efficient. After pre-training, the decoder is discarded and the encoder transfers well to classification, detection, and segmentation tasks.
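The patch-and-mask step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `patchify` and `random_mask`, the 224x224 input, and the 16-pixel patch size are assumptions chosen for the example.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)

def random_mask(num_patches, mask_ratio, rng):
    """Return (visible_idx, masked_idx) for a uniformly random mask."""
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    masked_idx = np.sort(perm[:num_masked])
    visible_idx = np.sort(perm[num_masked:])
    return visible_idx, masked_idx

# Illustrative sizes (assumptions): a 224x224 RGB image and 16x16 patches.
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))
patches = patchify(image, patch_size=16)        # 196 patches, 768 values each
visible_idx, masked_idx = random_mask(len(patches), 0.75, rng)
encoder_input = patches[visible_idx]            # only these reach the encoder
```

With these sizes the encoder receives 49 of the 196 patches, which is where the training speed and memory savings come from.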

Technical Insight

The key trick is asymmetry: a heavy encoder sees only the 25% of patches that remain visible, while a small decoder reconstructs the rest. Patches are flattened, linearly embedded, and given positional encodings. The reconstruction loss is the mean squared error computed only on the masked patches, often against normalized pixel values. High masking ratios force semantic learning rather than low-level interpolation, and skipping masked tokens in the encoder cuts compute substantially compared with processing the full image.
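The loss described above can be written down directly. The sketch below assumes patches are already flattened into `(N, D)` arrays and that normalization means standardizing each target patch by its own mean and variance; the function name `masked_mse_loss` and the `eps` value are illustrative choices, not taken from the text.

```python
import numpy as np

def masked_mse_loss(target_patches, predicted_patches, mask,
                    normalize=True, eps=1e-6):
    """Mean squared error averaged over masked patches only.

    target_patches, predicted_patches: (N, D) arrays of flattened patches.
    mask: (N,) array, 1 where the patch was hidden, 0 where it was visible.
    """
    target = target_patches
    if normalize:
        # Assumption: standardize each target patch independently.
        mean = target.mean(axis=-1, keepdims=True)
        var = target.var(axis=-1, keepdims=True)
        target = (target - mean) / np.sqrt(var + eps)
    per_patch = ((predicted_patches - target) ** 2).mean(axis=-1)  # (N,)
    mask = mask.astype(per_patch.dtype)
    # Visible patches are multiplied by 0 and contribute nothing.
    return (per_patch * mask).sum() / mask.sum()

# Toy usage: 8 patches of 12 values, 5 of them masked.
rng = np.random.default_rng(1)
target = rng.random((8, 12))
prediction = rng.random((8, 12))
mask = np.array([1, 1, 0, 0, 1, 0, 1, 1])
loss = masked_mse_loss(target, prediction, mask)
```

Because visible patches carry zero weight, the decoder is never rewarded for reproducing pixels the encoder already saw.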

Mastering Masked Autoencoders

To build a deep understanding, treat Masked Autoencoders as an operating model rather than a single feature: define the outcomes you want, make your assumptions explicit, and separate what the system can do reliably from what still requires expert judgment.

In practice, strong teams balance Masked Autoencoder accuracy against operational realities such as data quality, lighting variation, and label consistency. They write down clear success criteria, evaluate against real data and real workflows, and iterate on observed failure patterns instead of one-off benchmark wins. This is where theoretical understanding turns into durable capability across product, policy, and operations.

Visual AI can scale inspection, detection, and tagging tasks. At the same time, image rights and consent can become a legal risk when provenance is unclear. The strongest approach combines the speed of experimentation with the discipline of governance: run pilots, capture evidence, publish decision logs, and keep updating safeguards as model behavior, user expectations, and regulatory requirements change.

Strategic Impact

Visual AI can scale inspection, detection, and tagging tasks.

Creative teams can prototype concepts quickly with fewer manual revisions.

Operations can use image and video signals that were previously hard to act on.

In high-quality deployments, these benefits translate into measurable operating rules, clear ownership boundaries, and repeatable review practices, so that teams scale confidence rather than ambiguity.

The Future of Masked Autoencoders

MAE-style masked reconstruction is becoming a default pre-training recipe across modalities. Researchers are extending it to video (masking spatiotemporal cubes), audio spectrograms, medical scans, and satellite imagery, where labels are scarce and expensive. Expect tighter integration with language in multimodal foundation models, more efficient decoders, and adaptive masking that targets informative regions. As compute grows, masked pre-training on large collections of unlabeled images should keep improving downstream accuracy while reducing reliance on costly human annotation.

Real-World Applications

Pre-training a Vision Transformer on millions of unlabeled images, then fine-tuning it for ImageNet classification with strong accuracy.

Learning features from unlabeled medical scans (X-rays, MRIs), where expert annotation is expensive and scarce.

Adapting the method to video by masking spatiotemporal patches to pre-train action recognition models (VideoMAE).

Pre-training on satellite and aerial imagery to support land-use mapping and change detection without hand-made labels.
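For the video case, the text only says that spatiotemporal patches are masked. One way to do this, assumed here for illustration, is tube masking: the same spatial positions are hidden in every frame, so a hidden patch cannot be recovered by copying it from a neighboring frame. The function name `tube_mask`, the 14x14 grid, the 8 frames, and the 0.9 ratio are all hypothetical choices for the sketch.

```python
import numpy as np

def tube_mask(num_frames, grid_h, grid_w, mask_ratio, rng):
    """Boolean mask of shape (num_frames, grid_h, grid_w); True = hidden.

    The same spatial patches are hidden in every frame ("tubes").
    """
    num_spatial = grid_h * grid_w
    num_masked = int(round(num_spatial * mask_ratio))
    spatial = np.zeros(num_spatial, dtype=bool)
    hidden = rng.choice(num_spatial, size=num_masked, replace=False)
    spatial[hidden] = True
    # Repeat the single spatial mask along the time axis.
    frame_mask = spatial.reshape(1, grid_h, grid_w)
    return np.broadcast_to(frame_mask, (num_frames, grid_h, grid_w)).copy()

# Illustrative sizes (assumptions): 8 frames, a 14x14 patch grid, 90% hidden.
rng = np.random.default_rng(0)
mask = tube_mask(num_frames=8, grid_h=14, grid_w=14, mask_ratio=0.9, rng=rng)
visible_per_frame = int((~mask[0]).sum())
```

Video is highly redundant over time, which is why a sketch like this uses a higher masking ratio than the 75% typical for still images.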

Application Patterns

Masked Autoencoders in action

Whether the task is ImageNet fine-tuning, medical scans, video, or satellite imagery, teams typically get better results when they define quality thresholds up front, keep a human escalation path for edge cases, and track both productivity gains and error costs over time.

Risks & Guardrails

Image rights and consent can become a legal risk when provenance is unclear.

Model performance can vary across lighting conditions, demographics, and environments.

False positives may go unnoticed unless confidence levels are monitored.

Implementation Guide

1. Define acceptance criteria for precision, recall, and error costs.

2. Test with data that resembles real production conditions.

3. Add human review for low-confidence or high-impact predictions.

4. Monitor model drift and revalidate after camera or dataset changes.

Treat each step as an evidence gate: if the criteria are not met, pause the rollout, close the gap, and only then expand usage.
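The evidence-gate idea can be made concrete with a small check that runs on a labeled evaluation set. The sketch below is one possible shape for it, not a prescribed process: the names `GateResult`, `evaluate_gate`, and `route`, and every threshold and cost value, are hypothetical and would need to be set per deployment.

```python
from dataclasses import dataclass

@dataclass
class GateResult:
    precision: float
    recall: float
    error_cost_per_item: float
    passed: bool

def evaluate_gate(y_true, y_pred, min_precision, min_recall,
                  cost_fp, cost_fn, max_cost_per_item):
    """Compare binary predictions against acceptance criteria."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    cost = (fp * cost_fp + fn * cost_fn) / len(y_true)
    passed = (precision >= min_precision and recall >= min_recall
              and cost <= max_cost_per_item)
    return GateResult(precision, recall, cost, passed)

def route(confidence, high_impact, review_threshold=0.8):
    """Send low-confidence or high-impact predictions to a human reviewer."""
    if high_impact or confidence < review_threshold:
        return "human_review"
    return "auto"

# Toy evaluation set; all thresholds and costs are made-up values.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
result = evaluate_gate(y_true, y_pred, min_precision=0.9, min_recall=0.7,
                       cost_fp=1.0, cost_fn=5.0, max_cost_per_item=1.0)
if not result.passed:
    print("Gate failed: pause the rollout and close the gap before expanding.")
```

Rerunning the same check after a camera or dataset change gives a simple way to revalidate and to spot drift against the original acceptance criteria.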
