List: multimodal (4)
정화 코딩
https://arxiv.org/abs/2305.05665 - ImageBind: One Embedding Space To Bind Them All
"We present ImageBind, an approach to learn a joint embedding across six different modalities - images, text, audio, depth, thermal, and IMU data."
1. Introduction - Idea: the binding capability of images -> with various sensors and ..
https://arxiv.org/abs/2106.13043 - AudioCLIP: Extending CLIP to Image, Text and Audio
"In the past, the rapidly evolving field of sound classification greatly benefited from the application of methods from other domains."
1. Introduction - Advances in the field of audio classification. But until then, only audio ..
Paper: https://arxiv.org/abs/2104.12763 - MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
"Multi-modal reasoning systems rely on a pre-trained object detector to extract regions of interest from the image."
GitHub (code): https://github.com..
https://arxiv.org/abs/2104.12763 - MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
0. Abstract - Multi-modal reasoni..