Paper link: https://arxiv.org/abs/2306.13549
A Survey on Multimodal Large Language Models
Recently, Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of MLLM, such as wr