Category: MLLM (2)
Paper link: https://arxiv.org/abs/2306.14824
Kosmos-2: Grounding Multimodal Large Language Models to the World
"We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world. Specifically, we represent refer expressions as links in Markdown, i…"
Published: 2023.07 (arXiv)
Paper link: https://arxiv.org/abs/2302.14045
Language Is Not All You Need: Aligning Perception with Language Models
"A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence. In this work, we introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn …"
Published: 2023.03 (arXiv)