Paper Reviews (14)
Though the beginning was humble, the end shall be great
Paper link: https://arxiv.org/abs/2103.13313 In-flight positional and energy use data set of a DJI Matrice 100 quadcopter for small package delivery We autonomously direct a small quadcopter package delivery Uncrewed Aerial Vehicle (UAV) or "drone" to take off, fly a specified route, and land for a total of 209 flights while varying a set of operational parameters. The vehicle was equipped with onbo..
Paper link: https://www.mdpi.com/2072-4292/15/8/2139 CapERA: Captioning Events in Aerial Videos In this paper, we introduce the CapERA dataset, which upgrades the Event Recognition in Aerial Videos (ERA) dataset to aerial video captioning. The newly proposed dataset aims to advance visual–language-understanding tasks for UAV videos by providing eac.. Published: 2023.04.18 (MDPI- Remote S..
Paper link: https://arxiv.org/abs/2206.06488 Multimodal Learning with Transformers: A Survey Transformer is a promising neural network learner, and has achieved great success in various machine learning tasks. Thanks to the recent prevalence of multimodal applications and big data, Transformer-based multimodal learning has become a hot topic in AI.. ❏ Post contents (not the paper's table of contents; only the important and necessary information was read and summarized): ..
class SwinTransformer(nn.Module): r""" Swin Transformer A PyTorch impl of : `Swin Transformer: Hierarchical Vision Transformer using Shifted Windows` - https://arxiv.org/pdf/2103.14030 Args: img_size (int | tuple(int)): Input image size. Default 224 patch_size (int | tuple(int)): Patch size. Default: 4 in_chans (int): Number of input image channels. Default: 3 num_classes (int): Number of classe..
Paper link: https://openaccess.thecvf.com/content/ICCV2021/html/Liu_Swin_Transformer_Hierarchical_Vision_Transformer_Using_Shifted_Windows_ICCV_2021_paper ICCV 2021 Open Access Repository Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo; Proceedings of the IEEE/CVF International Conference o..