Multimodal Pretraining
Investigate pretraining strategies for models combining language, vision, and other modalities.
CLIP, DALL·E, contrastive learning, image-text alignment, cross-modal embeddings, multimodal datasets
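The contrastive image-text alignment objective mentioned above (the core of CLIP-style pretraining) can be sketched as a symmetric cross-entropy over a batch of paired image and text embeddings. The snippet below is a minimal NumPy illustration, not any library's actual API; the function name, embedding shapes, and temperature value are illustrative assumptions.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss over a batch of image-text pairs.

    image_emb, text_emb: arrays of shape (batch, dim) where row i of each
    array comes from the same image-text pair. Names and the temperature
    default are illustrative, not a specific library's API.
    """
    # L2-normalize so the dot product is cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by temperature
    logits = image_emb @ text_emb.T / temperature

    # Matched pairs sit on the diagonal
    n = logits.shape[0]
    labels = np.arange(n)

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the image-to-text and text-to-image directions
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Minimizing this loss pulls matching image and text embeddings together and pushes mismatched pairs apart, producing the shared cross-modal embedding space that makes zero-shot retrieval and classification possible.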