The Grand AI Handbook

Large Multimodal Models

Survey large-scale models that integrate multiple modalities, such as text and images, within a single architecture to handle tasks across them.

GPT-4, LLaVA, Flamingo, unified architectures, cross-modal reasoning, multimodal benchmarks