The Grand AI Handbook

Core and Extended Vision Tasks

Chapter 20: Image Classification (Benchmarks: ImageNet, CIFAR; multi-label classification)
Chapter 21: Object Detection (YOLO, SSD, RetinaNet, DETR, CenterNet, FCOS)
Chapter 22: Semantic Segmentation (FCN, U-Net, DeepLab, HRNet, SegFormer)
Chapter 23: Instance and Panoptic Segmentation (Mask R-CNN, Panoptic FPN, SOLO, PointRend)
Chapter 24: Pose Estimation (2D/3D human pose, OpenPose, DensePose, animal pose)
Chapter 25: Optical Character Recognition (OCR) (Tesseract, CRNN, EAST, Transformer-based OCR)
Chapter 26: Image Retrieval (Content-based retrieval, hashing, Siamese networks)
Chapter 27: Face Recognition and Metric Learning (FaceNet, ArcFace, CosFace, SphereFace, triplet loss)
Chapter 28: Scene Understanding (Scene classification, object relationships, layout estimation)
Chapter 29: Anomaly Detection (One-class SVM, autoencoders, reconstruction-based methods)