The Grand AI Handbook

Generative AI Handbook

A comprehensive exploration of generative AI, from foundational concepts to cutting-edge techniques.

This handbook is inspired by Generative AI Handbook: A Roadmap for Learning Resources. All credit for the original content goes to William Brown. I've updated certain sections, added resources, and restructured some content to enhance the learning experience while preserving the core educational value of the original work.

Note: This handbook is regularly updated to reflect the rapid developments in generative AI. Each section builds upon previous concepts, creating a cohesive learning path from basic principles to advanced applications.

Handbook Sections

Section I: Foundation: Statistical Prediction and ML

Goal: Recap machine learning basics + survey (non-DL) methods for tasks under the umbrella of "sequential prediction".

Read section →

Section II: Neural Networks and Deep Learning

Goal: Survey deep learning methods + applications to sequential and language modeling, up to basic Transformers.

Read section →

Section III: LLM Architecture and Training

Goal: Survey central topics related to training LLMs, with an emphasis on conceptual primitives.

Read section →

Section IV: Finetuning and Alignment

Goal: Survey techniques used for improving and "aligning" the quality of LLM outputs after pretraining.

Read section →

Section V: Applications and Interpretability

Goal: Survey how LLMs are used and evaluated in practice, beyond just "chatbots".

Read section →

Section VI: Inference Optimization

Goal: Survey architecture choices and lower-level techniques for improving resource utilization (time, compute, memory).

Read section →

Section VII: Addressing the Quadratic Scaling Problem

Goal: Survey approaches for avoiding the "quadratic scaling problem" faced by self-attention in Transformers.

Read section →

Section VIII: Beyond Transformers: Other Generative Models

Goal: Survey topics building towards generation of non-sequential content like images, from GANs to diffusion models.

Read section →

Section IX: Multimodal Models

Goal: Survey how models can use multiple modalities of input and output (text, audio, images) simultaneously.

Read section →

Learning Path

  • Begin with the foundational concepts in statistical prediction and machine learning
  • Progress through neural network approaches and transformer architectures
  • Explore specialized techniques for training, finetuning, and optimizing language models
  • Examine applications, evaluation methods, and performance optimizations
  • Discover advanced topics like sub-quadratic scaling, diffusion models, and multimodal approaches