The Grand AI Handbook

LLM Prompting

A survey of prompting strategies for optimizing large language model performance.

This section surveys various prompting strategies designed to elicit optimal performance from Large Language Models (LLMs). Prompting is the art and science of crafting effective inputs (prompts) to guide LLMs toward desired outputs. As LLMs have grown in capability, sophisticated prompting techniques have evolved beyond simple instructions, enabling complex reasoning, interaction, and refinement. Effective **prompt engineering** is crucial for harnessing the full potential of these models across diverse tasks, from simple Q&A to multi-step problem-solving and agentic behavior.

Basic Prompting Strategies

These foundational techniques rely on the inherent pattern-completion abilities of LLMs learned during pre-training.

Zero-Shot Prompting

Zero-shot prompting asks the LLM to perform a task without providing any in-context examples: the prompt contains only the instruction or question. Success relies entirely on the model's pre-trained knowledge and its ability to generalize to the task.

Example: "Translate the following English text to French: 'Hello, world!'"
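A minimal sketch of how this looks in code, assuming a hypothetical `llm(prompt)` helper that stands in for whatever chat-completion client you use (the helper is a placeholder, not a specific API):

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to any LLM completion/chat API.
    Replace the body with your provider's client call."""
    raise NotImplementedError("wire up your model client here")

# Zero-shot: the prompt contains only the instruction, no examples.
prompt = "Translate the following English text to French: 'Hello, world!'"
print(llm(prompt))
```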

Few-Shot Prompting

Few-shot prompting provides the LLM with a small number (typically 1 to 5) of examples (shots) demonstrating the desired input-output format or task execution within the prompt itself. This helps the model understand the task context and expected output style better than zero-shot prompting.

Example:
Instruction: Provide the antonym.
Input: happy → Output: sad
Input: fast → Output: slow
Input: tall → Output: short
Input: hot → Output: cold
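A sketch of building such a few-shot prompt programmatically, again assuming a hypothetical `llm(prompt)` placeholder; here the first three pairs serve as demonstrations and the model is left to complete the last one:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

# Demonstrations teach the format; the final input is left for the model.
examples = [("happy", "sad"), ("fast", "slow"), ("tall", "short")]

prompt = "Instruction: Provide the antonym.\n"
for word, antonym in examples:
    prompt += f"Input: {word}\nOutput: {antonym}\n"
prompt += "Input: hot\nOutput:"   # model should complete with "cold"

print(llm(prompt))
```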

Enhancing Reasoning and Decomposition

These advanced techniques guide the LLM through more complex problem-solving processes, often breaking down tasks or encouraging step-by-step thinking.

Chain-of-Thought (CoT)

Chain-of-Thought prompting encourages the LLM to generate intermediate reasoning steps before providing the final answer, mimicking a human thought process. This is typically achieved by including few-shot examples where the reasoning steps are explicitly shown. It significantly improves performance on tasks requiring arithmetic, commonsense, or symbolic reasoning.

Example Prompt Snippet:
"... Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 2 * 3 = 6 balls. So he has 5 + 6 = 11 balls. The answer is 11."
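In code, CoT usually just means the few-shot exemplars include worked reasoning before the final answer. A minimal sketch, reusing the hypothetical `llm(prompt)` placeholder:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

# One worked exemplar showing reasoning steps, followed by a new question.
cot_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "2 * 3 = 6 balls. So he has 5 + 6 = 11 balls. The answer is 11.\n\n"
)
question = "Q: A baker has 3 trays of 12 muffins and sells 10. How many are left?\nA:"

# Expect step-by-step reasoning ending in "The answer is 26."
print(llm(cot_exemplar + question))
```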

Self-Consistency

Self-Consistency improves upon CoT by sampling multiple diverse reasoning paths (chains of thought) for the same prompt and then selecting the most consistent answer through a majority vote. This makes the final output more robust and less sensitive to any single flawed reasoning path.
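A sketch of self-consistency on top of CoT: sample several reasoning chains at a non-zero temperature, extract each final answer, and take a majority vote. The `llm` placeholder and the answer-extraction pattern (chains ending in "The answer is ...") are assumptions about your setup:

```python
import re
from collections import Counter

def llm(prompt: str, temperature: float = 0.7) -> str:
    """Placeholder for a sampling LLM call (temperature > 0 yields diverse chains)."""
    raise NotImplementedError

def extract_answer(chain: str) -> str | None:
    # Assumes chains end with "The answer is <number>." as in the CoT exemplars.
    match = re.search(r"The answer is\s*([-\d.,]+)", chain)
    return match.group(1).rstrip(".,") if match else None

def self_consistent_answer(prompt: str, samples: int = 5) -> str | None:
    chains = [llm(prompt, temperature=0.7) for _ in range(samples)]
    answers = [a for a in (extract_answer(c) for c in chains) if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None
```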

Least-to-Most Prompting

This strategy involves breaking down a complex problem into a sequence of simpler subproblems. The LLM is prompted to solve each subproblem sequentially, often using the solution of the previous step as input for the next. This guides the model from easier steps to the final complex solution, improving performance on tasks requiring multi-step procedures.
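A sketch of the two stages: first ask the model to decompose the problem, then solve the subproblems in order, feeding earlier answers into later prompts. The `llm` placeholder and the one-subproblem-per-line decomposition format are assumptions:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

def least_to_most(problem: str) -> str:
    # Stage 1: decompose into simpler subproblems, one per line (assumed format).
    decomposition = llm(
        "Break the following problem into a list of simpler subproblems, "
        f"one per line, ordered from easiest to hardest:\n{problem}"
    )
    subproblems = [line for line in decomposition.splitlines() if line.strip()]

    # Stage 2: solve subproblems sequentially, carrying prior answers as context.
    context = f"Problem: {problem}\n"
    answer = ""
    for sub in subproblems:
        answer = llm(f"{context}\nNow answer: {sub}")
        context += f"\n{sub}\nAnswer: {answer}\n"
    return answer  # the last subproblem's answer resolves the original problem
```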

Task Decomposition

Similar to Least-to-Most, task decomposition is the general principle of breaking a large task into smaller, manageable steps or sub-tasks. This can be implemented through specific prompting strategies like Least-to-Most or by designing multi-turn interactions where the LLM focuses on one part of the problem at a time. It's a core concept in effective prompt engineering for complex problems.
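One way to implement decomposition in a multi-turn setting is to keep a running conversation and give the model one sub-task per turn. A sketch, assuming a hypothetical `chat(messages)` helper over a list of role/content dicts:

```python
def chat(messages: list[dict]) -> str:
    """Placeholder for a chat-style LLM call taking [{'role': ..., 'content': ...}, ...]."""
    raise NotImplementedError

# Sub-tasks defined up front (they could also come from a decomposition prompt).
subtasks = [
    "List the key requirements stated in the user request.",
    "Draft an outline that covers each requirement.",
    "Write the final answer following the outline.",
]

messages = [{"role": "user", "content": "Request: write a short release note for v2.0."}]
for step in subtasks:
    messages.append({"role": "user", "content": step})
    reply = chat(messages)                      # model handles one sub-task per turn
    messages.append({"role": "assistant", "content": reply})

print(messages[-1]["content"])  # output of the final step
```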

Tree of Thoughts (ToT)

Tree of Thoughts extends CoT by enabling the LLM to explore multiple reasoning paths simultaneously, forming a tree structure. At each step, the model can generate several possible intermediate thoughts or steps. These paths are evaluated (e.g., using the LLM itself or heuristics), and promising paths are explored further, allowing for backtracking and exploration of alternatives, akin to planning or search algorithms. This is particularly useful for tasks where exploration or creative problem-solving is needed.
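A heavily simplified sketch of ToT as a beam search: at each step, propose several candidate next thoughts, score them (here with the model itself), and keep only the most promising paths. The `llm` placeholder, the proposal/scoring prompts, and the numeric-score parsing are all assumptions:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

def propose(path: str, problem: str, k: int = 3) -> list[str]:
    out = llm(f"Problem: {problem}\nReasoning so far:\n{path}\n"
              f"Propose {k} different possible next steps, one per line.")
    return [line.strip() for line in out.splitlines() if line.strip()][:k]

def score(path: str, problem: str) -> float:
    out = llm(f"Problem: {problem}\nPartial reasoning:\n{path}\n"
              "Rate how promising this reasoning is from 0 to 10. Reply with a number only.")
    try:
        return float(out.strip())
    except ValueError:
        return 0.0

def tree_of_thoughts(problem: str, depth: int = 3, beam: int = 2) -> str:
    paths = [""]
    for _ in range(depth):
        candidates = [p + "\n" + t for p in paths for t in propose(p, problem)]
        candidates.sort(key=lambda p: score(p, problem), reverse=True)
        paths = candidates[:beam]          # keep only the most promising branches
    return llm(f"Problem: {problem}\nReasoning:\n{paths[0]}\nGive the final answer.")
```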

Program of Thoughts (PoT)

Program of Thoughts leverages the LLM's coding capabilities by prompting it to generate code (e.g., Python) as the reasoning steps. The generated code can be executed by an interpreter to compute the final answer. This offloads complex calculations (like arithmetic or symbolic manipulation) to the reliable code execution environment, improving accuracy and robustness for quantitative reasoning tasks.
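A sketch of PoT: ask the model to answer with Python code, then execute that code to obtain the result. Executing model-generated code is unsafe outside a sandbox; this toy version uses `exec` purely for illustration, and the prompt format (code that assigns a variable named `answer`) is an assumption:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

def program_of_thoughts(question: str):
    code = llm(
        "Answer the question by writing Python code only. "
        "Store the final result in a variable named `answer`.\n"
        f"Question: {question}"
    )
    namespace: dict = {}
    # WARNING: exec on untrusted model output is for illustration only;
    # real systems should run generated code in a sandboxed interpreter.
    exec(code, namespace)
    return namespace.get("answer")

# e.g. program_of_thoughts("What is 17% of 2,340 plus 58?")
```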

Interactive and Agentic Prompting

These techniques involve more dynamic interaction, potentially using external tools or iterative refinement cycles.

ReAct (Reason + Act)

ReAct prompting enables LLMs to act as agents by interleaving reasoning steps (thought) with actions (e.g., using external tools like search engines or calculators). The model generates a thought about what to do next, then an action to take, receives an observation from the environment or tool, and repeats the cycle. This allows LLMs to gather external information or perform computations to solve tasks requiring up-to-date knowledge or precise calculations.
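A sketch of the ReAct loop with a single calculator tool. The Thought/Action/Observation format, the `llm` placeholder, and the `calculator[...]` / `finish[...]` action syntax are assumptions about how the prompt is set up:

```python
import re

def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

def calculator(expression: str) -> str:
    # Toy tool: evaluate a simple arithmetic expression.
    return str(eval(expression, {"__builtins__": {}}, {}))

REACT_INSTRUCTIONS = (
    "Solve the task by alternating lines of the form:\n"
    "Thought: <your reasoning>\n"
    "Action: calculator[<expression>]  OR  Action: finish[<final answer>]\n"
)

def react(task: str, max_steps: int = 5) -> str:
    transcript = f"{REACT_INSTRUCTIONS}\nTask: {task}\n"
    for _ in range(max_steps):
        step = llm(transcript)                       # model emits Thought + Action
        transcript += step + "\n"
        done = re.search(r"Action:\s*finish\[(.*)\]", step)
        if done:
            return done.group(1)
        call = re.search(r"Action:\s*calculator\[(.*)\]", step)
        if call:
            observation = calculator(call.group(1))  # run the tool
            transcript += f"Observation: {observation}\n"
    return "No answer within step limit."
```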

Self-Refine

Self-Refine is an iterative approach where the LLM generates an initial output, then critiques its own output based on given criteria or goals, and finally refines the output based on the critique. This generation-critique-refinement loop can be repeated multiple times, leading to higher-quality outputs by allowing the model to correct its own mistakes or improve clarity and coherence.
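A sketch of the generate-critique-refine loop; the `llm` placeholder, the critique prompt, and the "LGTM" stopping signal are assumptions:

```python
def llm(prompt: str) -> str:
    """Placeholder for any LLM completion call."""
    raise NotImplementedError

def self_refine(task: str, max_rounds: int = 3) -> str:
    draft = llm(task)                                   # initial generation
    for _ in range(max_rounds):
        critique = llm(
            f"Task: {task}\nDraft:\n{draft}\n"
            "Critique this draft for correctness, clarity, and completeness. "
            "If it needs no changes, reply with exactly 'LGTM'."
        )
        if critique.strip() == "LGTM":
            break
        draft = llm(                                    # refine using the critique
            f"Task: {task}\nDraft:\n{draft}\nCritique:\n{critique}\n"
            "Rewrite the draft, addressing every point in the critique."
        )
    return draft
```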

Addressing Specific Challenges

Research also focuses on understanding and mitigating limitations through prompting or analysis.

Measuring and Narrowing the Compositionality Gap

Compositionality refers to the ability to understand and produce novel combinations of familiar concepts. While LLMs excel at many tasks, they sometimes struggle with complex compositional instructions (e.g., "Find the largest city in the country whose capital is Paris"). Research in this area focuses on designing benchmarks to measure this gap and developing prompting strategies (often involving decomposition or structured reasoning like CoT or Least-to-Most) to improve compositional generalization.

Key Takeaways

  • Prompting is fundamental to interacting with and controlling LLMs.
  • Basic techniques like zero-shot and few-shot prompting leverage the model's pre-trained knowledge.
  • Advanced strategies like Chain-of-Thought, Self-Consistency, Least-to-Most, Tree of Thoughts, and Program of Thoughts enhance complex reasoning and problem-solving.
  • Agentic approaches like ReAct enable LLMs to interact with external tools and environments.
  • Iterative methods like Self-Refine allow models to improve their own outputs.
  • Ongoing research addresses challenges like compositionality through better prompting and evaluation.
  • Effective prompt engineering combines these techniques to optimize LLM performance for specific tasks.