LLM Prompting
A survey of prompting strategies for optimizing large language model performance.
Basic Prompting Strategies
These foundational techniques rely on the inherent pattern-completion abilities of LLMs learned during pre-training.
Zero-Shot Prompting
Zero-shot prompting directly asks the LLM to perform a task without providing any examples; the prompt contains only the instruction or question. It relies entirely on the model's pre-trained knowledge and ability to generalize.
Example: "Translate the following English text to French: 'Hello, world!'"
Few-Shot Prompting
Few-shot prompting provides the LLM with a small number (typically 1 to 5) of examples (shots) demonstrating the desired input-output format or task execution within the prompt itself. This helps the model understand the task context and expected output style better than zero-shot prompting.
Example:
Instruction: Provide the antonym.
Input: happy → Output: sad
Input: fast → Output: slow
Input: tall → Output: short
Input: hot → Output: cold
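To make the mechanics concrete, here is a minimal sketch of assembling zero-shot and few-shot prompts in code. The `llm` callable is a placeholder for whatever completion client you use; it is an assumption, not part of any specific library.

```python
from typing import Callable

def zero_shot_prompt(instruction: str, query: str) -> str:
    # No demonstrations: just the instruction and the input to complete.
    return f"{instruction}\nInput: {query}\nOutput:"

def few_shot_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    # Prepend worked input/output pairs so the model can infer the format.
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n{demos}\nInput: {query}\nOutput:"

def answer(llm: Callable[[str], str], prompt: str) -> str:
    # `llm` is any text-in/text-out completion function you supply.
    return llm(prompt).strip()

antonym_examples = [("happy", "sad"), ("fast", "slow"), ("tall", "short")]
prompt = few_shot_prompt("Provide the antonym.", antonym_examples, "hot")
# answer(my_llm, prompt) would be expected to return "cold".
```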
Key Resources for Basic Prompting
- Paper (Introduced Zero/Few-Shot): Language Models are Few-Shot Learners (GPT-3 Paper) by Brown et al. (2020)
- Guide: Prompting Guide: Zero-Shot Prompting
- Guide: Prompting Guide: Few-Shot Prompting
- Course: ChatGPT Prompt Engineering for Developers by DeepLearning.AI & OpenAI
- Blog Post: OpenAI API Introduction (Discusses prompt design)
Enhancing Reasoning and Decomposition
These advanced techniques guide the LLM through more complex problem-solving processes, often breaking down tasks or encouraging step-by-step thinking.
Chain-of-Thought (CoT)
Chain-of-Thought prompting encourages the LLM to generate intermediate reasoning steps before providing the final answer, mimicking a human thought process. This is typically achieved by including few-shot examples where the reasoning steps are explicitly shown. It significantly improves performance on tasks requiring arithmetic, commonsense, or symbolic reasoning.
Example Prompt Snippet: "... Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now? A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 2 * 3 = 6 balls. So he has 5 + 6 = 11 balls. The answer is 11."
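A small sketch of how such a few-shot CoT prompt might be built programmatically, reusing the worked example above. The `llm` callable is an assumed placeholder for your completion client.

```python
from typing import Callable

COT_DEMO = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "2 * 3 = 6 balls. So he has 5 + 6 = 11 balls. The answer is 11.\n"
)

def cot_prompt(question: str) -> str:
    # The worked example shows the model *how* to reason, not just the answer.
    return f"{COT_DEMO}\nQ: {question}\nA:"

def solve_with_cot(llm: Callable[[str], str], question: str) -> str:
    # The completion contains reasoning steps followed by "The answer is ...".
    return llm(cot_prompt(question))
```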
Self-Consistency
Self-Consistency improves upon CoT by sampling multiple diverse reasoning paths (chains of thought) for the same prompt and then selecting the most consistent answer through a majority vote. This makes the final output more robust and less sensitive to any single flawed reasoning path.
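A sketch of the sampling-and-voting loop. It assumes the `llm` callable accepts a `temperature` argument, and the answer-extraction regex is an illustrative assumption tied to the "The answer is ..." format of the CoT demo above.

```python
import re
from collections import Counter
from typing import Callable

def extract_answer(completion: str) -> str | None:
    # Assumes the CoT format above, which ends with "The answer is <number>."
    match = re.search(r"The answer is\s+([-\d.,]+)", completion)
    return match.group(1).rstrip(".,") if match else None

def self_consistent_answer(
    llm: Callable[..., str], prompt: str, num_samples: int = 10
) -> str | None:
    # Sample diverse reasoning paths (temperature > 0), then majority-vote answers.
    answers = []
    for _ in range(num_samples):
        completion = llm(prompt, temperature=0.7)
        answer = extract_answer(completion)
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]
```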
Least-to-Most Prompting
This strategy involves breaking down a complex problem into a sequence of simpler subproblems. The LLM is prompted to solve each subproblem sequentially, often using the solution of the previous step as input for the next. This guides the model from easier steps to the final complex solution, improving performance on tasks requiring multi-step procedures.
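A two-stage sketch under the same `llm`-callable assumption: first ask the model to decompose the problem, then solve the subproblems in order, carrying earlier answers forward as context. The prompt wording and the numbered-list parsing are illustrative assumptions.

```python
from typing import Callable

def decompose(llm: Callable[[str], str], problem: str) -> list[str]:
    # Stage 1: ask the model to list simpler subquestions, one per line.
    prompt = (
        "Break the following problem into a numbered list of simpler "
        f"subquestions, easiest first:\n{problem}"
    )
    lines = llm(prompt).splitlines()
    return [line.split(".", 1)[-1].strip() for line in lines if line.strip()]

def solve_least_to_most(llm: Callable[[str], str], problem: str) -> str:
    # Stage 2: solve subquestions in order, feeding earlier answers as context.
    context = f"Problem: {problem}\n"
    answer = ""
    for sub in decompose(llm, problem):
        answer = llm(f"{context}Subquestion: {sub}\nAnswer:").strip()
        context += f"Subquestion: {sub}\nAnswer: {answer}\n"
    return answer  # answer to the final (hardest) subquestion
```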
Task Decomposition
Similar to Least-to-Most, task decomposition is the general principle of breaking a large task into smaller, manageable steps or sub-tasks. This can be implemented through specific prompting strategies like Least-to-Most or by designing multi-turn interactions where the LLM focuses on one part of the problem at a time. It's a core concept in effective prompt engineering for complex problems.
Tree of Thoughts (ToT)
Tree of Thoughts extends CoT by enabling the LLM to explore multiple reasoning paths simultaneously, forming a tree structure. At each step, the model can generate several possible intermediate thoughts or steps. These paths are evaluated (e.g., using the LLM itself or heuristics), and promising paths are explored further, allowing for backtracking and exploration of alternatives, akin to planning or search algorithms. This is particularly useful for tasks where exploration or creative problem-solving is needed.
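A minimal breadth-limited search sketch in this spirit: at each depth, generate candidate thoughts, score them with the model itself, and keep only the most promising states. The proposal and scoring prompts, and the numeric-score parsing, are illustrative assumptions rather than the paper's exact procedure.

```python
from typing import Callable

def propose_thoughts(llm: Callable[[str], str], state: str, k: int) -> list[str]:
    # Ask for k candidate next steps given the partial reasoning so far.
    prompt = f"{state}\nPropose {k} distinct possible next steps, one per line."
    return [t.strip() for t in llm(prompt).splitlines() if t.strip()][:k]

def score_thought(llm: Callable[[str], str], state: str) -> float:
    # Use the model itself as a heuristic evaluator; expects a number 0-10.
    reply = llm(f"{state}\nRate how promising this partial solution is from 0 to 10:")
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0

def tree_of_thoughts(
    llm: Callable[[str], str], problem: str, depth: int = 3, breadth: int = 3
) -> str:
    # Breadth-limited search: expand, evaluate, keep only the best states.
    frontier = [problem]
    for _ in range(depth):
        candidates = [
            f"{state}\n{thought}"
            for state in frontier
            for thought in propose_thoughts(llm, state, breadth)
        ]
        if not candidates:
            break
        candidates.sort(key=lambda s: score_thought(llm, s), reverse=True)
        frontier = candidates[:breadth]
    return frontier[0]  # most promising reasoning path found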
Program of Thoughts (PoT)
Program of Thoughts leverages the LLM's coding capabilities by prompting it to generate code (e.g., Python) as the reasoning steps. The generated code can be executed by an interpreter to compute the final answer. This offloads complex calculations (like arithmetic or symbolic manipulation) to the reliable code execution environment, improving accuracy and robustness for quantitative reasoning tasks.
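A sketch of the generate-then-execute pattern, again assuming a placeholder `llm` callable; the code-fence stripping and the convention of storing the result in a variable named `answer` are illustrative assumptions.

```python
import re
from typing import Callable

def program_of_thoughts(llm: Callable[[str], str], question: str) -> object:
    # Ask the model to reason in code, storing its result in `answer`.
    prompt = (
        "Write Python code that computes the answer to the question below "
        f"and assigns it to a variable named `answer`.\nQuestion: {question}"
    )
    completion = llm(prompt)
    # Strip an optional markdown code fence, if the model added one.
    match = re.search(r"```(?:python)?\n(.*?)```", completion, re.DOTALL)
    code = match.group(1) if match else completion
    namespace: dict[str, object] = {}
    # NOTE: executing model-generated code is unsafe outside a sandbox.
    exec(code, namespace)
    return namespace.get("answer")
```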
Key Resources for Reasoning & Decomposition
- Paper (CoT): Chain-of-Thought Prompting Elicits Reasoning in Large Language Models by Wei et al. (2022)
- Paper (Self-Consistency): Self-Consistency Improves Chain of Thought Reasoning in Language Models by Wang et al. (2022)
- Paper (Least-to-Most): Least-to-Most Prompting Enables Complex Reasoning in Large Language Models by Zhou et al. (2022)
- Paper (ToT): Tree of Thoughts: Deliberate Problem Solving with Large Language Models by Yao et al. (2023)
- Paper (PoT): Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks by Chen et al. (2022)
- Guide: Prompting Guide: Chain-of-Thought
- Blog Post: Google AI Blog on CoT and Reasoning
- Blog Post: Tree of Thoughts Project Page
Interactive and Agentic Prompting
These techniques involve more dynamic interaction, potentially using external tools or iterative refinement cycles.
ReAct (Reason + Act)
ReAct prompting enables LLMs to act as agents by interleaving reasoning steps (thought) with actions (e.g., using external tools like search engines or calculators). The model generates a thought about what to do next, then an action to take, receives an observation from the environment or tool, and repeats the cycle. This allows LLMs to gather external information or perform computations to solve tasks requiring up-to-date knowledge or precise calculations.
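A sketch of the thought/action/observation loop with a tiny tool registry. The `Action: tool[input]` convention, the prompt wording, and the `llm` callable are all illustrative assumptions; real tools would wrap a search API, calculator, and so on.

```python
import re
from typing import Callable

def react_loop(
    llm: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    # The transcript accumulates Thought / Action / Observation lines each turn.
    transcript = (
        "Answer the question by interleaving Thought, Action, and Observation "
        "lines. Available actions: "
        + ", ".join(f"{name}[input]" for name in tools)
        + ", finish[answer].\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        step = llm(transcript + "Thought:")
        transcript += "Thought:" + step + "\n"
        action = re.search(r"Action:\s*(\w+)\[(.*?)\]", step)
        if not action:
            continue
        name, arg = action.group(1), action.group(2)
        if name == "finish":
            return arg
        observation = tools.get(name, lambda _: "unknown tool")(arg)
        transcript += f"Observation: {observation}\n"
    return "no answer within step budget"
```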
Self-Refine
Self-Refine is an iterative approach where the LLM generates an initial output, then critiques its own output based on given criteria or goals, and finally refines the output based on the critique. This generation-critique-refinement loop can be repeated multiple times, leading to higher-quality outputs by allowing the model to correct its own mistakes or improve clarity and coherence.
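A sketch of the generate-critique-refine loop under the same `llm`-callable assumption; the "NO ISSUES" stopping criterion and prompt wording are illustrative choices, not a fixed recipe.

```python
from typing import Callable

def self_refine(
    llm: Callable[[str], str], task: str, max_rounds: int = 3
) -> str:
    # 1. Initial generation.
    draft = llm(task)
    for _ in range(max_rounds):
        # 2. Ask the model to critique its own output against the task.
        critique = llm(
            f"Task: {task}\nDraft: {draft}\n"
            "List concrete problems with the draft, or reply 'NO ISSUES'."
        )
        if "NO ISSUES" in critique.upper():
            break
        # 3. Revise the draft using the critique as feedback.
        draft = llm(
            f"Task: {task}\nDraft: {draft}\nFeedback: {critique}\n"
            "Rewrite the draft so that it addresses the feedback."
        )
    return draft
```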
Key Resources for Interactive & Agentic Prompting
- Paper (ReAct): ReAct: Synergizing Reasoning and Acting in Language Models by Yao et al. (2022)
- Paper (Self-Refine): Self-Refine: Iterative Refinement with Self-Feedback by Madaan et al. (2023)
- Guide: Prompting Guide: ReAct
- Blog Post: Google AI Blog on ReAct
- Project Page: Self-Refine Project Page
Addressing Specific Challenges
Research also focuses on understanding and mitigating specific limitations of LLMs, both through new prompting strategies and through targeted measurement and analysis.
Measuring and Narrowing the Compositionality Gap
Compositionality refers to the ability to understand and produce novel combinations of familiar concepts. While LLMs excel at many tasks, they sometimes struggle with complex compositional instructions (e.g., "Find the largest city in the country whose capital is Paris"). Research in this area focuses on designing benchmarks to measure this gap and developing prompting strategies (often involving decomposition or structured reasoning like CoT or Least-to-Most) to improve compositional generalization.
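A hedged sketch of one such decomposition-style strategy, applied to the example query above: the model is repeatedly asked for a simpler follow-up question, answers it, and only then answers the original question. The prompt wording, the fixed hop budget, and the `llm` callable are illustrative assumptions in the spirit of Least-to-Most-style decomposition, not the exact method of any one paper.

```python
from typing import Callable

def answer_compositional(llm: Callable[[str], str], question: str) -> str:
    # Decompose a multi-hop question into follow-ups answered one at a time.
    context = f"Question: {question}\n"
    for _ in range(3):  # small fixed budget of follow-up hops
        follow_up = llm(
            context + "What simpler follow-up question should be answered "
            "next? Reply 'NONE' if the original question can now be answered."
        ).strip()
        if follow_up.upper().startswith("NONE"):
            break
        sub_answer = llm(context + f"Follow-up: {follow_up}\nAnswer:").strip()
        context += f"Follow-up: {follow_up}\nAnswer: {sub_answer}\n"
    return llm(context + "Final answer to the original question:").strip()

# For "Find the largest city in the country whose capital is Paris", the hops
# would be: capital Paris -> France, then largest city of France -> Paris.
```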
Key Resources for Compositionality
- Paper (Example Study): Measuring and Narrowing the Compositionality Gap in Language Models by Press et al. (2022)
- Workshop/Survey: Resources on Compositional Generalization (Often includes relevant papers/benchmarks)
- Blog Post: Google Research Blog on Compositionality (Contextual)
Key Takeaways
- Prompting is fundamental to interacting with and controlling LLMs.
- Basic techniques like zero-shot and few-shot prompting leverage the model's pre-trained knowledge.
- Advanced strategies like Chain-of-Thought, Self-Consistency, Least-to-Most, Tree of Thoughts, and Program of Thoughts enhance complex reasoning and problem-solving.
- Agentic approaches like ReAct enable LLMs to interact with external tools and environments.
- Iterative methods like Self-Refine allow models to improve their own outputs.
- Ongoing research addresses challenges like compositionality through better prompting and evaluation.
- Effective prompt engineering combines these techniques to optimize LLM performance for specific tasks.