Chain of Thought Prompting

by Stephen M. Walker II, Co-Founder / CEO

What is Chain of Thought?

Chain of Thought (CoT) prompting is a technique used to enhance the reasoning capabilities of large language models (LLMs). Introduced by Wei et al. in 2022, it guides the LLM to think step by step by including few-shot exemplars in the prompt that spell out the reasoning process. The model is then expected to follow a similar chain of thought when answering the new question. This approach is particularly effective for complex tasks that require a series of reasoning steps.

CoT prompting works by prompting the model to produce intermediate reasoning steps before giving the final answer to a problem. The idea is that a model-generated chain of thought would mimic an intuitive thought process when solving a problem. This method does not require a large training dataset or modifying the language model's weights.
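As a rough illustration of what such a prompt can look like, the sketch below assembles a few-shot CoT prompt as plain text. The function name, the "Q:/A:" layout, and the bookshelf exemplar are assumptions made up for this example; they are not a required format and are not taken from any particular library.

```python
def build_few_shot_cot_prompt(exemplars, question):
    """Join (question, reasoning, answer) exemplars and a new question into one prompt.

    Each exemplar shows its intermediate reasoning before the final answer,
    which is what distinguishes a CoT exemplar from a plain Q/A pair.
    """
    parts = []
    for ex_question, reasoning, answer in exemplars:
        parts.append(f"Q: {ex_question}\nA: {reasoning} The answer is {answer}.")
    # The new question ends with an open "A:" so the model continues from there.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# A made-up exemplar, used only for illustration.
exemplars = [
    (
        "A shelf holds 3 boxes with 4 books each. 5 books are removed. "
        "How many books remain?",
        "3 boxes of 4 books is 3 * 4 = 12 books. Removing 5 leaves 12 - 5 = 7.",
        "7",
    ),
]

prompt = build_few_shot_cot_prompt(
    exemplars,
    "A crate holds 6 trays with 5 eggs each. 8 eggs break. How many are intact?",
)
print(prompt)
```

The resulting string would be sent to the model as-is; no training data or weight updates are involved, only the prompt changes.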

In Wei et al.'s experiments on three large language models, CoT prompting improved performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. For instance, prompting a 540B-parameter language model with just eight chain-of-thought exemplars achieved what was then state-of-the-art accuracy on the GSM8K benchmark of math word problems, surpassing even fine-tuned GPT-3 with a verifier.

The success of CoT prompting has also inspired variants such as "Tree-of-Thought" and "Graph-of-Thought".

How does Chain of Thought Prompting work?

Chain of Thought (CoT) prompting is a method that improves the reasoning of large language models (LLMs) by guiding them through a structured thought process. This technique involves providing a model with an example that demonstrates how to approach a similar problem step by step. The model then applies this structured reasoning to new prompts, which is especially beneficial for complex tasks requiring arithmetic, commonsense, and symbolic reasoning.

CoT prompting comes in various forms. Multimodal CoT combines text and visual inputs. Least-to-most prompting breaks a complex problem into simpler subproblems and solves them in sequence. Active prompting prioritizes the questions the model is most uncertain about for detailed human annotation; the annotated examples are then used as exemplars to improve the model's responses to new questions.

A notable variant is zero-shot CoT, which adds the phrase "Let's think step by step" to a prompt, aiding the model when few examples are available. CoT prompting is most effective with models that have around 100 billion parameters or more; smaller models may not generate coherent reasoning sequences, leading to less accurate results. The gains grow with model size, with larger models showing more substantial improvements in performance.

Understanding Chain-of-Thought Prompting

CoT prompting is particularly effective for complex tasks that require a series of reasoning steps before a response can be generated. It has been found to significantly improve the model's performance on tasks that require arithmetic, commonsense, and symbolic reasoning.

For instance, consider a task where the model is asked to determine whether the odd numbers in a group add up to an even number. A CoT prompt would guide the model to first identify the odd numbers, then add them up, and finally determine whether the sum is even or odd.
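The three steps described above can be written out directly. This is a sketch of the reasoning the prompt is meant to elicit, not of how a model computes internally; the function name and the sample numbers are chosen for illustration.

```python
def odd_sum_is_even(numbers):
    """Follow the three reasoning steps and record each one as text."""
    # Step 1: identify the odd numbers.
    odds = [n for n in numbers if n % 2 != 0]
    # Step 2: add them up.
    total = sum(odds)
    # Step 3: decide whether the sum is even or odd.
    is_even = total % 2 == 0
    steps = [
        f"The odd numbers are {odds}.",
        f"Their sum is {total}.",
        f"{total} is {'even' if is_even else 'odd'}.",
    ]
    return is_even, steps


is_even, steps = odd_sum_is_even([4, 8, 9, 15, 12, 2, 1])
print(" ".join(steps))
# The odd numbers are [9, 15, 1]. Their sum is 25. 25 is odd.
```

A CoT exemplar for this task would contain text much like the printed line, followed by the final answer (here, that the statement is false).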

Zero-Shot CoT Prompting

CoT prompting can also be used in a zero-shot setting, where no worked examples are supplied. This involves appending a phrase like "Let's think step by step" to the original prompt. The same phrase can also be combined with few-shot prompting. This simple addition has been found to improve the model's performance on tasks where there are not many examples to use in the prompt.
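A minimal sketch of the zero-shot variant follows. The helper name, the "Q:/A:" layout, and the optional `few_shot_prefix` parameter are assumptions for illustration; only the trigger phrase comes from the text above.

```python
ZERO_SHOT_TRIGGER = "Let's think step by step."


def build_zero_shot_cot_prompt(question, few_shot_prefix=""):
    """Append the trigger phrase to a question, optionally after few-shot exemplars."""
    body = f"Q: {question}\nA: {ZERO_SHOT_TRIGGER}"
    if few_shot_prefix:
        # Combining with few-shot prompting: exemplars first, then the new question.
        return f"{few_shot_prefix}\n\n{body}"
    return body


zero_shot_prompt = build_zero_shot_cot_prompt(
    "I have 10 apples, give away 4, then buy 3 more. How many do I have?"
)
print(zero_shot_prompt)
```

Placing the phrase at the start of the answer slot, rather than in the question, is one common layout: the model's continuation then begins with its reasoning.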

Automatic Chain-of-Thought (Auto-CoT)

While CoT prompting can be effective, it often involves hand-crafting examples, which can be time-consuming and may lead to suboptimal solutions. To address this, researchers have proposed an approach known as Automatic Chain-of-Thought (Auto-CoT). This method leverages LLMs to generate reasoning chains for demonstrations automatically, thereby eliminating the need for manual effort.
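The core idea, letting the model write its own demonstrations, can be sketched as below. This is a simplified illustration under stated assumptions: `generate` is a hypothetical stand-in for whatever LLM call is available, `fake_generate` is a canned stub so the example runs offline, and the published method's step for choosing which questions to use as demonstrations is omitted here.

```python
from typing import Callable, List

TRIGGER = "Let's think step by step."


def auto_cot_demos(questions: List[str], generate: Callable[[str], str]) -> List[str]:
    """Build demonstrations by asking the model for a reasoning chain per question."""
    demos = []
    for question in questions:
        prompt = f"Q: {question}\nA: {TRIGGER}"
        chain = generate(prompt).strip()  # model-written reasoning, no hand-crafting
        demos.append(f"{prompt} {chain}")
    return demos


def build_prompt(demos: List[str], new_question: str) -> str:
    """Prepend the generated demonstrations to a new question."""
    return "\n\n".join(demos + [f"Q: {new_question}\nA: {TRIGGER}"])


def fake_generate(prompt: str) -> str:
    """Hypothetical stub standing in for a real LLM call."""
    return "There are 2 groups of 3, so 2 * 3 = 6. The answer is 6."


demos = auto_cot_demos(
    ["There are 2 bags with 3 pens each. How many pens?",
     "There are 2 nests with 3 eggs each. How many eggs?"],
    fake_generate,
)
final_prompt = build_prompt(demos, "There are 4 carts with 5 melons each. How many melons?")
print(final_prompt)
```

Because the chains are generated rather than checked by hand, some may contain mistakes; a real pipeline would likely filter or review them before use.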

Limitations and Future Research

Despite its advantages, CoT prompting has its limitations. For instance, smaller models have been found to produce illogical chains of thought, leading to lower accuracy than standard prompting. Future research will likely focus on refining this technique and exploring ways to make it more effective across a wider range of tasks and model sizes.

What is Chain of Thought reasoning and why does it work?

A chain of thought, or chain of reasoning, is a sequential process of understanding or decision-making that connects ideas or arguments in a structured manner. It begins with an initial thought, proceeds through a series of logically connected ideas, and ends with a final conclusion. This reasoning process includes analysis, evaluation, and synthesis of information, and it is fundamental to problem-solving, decision-making, and critical thinking. The strength of the chain depends on the quality and relevance of each link within it. Visual tools like flowcharts or diagrams are frequently used to illustrate a chain of thought.

Chain of Thought (CoT) reasoning enhances large language models' (LLMs) problem-solving by prompting them to detail intermediate steps logically before concluding. This method, akin to human problem-solving, breaks down complex tasks into smaller segments and has proven to boost LLMs' performance in arithmetic, commonsense, and symbolic reasoning tasks.

In practice, CoT prompting provides LLMs with an exemplar of the reasoning process for a similar problem, which the model then emulates for new prompts. While CoT prompting generally improves accuracy over standard prompting, its effectiveness is more pronounced in models with 100 billion parameters or more. Smaller models, by contrast, may generate less logical reasoning sequences, leading to reduced accuracy.


Chain-of-Thought prompting represents a significant advancement in the field of artificial intelligence, particularly in enhancing the reasoning capabilities of Large Language Models. By encouraging these models to explain their reasoning process, CoT prompting has shown promise in improving performance on complex tasks. While the technique has its limitations, it opens up exciting possibilities for the future of LLMs.
