Reasoning & CoT Technique

Selection-Inference (SI)

Logical reasoning fails when models try to do everything at once — selecting relevant information AND drawing conclusions in a single step. Selection-Inference separates these into two alternating modules: one that identifies the right premises, and one that draws valid inferences from them.

Technique Context: 2023

Introduced: Selection-Inference was published at ICLR 2023 by Creswell et al. The technique addresses a core weakness in LLM reasoning: when given a set of facts and asked to draw a conclusion, models often select irrelevant premises or make invalid inferential leaps. SI introduces a two-module architecture that alternates between a Selection module (which identifies the most relevant premises for the next reasoning step) and an Inference module (which draws a single valid conclusion from the selected premises). This disciplined alternation yielded a greater than 100% improvement over a vanilla few-shot baseline on a suite of logical reasoning benchmarks.

Modern LLM Status: The Selection-Inference pattern has become increasingly relevant as models are used for complex logical reasoning in legal, scientific, and financial domains. While modern frontier models have improved at reasoning, they still benefit from the explicit separation of premise-selection from inference. The technique is particularly valuable when working with large knowledge bases where selecting the right information is as important as reasoning correctly from it. In production systems, the two-module pattern maps naturally to retrieval-augmented generation (RAG) architectures.

The Core Insight

Separate Selection from Inference

Standard prompting gives a model all available information and asks for a conclusion — a cognitive overload that leads to errors. Selection-Inference breaks reasoning into two strictly separated operations.

The Selection module looks at all available facts and picks only the ones relevant to the current reasoning step. The Inference module takes those selected facts and draws exactly one valid conclusion. Then the cycle repeats: the new conclusion becomes part of the available facts, and Selection picks the next relevant set. This alternation continues until the final answer is reached.

Think of it like a legal team where one person gathers evidence and another person argues the case — each expert focuses on what they do best, and the quality of the overall argument improves because neither is distracted by the other’s job.

Why Separation of Concerns Matters in Reasoning

When humans solve logic puzzles, they don’t try to use all information simultaneously. They identify relevant clues, draw a small inference, then look for more relevant clues. Selection-Inference formalizes this natural pattern. By forcing the model to explicitly state which premises it’s using before drawing any conclusion, errors become visible: either the wrong premises were selected, or the inference from correct premises was invalid. This decomposition turns opaque reasoning into a debuggable pipeline.
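This debuggable pipeline can be sketched as a short loop. In the sketch below, `select`, `infer`, and `done` are hypothetical stand-ins for separate LLM calls, not part of any real library:

```python
def selection_inference(facts, question, select, infer, done, max_steps=5):
    """Alternate Selection and Inference until `done` reports the question
    is answered. `select`, `infer`, and `done` stand in for LLM calls."""
    trace = []
    for _ in range(max_steps):
        premises = select(facts, question)   # Selection: minimal relevant facts
        conclusion = infer(premises)         # Inference: exactly one conclusion
        trace.append({"premises": premises, "conclusion": conclusion})
        facts = facts + [conclusion]         # enrich the knowledge base
        if done(conclusion, question):
            break
    return trace
```

Because each step records which premises produced which conclusion, the returned trace makes both failure modes visible: wrong premises selected, or an invalid inference from correct premises.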

The Selection-Inference Process

Five stages from knowledge base to auditable conclusion

Step 1: Present the Knowledge Base

Provide the model with all available facts, rules, and premises. These could come from a document, database, or prior reasoning steps.

Example

“Facts: (1) All mammals breathe air. (2) Whales are mammals. (3) Fish breathe through gills. (4) Whales live in water. (5) Animals that live in water and breathe air must surface periodically.”

Step 2: Selection

The Selection module examines all available information and identifies the minimal set of premises relevant to the current reasoning goal. It explicitly states which facts it’s choosing and why.

Example

Selected premises: Fact (1) “All mammals breathe air” and Fact (2) “Whales are mammals” — these are relevant because we need to determine how whales breathe.

Step 3: Inference

The Inference module takes ONLY the selected premises and draws a single, valid conclusion. No additional information is used — just the explicitly selected facts.

Example

Inference: From facts (1) and (2), we conclude: “Whales breathe air.”

Step 4: Update and Iterate

The new conclusion is added to the available knowledge base. The process returns to the Selection step, now with an enriched set of facts. This alternation continues until the reasoning goal is achieved.

Example

Updated knowledge base now includes Fact (6): “Whales breathe air.” Selection module now picks Facts (4), (5), and (6) for the next inference cycle.

Step 5: Chain Completion

When the final conclusion answers the original question, the full chain of selection-inference pairs forms a complete, auditable reasoning trace.

Example

Final inference: From Facts (4), (5), and (6) — “Whales must surface periodically.” The complete chain shows every premise used and every conclusion drawn.
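The five stages above can be traced end to end with a toy stand-in for the two modules. Here simple premise-matching rules play both Selection (finding a rule whose premises are all available) and Inference (adding its single conclusion); real SI would use LLM calls in place of `RULES`:

```python
FACTS = [
    "All mammals breathe air.",
    "Whales are mammals.",
    "Fish breathe through gills.",
    "Whales live in water.",
    "Animals that live in water and breathe air must surface periodically.",
]

# Each rule pairs a required premise set with one conclusion — a stand-in
# for what the Selection and Inference modules would do with LLM calls.
RULES = [
    ({"All mammals breathe air.", "Whales are mammals."},
     "Whales breathe air."),
    ({"Whales live in water.", "Whales breathe air.",
      "Animals that live in water and breathe air must surface periodically."},
     "Whales must surface periodically."),
]

def run_chain(facts, rules):
    """Alternate selection (a rule whose premises are all known) and
    inference (append its conclusion) until nothing new can be derived."""
    known = list(facts)
    trace = []
    progress = True
    while progress:
        progress = False
        for premises, conclusion in rules:
            if premises <= set(known) and conclusion not in known:
                trace.append((sorted(premises), conclusion))  # auditable step
                known.append(conclusion)
                progress = True
    return known, trace

known, trace = run_chain(FACTS, RULES)
print(trace[-1][1])  # → Whales must surface periodically.
```

The `trace` list is the auditable reasoning chain from Step 5: every premise used and every conclusion drawn, in order.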

See the Difference

Why separating selection from inference produces more reliable reasoning

Direct Reasoning

Prompt

Given: (1) All mammals breathe air. (2) Whales are mammals. (3) Fish breathe through gills. (4) Whales live in water. (5) Animals that live in water and breathe air must surface periodically. Conclusion: Do whales need to surface?

Response

Yes, whales need to surface because they live in water but are mammals that breathe air.

Correct but skips explicit premise selection — how do we know which facts were actually used?

Selection-Inference

Alternating Modules

Selection 1: Picks Facts (1) and (2) — relevant to determining how whales breathe.
Inference 1: “Whales breathe air.”

Selection 2: Picks new conclusion + Fact (4) + Fact (5) — relevant to surfacing behavior.
Inference 2: “Whales must surface periodically.”

Final Answer

Yes, whales must surface periodically. Reasoning trace: Facts (1)+(2) → “Whales breathe air” → combined with Facts (4)+(5) → “Whales must surface periodically.”

Every premise explicitly selected, every inference drawn from stated evidence

Natural Language Works Too

While structured frameworks and contextual labels are powerful tools, LLMs are exceptionally good at understanding natural language. As long as your prompt contains the actual contextual information needed to create, answer, or deliver the response you’re looking for — the who, what, why, and constraints — the AI can produce complete and accurate results whether you use a formal framework or plain conversational language. Even with the best prompts, however, verifying AI output remains a necessary step.

Selection-Inference in Action

See how alternating selection and inference improves logical reasoning

Scenario

A client claims breach of contract. The knowledge base includes the contract terms, communications between parties, state contract law statutes, and relevant case precedents. The question: “Does the client have grounds for a breach of contract claim?”

Selection-Inference Chain

Selection 1: Contract Section 4.2 (delivery timeline) + Email from vendor dated March 15 (acknowledging delay).
Inference 1: The vendor acknowledged failing to meet the contractual delivery deadline.

Selection 2: Inference 1 + Contract Section 7.1 (remedies for late delivery) + State UCC §2-711 (buyer’s remedies).
Inference 2: Late delivery constitutes a material breach under both the contract terms and applicable state law.

Selection 3: Inference 2 + Client communications (no waiver of delivery timeline) + Precedent: Smith v. Allied Corp (similar facts, breach upheld).
Inference 3: The client did not waive the deadline and has grounds consistent with relevant precedent.

Final conclusion: The client has strong grounds for a breach of contract claim based on documented late delivery, applicable contract remedies, and supporting case law. Note: Always verify AI-generated legal analysis with qualified legal counsel.

Scenario

A patient presents with fatigue, joint pain, and a butterfly-shaped facial rash. The knowledge base includes symptom descriptions, lab results (elevated ANA, low complement levels), family history, and diagnostic criteria for several autoimmune conditions.

Selection-Inference Chain

Selection 1: Symptom: butterfly-shaped facial rash + Diagnostic criteria for systemic lupus erythematosus (SLE).
Inference 1: The malar (butterfly) rash is one of the 11 ACR classification criteria for SLE.

Selection 2: Inference 1 + Lab result: elevated ANA titer + Lab result: low complement C3/C4.
Inference 2: Positive ANA and low complement levels are two additional ACR criteria, bringing the total to 3 of 11.

Selection 3: Inference 2 + Symptom: joint pain (non-erosive arthritis) + ACR threshold (4 of 11 criteria needed for classification).
Inference 3: With 4 criteria met (malar rash, positive ANA, low complement, arthritis), the patient meets the ACR classification threshold for SLE.

Final conclusion: The evidence supports a preliminary classification of SLE based on 4 of 11 ACR criteria. Each criterion was independently selected and verified. Note: AI-assisted analysis should always be reviewed and confirmed by qualified medical professionals.

Scenario

Prove that the sum of two even numbers is always even. The knowledge base includes the definition of even numbers, properties of integer addition, and basic algebraic axioms.

Selection-Inference Chain

Selection 1: Definition: An even number can be expressed as 2k where k is an integer.
Inference 1: Let the two even numbers be 2a and 2b, where a and b are integers.

Selection 2: Inference 1 + Property: Integer addition is closed (sum of integers is an integer) + Distributive property of multiplication over addition.
Inference 2: 2a + 2b = 2(a + b), and since a + b is an integer (closure), 2(a + b) fits the form 2k.

Selection 3: Inference 2 + Definition of even numbers (2k form).
Inference 3: Since 2(a + b) is of the form 2k where k = (a + b) is an integer, the sum is even by definition.

Final conclusion: The sum of two even numbers is always even. QED. Every step explicitly cites the axiom or definition used. Note: Verify mathematical proofs independently before relying on AI-generated reasoning.
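The algebraic identity at the heart of the chain can also be spot-checked mechanically. The brute-force check below is not a proof, but it confirms Inference 2 (2a + 2b = 2(a + b), an even number) over a small range:

```python
def is_even(n):
    """An integer is even iff it is divisible by 2 (i.e., of the form 2k)."""
    return n % 2 == 0

# For every pair of even numbers 2a and 2b in a small range, confirm that
# the sum is even and factors as 2(a + b), matching Inference 2.
for a in range(-20, 21):
    for b in range(-20, 21):
        m, n = 2 * a, 2 * b
        assert is_even(m + n)
        assert m + n == 2 * (a + b)
```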

When to Use Selection-Inference

Best for logical reasoning tasks requiring explicit evidence trails

Perfect For

Complex Logical Reasoning

Tasks with many available premises where selecting the right evidence is as important as reasoning correctly from it.

Knowledge-Intensive Tasks

Working with large knowledge bases, document collections, or databases where precise evidence selection is critical.

Multi-Step Proofs

Building formal arguments, legal cases, or mathematical proofs where each step must cite specific supporting evidence.

Reasoning Auditability

Situations where you need to verify which evidence was used for each conclusion — compliance, safety-critical systems, or regulated industries.

Skip It When

Simple Questions

When the relevant information is obvious and doesn’t need explicit selection — the overhead of alternating modules adds no value.

Creative or Generative Tasks

Writing, brainstorming, or open-ended generation where there are no premises to select from — SI is designed for logical reasoning.

Speed-Critical Applications

When latency matters more than auditability — alternating between selection and inference modules adds processing time at each step.

Use Cases

Where Selection-Inference delivers the most value

Legal Document Analysis

Select relevant clauses, statutes, and precedents from large legal corpora, then draw precise legal conclusions grounded in explicitly cited evidence.

Scientific Literature Review

Systematically select relevant findings from research papers and draw evidence-based conclusions with clear citation trails.

Medical Diagnostic Reasoning

Alternate between selecting relevant symptoms and test results and inferring possible diagnoses through structured differential analysis.

Financial Audit Trails

Select relevant financial records and regulations at each step, building auditable inference chains for compliance verification.

Compliance Verification

Map regulatory requirements to specific organizational practices by selecting applicable rules and inferring compliance status at each checkpoint.

Academic Research Synthesis

Build systematic literature reviews by selecting relevant findings from multiple sources and drawing synthesized conclusions with traceable evidence.

Where Selection-Inference Fits

SI separates what was mixed in earlier reasoning approaches

Chain-of-Thought: selection and inference mixed together, with reasoning as continuous prose.
Selection-Inference: separated modules, alternating selection and inference steps.
Decomposed Prompting: specialized handlers, with a dedicated handler per sub-problem.
LATS: tree search plus module routing, combining dynamic search with specialized modules.
Apply SI to RAG Pipelines

Selection-Inference maps perfectly to retrieval-augmented generation (RAG) architectures. The Selection module corresponds to the retrieval step (finding relevant documents), while the Inference module corresponds to the generation step (reasoning from retrieved context). Making this mapping explicit can improve RAG quality by ensuring each inference is grounded in specifically cited retrieved passages.
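A minimal sketch of that mapping, where `retrieve` and `generate` are hypothetical stand-ins for a vector-store query and an LLM call (the `FINAL:` stop convention is likewise an assumption, not a standard):

```python
def rag_with_si(question, retrieve, generate, max_hops=3):
    """RAG loop structured as Selection-Inference: retrieval plays the
    Selection module, generation plays the Inference module."""
    derived = []   # conclusions so far, fed back into retrieval
    trace = []
    for _ in range(max_hops):
        # Selection == retrieval: fetch passages relevant to the current state.
        passages = retrieve(question, derived)
        # Inference == generation: one conclusion grounded only in `passages`.
        conclusion = generate(question, passages)
        trace.append({"passages": passages, "conclusion": conclusion})
        derived.append(conclusion)
        if conclusion.startswith("FINAL:"):  # assumed stop convention
            break
    return trace
```

Keeping the two calls separate ensures each generated conclusion can cite exactly the passages retrieved for it, which is the grounding guarantee the paragraph above describes.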

Separate to Strengthen

Apply the selection-inference pattern to your reasoning tasks or explore other structured techniques.