A technique where a model expands a single user query into multiple concurrent sub-queries to retrieve broader context before generating an answer.
More about Query Fan-Out
Query Fan-Out is a retrieval technique where a single user query is expanded into multiple concurrent sub-queries, each of which retrieves a different slice of context. The results are merged before the model generates its answer. The goal is to bring back a richer, more diverse set of documents than a single search would.
For example, a user asking "compare the top three customer support chatbots for SaaS" might have the query fanned out into separate searches for each product name, for "best of" listicles, for pricing pages, and for recent reviews. The model then synthesizes a single answer from all the retrieved material.
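The expand, retrieve, and merge steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product implements fan-out: the product names, the in-memory CORPUS, and the helpers expand_query, search, and fan_out are all hypothetical, and the expansion step is a simple template where a real system would ask a model to generate the sub-queries.

```python
import asyncio

# Hypothetical in-memory corpus standing in for a real search backend.
CORPUS = {
    "doc1": "Acme Chat pricing starts at a flat monthly fee",
    "doc2": "Acme Chat reviews praise its fast setup",
    "doc3": "BotDesk pricing is per seat",
    "doc4": "BotDesk reviews mention weak analytics",
}


def expand_query(products: list[str], facets: list[str]) -> list[str]:
    """Stand-in for the model's expansion step: one sub-query per product and facet."""
    return [f"{product} {facet}" for product in products for facet in facets]


async def search(sub_query: str) -> list[str]:
    """Toy retriever: return ids of documents that contain every term of the sub-query."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    terms = sub_query.lower().split()
    return [
        doc_id
        for doc_id, text in CORPUS.items()
        if all(term in text.lower() for term in terms)
    ]


async def fan_out(sub_queries: list[str]) -> list[str]:
    """Run all sub-queries concurrently, then merge results and drop duplicates."""
    result_lists = await asyncio.gather(*(search(q) for q in sub_queries))
    merged: list[str] = []
    seen: set[str] = set()
    for results in result_lists:
        for doc_id in results:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


if __name__ == "__main__":
    sub_queries = expand_query(["Acme Chat", "BotDesk"], ["pricing", "reviews"])
    print(asyncio.run(fan_out(sub_queries)))
```

The merge step here only deduplicates by document id and keeps first-seen order. A production pipeline would more likely rerank the combined results before passing them to the model.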
Query fan-out is used by Google AI Mode and Perplexity, and increasingly by agentic RAG pipelines, particularly those that combine multi-turn dialogue with retrieval. It is also useful in production chatbots, where a single user message can imply several distinct information needs.