
Prompt technique

Retrieval-Augmented Generation (RAG)

Ground the model in your own data — docs, tickets, code, anything searchable.

What it is

Retrieval-Augmented Generation pairs an LLM with a retriever (such as a vector database, BM25, or web search) so the model answers from up-to-date, citable context rather than relying only on its frozen weights. RAG reduces a large class of hallucinations, lets you ship answers about private or recent data without fine-tuning, and lets you trace each claim back to a source.
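
To make the retriever half concrete, here is a minimal sketch of lexical retrieval using cosine similarity over word counts. It is a toy stand-in for a real vector database or BM25 index, not a production retriever; the names tokenize, cosine, retrieve and the DOCS sample data are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical sample corpus, standing in for your own docs or tickets.
DOCS = [
    "Refunds are issued within 14 days of purchase.",
    "Password resets are sent by email.",
    "Shipping takes 3 to 5 business days.",
]

if __name__ == "__main__":
    for score, doc in retrieve("how do refunds work", DOCS, k=1):
        print(f"{score:.2f}  {doc}")
```

The retrieved text is what gets passed to the model as context; in practice you would likely swap the scoring function for embeddings or BM25 while keeping the same retrieve-then-generate shape.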

When to use it

  • Q&A over private documents, codebases, support tickets
  • Facts that change after the model's training cutoff
  • Any answer that should cite its sources

Example

You are a support agent. Answer the user's question using only the context below. If the answer is not in the context, say "I don't know based on the available docs."

Context:
{retrieved_chunks}

Question: {user_question}

Answer with citations like [1], [2] mapped to the chunk index.

Why it works: The prompt combines strict grounding ("using only the context"), an explicit fallback, and a citation format. Together, these three rules make RAG answers more reliable.
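
The template above can be filled in programmatically. The sketch below numbers each chunk so that [1], [2] in the answer map back to a chunk index, and adds a small check that flags citations pointing at chunks that were never supplied. The helper names build_prompt and invalid_citations are hypothetical, and the check only validates index range, not whether the cited chunk actually supports the claim.

```python
import re

# The prompt from the example, with the same two placeholders.
TEMPLATE = (
    "You are a support agent. Answer the user's question using only the "
    "context below. If the answer is not in the context, say "
    '"I don\'t know based on the available docs."\n\n'
    "Context:\n{retrieved_chunks}\n\n"
    "Question: {user_question}\n\n"
    "Answer with citations like [1], [2] mapped to the chunk index."
)


def build_prompt(chunks: list[str], question: str) -> str:
    """Number the chunks from 1 and substitute them into the template."""
    numbered = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return TEMPLATE.format(retrieved_chunks=numbered, user_question=question)


def invalid_citations(answer: str, n_chunks: int) -> list[int]:
    """Return cited indices that fall outside 1..n_chunks, sorted ascending."""
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return sorted(c for c in cited if not 1 <= c <= n_chunks)


if __name__ == "__main__":
    chunks = ["Refunds are issued within 14 days.", "Resets are sent by email."]
    print(build_prompt(chunks, "How long do refunds take?"))
```

Running invalid_citations on the model's reply before showing it to the user is one cheap way to catch citations to nonexistent chunks.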

Pitfalls

  • Garbage retrieval = garbage answers. Tune chunking, embeddings, and reranking before tweaking the prompt.
  • Long contexts dilute attention. Rerank and keep only the top 4–8 chunks rather than dumping everything into the prompt.
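
As a rough illustration of the second pitfall, the sketch below keeps only the highest-scoring chunks after reranking. It assumes you already have (score, text) pairs from some reranker, which is not shown here; the function name top_k, the default k of 6 (picked from the 4–8 range above), and the optional min_score cutoff are all illustrative assumptions.

```python
def top_k(
    scored_chunks: list[tuple[float, str]],
    k: int = 6,
    min_score: float = float("-inf"),
) -> list[str]:
    """Keep the k best chunks by score, best first, dropping any below min_score."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked[:k] if score >= min_score]


if __name__ == "__main__":
    # Hypothetical reranker output: (relevance score, chunk text).
    candidates = [(0.12, "chunk a"), (0.91, "chunk b"), (0.55, "chunk c")]
    print(top_k(candidates, k=2))
```

A score cutoff is optional, but it can help avoid padding the context with weak matches when fewer than k chunks are actually relevant.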
