This diagram provides a comprehensive overview of an end-to-end Retrieval-Augmented Generation (RAG) system — ranging from basic components to more advanced techniques. It is structured into five key stages:
- Converts natural language questions into structured queries for different databases.
- Supports:
- Text-to-SQL for Relational DBs
- Text-to-Cypher for Graph DBs
- Self-query Retriever for Vector DBs
- Enhances retrieval performance by transforming the question using:
- Multi-query
- RAG-Fusion
- Decomposition
- Step-back
- HyDE
- Dynamically determines the optimal retrieval path:
- Logical routing – chooses the appropriate database.
- Semantic routing – selects the best prompt based on similarity.
- Prepares and structures data for efficient retrieval using:
- Semantic Splitters – optimize chunking.
- Dense Representations – compact retrieval units.
- Fine-tuned Embeddings – specialized models like CoLBERT.
- Hierarchical Indexing – tree-based summaries (e.g., RAPTOR).
- Uses relevance-based document filtering and response generation:
- Ranking via Re-Rank, RankGPT, RAG-Fusion.
- Refinement and Active Retrieval through CRAG.
- Self-RAG and RRR for high-quality, context-aware answer generation.
This visual and modular breakdown helps developers and researchers understand the building blocks of a modern RAG pipeline — from query routing to advanced re-ranking and retrieval-based generation techniques.