2025-01-23 RAG Pipelines
graph TD
Start["Start"]
QueryClassification["Query Classification<br>Score: +0.015<br>Latency: -4.83s"]
Retrieval["Retrieval (Hybrid + HyDE)<br>Score: 0.58<br>Latency: 11.71s"]
Chunking["Chunking (Sliding Window)<br>Faithfulness: 97.41%"]
Embedding["Embedding Models<br>(LLM-Embedder: Optimal)"]
VectorDB["Vector Database<br>(Milvus: Scalable)"]
Reranking["Reranking (monoT5)<br>Score: 0.478<br>Latency: 11.71s"]
Repacking["Repacking (Reverse)<br>Score: 0.560"]
Summarization["Summarization (Recomp)<br>F1: 32.85<br>Latency: 11.70s"]
Generator["Generator Fine-Tuning<br>(Mixed Context: Robust)"]
Output["Final Output"]
Start --> QueryClassification
QueryClassification --> Retrieval
Retrieval --> Chunking
Chunking --> Embedding
Embedding --> VectorDB
VectorDB --> Reranking
Reranking --> Repacking
Repacking --> Summarization
Summarization --> Generator
Generator --> Output
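The stage order in the diagram above can be read as a chain of functions that each transform a list of passages. The sketch below is only an illustration of that flow, not an implementation of any stage: `run_pipeline`, the `Stage` type, and the `toy_*` helpers are hypothetical names, and the early exit for queries that need no retrieval is an assumption about what the Query Classification node does.

```python
from typing import Callable, List

# Hypothetical stage signature: takes the query and the current passages,
# returns the passages to hand to the next stage.
Stage = Callable[[str, List[str]], List[str]]


def run_pipeline(
    query: str,
    needs_retrieval: Callable[[str], bool],
    stages: List[Stage],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Run the stages in order unless the classifier skips retrieval."""
    if not needs_retrieval(query):
        # Assumed behaviour: generate directly, with no retrieved context.
        return generate(query, [])
    passages: List[str] = []
    for stage in stages:
        passages = stage(query, passages)
    return generate(query, passages)


# Toy stand-ins so the sketch runs end to end; none of these are real models.
def toy_retrieve(query: str, passages: List[str]) -> List[str]:
    corpus = ["milvus stores vectors", "monot5 reranks passages", "bm25 is sparse"]
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms & set(doc.split())]


def toy_rerank(query: str, passages: List[str]) -> List[str]:
    terms = set(query.lower().split())
    # Order passages by how many query terms they share, most first.
    return sorted(passages, key=lambda d: len(terms & set(d.split())), reverse=True)


def toy_generate(query: str, passages: List[str]) -> str:
    return f"{query} | context: {passages}"


if __name__ == "__main__":
    print(run_pipeline("milvus vectors monot5", lambda q: True,
                       [toy_retrieve, toy_rerank], toy_generate))
```

In a real system each toy function would be replaced by the corresponding component (retriever, reranker, repacker, summarizer, generator); the chain itself stays the same.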
graph TD
Q[Query] --> C{Query Classification}
C -->|No Retrieval Needed| G[Direct Generation]
C -->|Retrieval Needed| R{Retrieval Methods}
R -->|BM25| R1[Sparse Retrieval]
R -->|Contriever| R2[Dense Retrieval]
R -->|HyDE| R3[Hypothetical Doc]
R -->|Hybrid Search| R4[Hybrid]
R -->|HyDE + Hybrid| R5[Combined]
subgraph "Performance Metrics"
R1 -->|"mAP: 30.13, τ: 0.07s"| RR
R2 -->|"mAP: 23.99, τ: 3.06s"| RR
R3 -->|"mAP: 50.87, τ: 7.21s"| RR
R4 -->|"mAP: 47.14, τ: 3.20s"| RR
R5 -->|"mAP: 52.13, τ: 11.16s"| RR
end
RR{Reranking} -->|"monoT5 (τ: 4.5s)"| RR1[Deep Reranking]
RR -->|"monoBERT (τ: 15.8s)"| RR2[BERT Reranking]
RR -->|"RankLLaMA (τ: 82.4s)"| RR3[LLaMA Reranking]
RR -->|"TILDEv2 (τ: 0.02s)"| RR4[TILDE Reranking]
subgraph "Repacking Strategies"
RR1 --> P1[Forward]
RR1 --> P2[Reverse]
RR1 --> P3[Sides]
end
subgraph "Summarization"
P1 --> S1[Recomp]
P2 --> S2[LongLLMLingua]
P3 --> S3[SelectiveContext]
end
S1 -->|"F1: 32.85, τ: 11.70s"| FG[Final Generation]
S2 -->|"F1: 28.29, τ: 16.17s"| FG
S3 -->|"F1: 31.24, τ: 11.26s"| FG
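The diagram above names three repacking strategies (Forward, Reverse, Sides) without defining them. The sketch below assumes the common reading: forward puts the most relevant passage first, reverse puts it last (closest to the question in the prompt), and sides pushes the strongest passages to both ends so the weakest sit in the middle. The function name `repack` and the alternating scheme used for sides are illustrative assumptions, not taken from the source.

```python
from typing import List, Tuple


def repack(scored: List[Tuple[str, float]], strategy: str = "reverse") -> List[str]:
    """Order passages for the prompt, given (passage, relevance_score) pairs."""
    # Rank from most to least relevant.
    ranked = [p for p, _ in sorted(scored, key=lambda x: x[1], reverse=True)]
    if strategy == "forward":
        return ranked
    if strategy == "reverse":
        return ranked[::-1]
    if strategy == "sides":
        # Assumed scheme: alternate passages between the head and the tail,
        # so relevance decreases toward the middle of the context.
        head: List[str] = []
        tail: List[str] = []
        for i, passage in enumerate(ranked):
            (head if i % 2 == 0 else tail).append(passage)
        return head + tail[::-1]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    docs = [("a", 0.9), ("b", 0.7), ("c", 0.5), ("d", 0.2)]
    for name in ("forward", "reverse", "sides"):
        print(name, repack(docs, name))
```

The scores would come from whichever reranker is chosen upstream; repacking only changes the order in which passages are placed in the prompt, not which passages are kept.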