Stars
Custom Haystack components for creating embeddings and reranking documents with VoyageAI models.
Advanced RAG pipeline optimization framework using DSPy. Implements modular RAG pipelines with Query-Rewriting, Sub-Query Decomposition, and Hybrid Search via Weaviate. Automates prompt tuning and …
Pairwise Ranking Prompting (PRP): Zero-shot LLM reranking library implementing efficient pairwise strategies (Heapsort, Sliding Window, All-Pairs). Mitigates position bias via bidirectional comparisons.
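The sliding-window strategy mentioned above can be sketched as a single bubble-style pass that compares adjacent documents with an LLM judge. This is a minimal illustration, not the library's API: `prefers` is a hypothetical comparator standing in for an LLM call (which, per the description, would be made in both orders to mitigate position bias).

```python
def sliding_window_rerank(query, docs, prefers, passes=1):
    """Pairwise sliding-window reranking: bubble the most relevant
    documents toward the front by comparing adjacent pairs.

    `prefers(query, a, b)` is a hypothetical judge returning True when
    document `a` is more relevant to `query` than document `b`.
    """
    docs = list(docs)
    for _ in range(passes):
        # Walk from the end of the list toward the front, swapping pairs
        # whenever the later document is judged more relevant.
        for i in range(len(docs) - 1, 0, -1):
            if prefers(query, docs[i], docs[i - 1]):
                docs[i], docs[i - 1] = docs[i - 1], docs[i]
    return docs

# Toy judge: prefer the document containing the query term.
prefers = lambda q, a, b: (q in a) > (q in b)
ranked = sliding_window_rerank("cat", ["dog story", "cat facts", "fish"], prefers)
```

Each pass moves the best remaining document at most one window closer to the top, so a small number of passes suffices when the initial retrieval order is already roughly correct.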
Self-Reflective Question Answering for Biomedical Reasoning
Training code for advanced RAG techniques - Adaptive-RAG, Corrective RAG, RQ-RAG, Self-RAG, Agentic RAG, and ReZero. Reproduces paper methodologies to fine-tune LLMs via SFT and GRPO for adaptive r…
Modular LLM ranking library for Information Retrieval and RAG. Implements state-of-the-art Pairwise, Setwise, and Listwise ranking with structured generation and specialized models (RankZephyr, Ran…
Pipelines for Fine-Tuning LLMs using SFT and RLHF
Production-ready Haystack/LangChain pipelines for Hybrid & Parent-Child Retrieval, Diversity Filtering, MMR, Metadata Filtering, Reranking, Query Enhancement, Multi-Tenancy, Agentic RAG across Pine…
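The MMR (Maximal Marginal Relevance) filtering named above greedily selects documents that are relevant to the query but dissimilar to those already chosen. A minimal pure-Python sketch, assuming embeddings are plain float vectors (not the pipelines' actual implementation):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def mmr(query, docs, lam=0.7, top_k=3):
    """Maximal Marginal Relevance: balance query relevance against
    redundancy with already-selected documents.

    lam=1.0 reduces to pure relevance ranking; lower values favor diversity.
    Returns the indices of the selected documents, in selection order.
    """
    candidates = list(range(len(docs)))
    selected = []
    while candidates and len(selected) < top_k:
        best = max(
            candidates,
            key=lambda i: lam * cosine(query, docs[i])
            - (1 - lam) * max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            ),
        )
        selected.append(best)
        candidates.remove(best)
    return selected

# With a low lambda, the near-duplicate of the first pick is skipped
# in favor of the orthogonal document.
picked = mmr([1.0, 0.0], [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]], lam=0.3, top_k=2)
```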
Dataloaders is a library for processing and formatting datasets to support various Retrieval-Augmented Generation (RAG) pipelines, enabling efficient evaluation and analysis.
Effect of Optimizer Selection and Hyperparameter Tuning on Training Efficiency and LLM Performance
LLM-Blender: Ensembling framework that maximizes LLM performance via pairwise ranking. Employs PairRanker to rank candidates and GenFuser to merge outputs, generating superior responses by combinin…
Performance Evaluation of Rankers and RRF Techniques for Retrieval Pipelines: Employs Diversity, Lost-in-the-Middle, and Similarity rankers to reorder documents and maximize LLM context window performance.
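The RRF (Reciprocal Rank Fusion) technique evaluated above merges several ranked lists by summing each document's reciprocal rank. A minimal sketch of the standard formula (k = 60 is the customary smoothing constant from the original RRF paper); the function names here are illustrative, not this repo's API:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists of document IDs into one.

    rankings: list of lists, each ordered best-first (e.g. one list
    from BM25 and one from dense retrieval in a hybrid pipeline).
    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "d2" ranks high in both lists, so fusion promotes it to the top.
fused = reciprocal_rank_fusion([["d1", "d2", "d3"], ["d2", "d3", "d1"]])
```

Because RRF uses only ranks, not raw scores, it needs no score normalization across retrievers, which is why it is a common default for fusing lexical and dense results.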
MTEB: Massive Text Embedding Benchmark
Automatically generate comprehensive Pull Request descriptions with LLMs
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing…
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
scikit-learn: machine learning in Python