MTEB: Massive Text Embedding Benchmark
Updated Jan 17, 2026 - Python
Generative Representational Instruction Tuning
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
Train embedding and reranker models for retrieval tasks on Apple Silicon with MLX
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings
A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.
The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB).
Self-hostable, pay-for-what-you-use embedding server for bge-large-en and arbitrary embedding models, using crypto
A modular, open-source, RAG-ready embedding API supporting dense, sparse, and reranking models. Easily configurable via config.yaml, with no code changes required.
An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.
Chitti is a retrieval-augmented generation (RAG) application that uses a Mistral large language model (LLM) for generation and the bge-m3 model developed by BAAI for retrieval. Chitti can help answer questions about the IoT Summer Program, projects in the program curriculum, innovations of AIoT SMART Labs, the certification process, and more.
A curated list of text embedding models, benchmarks, and tools for semantic search, retrieval, and classification.
🎯 Explore AngleLab, an end-to-end system for generating and refining ideas with clear structure and deterministic behavior.