mteb

Here are 15 public repositories matching this topic...

embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark

benchmark information-retrieval retrieval text-classification clustering sts semantic-search reranking text-embedding multimodal neural-search sentence-transformers sbert multilingual-nlp low-resource-nlp bitext-mining mteb

Updated Jan 17, 2026
Python

ContextualAI / gritlm

Generative Representational Instruction Tuning

information-retrieval retrieval embeddings embedding-models embedding text-embedding sgpt grit sbert llm llms instruction-tuning mteb

Updated Jun 25, 2025
Jupyter Notebook

SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Updated Oct 19, 2025
Python

jina-ai / mlx-retrieval

Train embedding and reranker models for retrieval tasks on Apple Silicon with MLX

embeddings mlx reranker apple-silicon mteb

Updated Sep 18, 2025
Python

su-park / mteb_ko_leaderboard

Korean text embedding model leaderboard

leaderboard korean embedding-models mteb

Updated Oct 22, 2024

worldbank / GISTEmbed

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

deep-learning embedding-models sentence-embeddings fine-tuning huggingface sentence-transformers mteb

Updated Mar 6, 2024
Python

machinelearningZH / hybrid-search-eval

A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.

search evaluation embeddings embedding-models mean-reciprocal-rank sentence-transformers hybrid-search openrouter mteb hybridsearch

Updated Jan 17, 2026
Python

isaacus-dev / mleb

The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB).

law ai embeddings mteb isaacus

Updated Nov 7, 2025
Python

stanleylsx / text_embedding

A tool for training sentence embeddings, with support for CoSENT, SimCSE, and InfoNCE

bge e5 piccolo simbert mteb m3e

Updated Jun 17, 2025
Python

devflowinc / openembeddings

Self-hostable, pay-for-what-you-use embedding server for bge-large-en and arbitrary embedding models, with payments in crypto

ethereum payments embeddings usdt wbtc usdc bge-large-en mteb

Updated Aug 25, 2023
JavaScript

fahmiaziz98 / unified-embedding-api

A modular and open-source RAG-ready Embedding API supporting dense, sparse and Reranking Models. Easily configurable via config.yaml — no code changes required.

nlp api open-source embeddings rag sparse-embedding mteb dense-embeddings

Updated Nov 15, 2025
Python

louisbrulenaudet / tax-retrieval-benchmark

An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.

benchmark information-retrieval retrieval tax embeddings taxation semantic-search fiscal sentence-embeddings stp rag droit sentence-transformers sbert fiscalite retrieval-augmented-generation mteb

Updated Apr 29, 2025
Jupyter Notebook

Chitti is a retrieval-augmented generation (RAG) application that uses a Mistral large language model (LLM) for generation and the bge-m3 model developed by BAAI for retrieval. Chitti can help answer questions about the IoT Summer Program, projects in the program curriculum, innovations of AIoT SMART Labs, the certification process, and more.