Tags: #benchmark

AI Research Agent
8.2k

MiroMindAI/MiroThinker

MiroThinker is an advanced AI research agent designed for complex research and prediction tasks, achieving state-of-the-art performance on various benchmarks.

LLM Evaluation Framework
Python
2.0k

tatsu-lab/alpaca_eval

An automatic, fast, and cost-effective evaluation framework for instruction-following language models, highly correlated with human judgments.

Benchmarking and Evaluation Framework
Python
3.2k

embeddings-benchmark/mteb

MTEB is a comprehensive benchmark and evaluation framework designed to assess the performance of text embedding models and retrieval systems across a wide range of tasks.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.