
Leaderboards

This section describes the leaderboards we maintain, which report results from running benchmark suites of evaluators against various models and the AI systems built on them.

These leaderboards will include leading open-source models, serving both as evaluation targets and as evaluation judges. Initially, we are focusing on Meta’s Llama family of models and IBM’s Granite family of models, with others to follow.

Plans for Leaderboards

As we fill in the evaluation taxonomy, we will stand up additional leaderboards for areas of the taxonomy with wide interest, with their evaluators organized into benchmarks.

We will also provide a benchmark catalog for finding and reusing these sets of evaluators.

The child pages listed below describe the implemented leaderboards.


Child Pages