
User Personae and Their Needs

User Personae for Evaluation

A wide range of stakeholders in the AI space can benefit from this initiative:

  • Model Builders: Those who need to evaluate their models against desired criteria.
  • Independent Software Vendors: Companies providing AI-as-a-Service, including evaluations for safety.
  • AI Application Developers: Builders of AI-enabled applications who need to choose the most effective (or cost-effective) models for their needs. They also need to perform appropriate safety evaluations of their solutions.
  • Researchers: Those exploring new algorithms and datasets for evaluation.

Shared Needs for All Users

Collectively, these users would benefit from the following capabilities:

  • Easily share pre-executed benchmark results, compare them with other available benchmarks, and optionally focus on domain-specific benchmarks, e.g., for industries such as healthcare or finance.
  • Share datasets and evaluators in a reusable manner.
  • Easily execute evaluations on selected models, in public leaderboards or private deployments.
  • Publish evaluation results in a leaderboard.
  • Share know-how and best practices in an actionable way.
  • Adopt a reference stack of tools that facilitates the above capabilities.
