What you need
Before you begin, make sure you have:
- An API key from the model provider.
- A Python workspace, the Evaluators Playground, or the appropriate SDK.
- The text you want to evaluate.
- Required context (if applicable): Inputs such as grade level or intended audience.
What you’ll do
STEP 1: Choose the content to evaluate
- Select the content you want to evaluate.
Make sure the content:
- Matches the evaluator’s intended content type (e.g., informational text or conversation output)
- Falls within documented length and format constraints
- Does not include personal or sensitive data
- Prepare the inputs required by the evaluator. Refer to the evaluator’s documentation page for the exact requirements. These usually include:
- The content to evaluate
- Any required contextual parameters, like the intended grade level
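The preparation above amounts to collecting the content and its contextual parameters into a single payload. A minimal sketch, assuming hypothetical field names (`content`, `grade_level`) — the actual parameter names and schema come from the evaluator’s documentation page:

```python
from typing import Optional

def build_evaluator_inputs(content: str, grade_level: Optional[str] = None) -> dict:
    """Collect the content to evaluate plus any required contextual parameters.

    The field names here are illustrative, not a fixed schema; use the
    parameters listed on the evaluator's documentation page.
    """
    if not content.strip():
        raise ValueError("Content to evaluate must not be empty.")
    inputs = {"content": content}
    if grade_level is not None:
        inputs["grade_level"] = grade_level
    return inputs

payload = build_evaluator_inputs(
    "Photosynthesis converts light into chemical energy.",
    grade_level="5",
)
```

Validating the payload before running the evaluator makes missing or empty inputs fail fast, rather than producing a confusing evaluator result.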
STEP 2: Run the evaluator
- Run the evaluator using:
- The provided prompts
- The recommended LLM model listed on the evaluator’s documentation page
When creating or validating your prompts, we recommend running the prompt three times and aggregating the results with a simple majority rule to improve accuracy. When building the prompt into your production code, running it once is usually sufficient.
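The run-three-times-and-aggregate recommendation can be sketched as follows. Here `run_evaluator` is a placeholder for however you invoke the evaluator (Playground, SDK, or a direct API call) with the provided prompt and the recommended model — substitute your own call:

```python
from collections import Counter

def run_evaluator(payload: dict) -> str:
    # Placeholder verdict; a real implementation would send the provided
    # prompt plus the payload to the recommended model and parse its response.
    return "pass"

def evaluate_with_majority(payload: dict, runs: int = 3, evaluator=run_evaluator) -> str:
    """Run the evaluator several times and return the most common verdict."""
    verdicts = [evaluator(payload) for _ in range(runs)]
    return Counter(verdicts).most_common(1)[0][0]
```

A simple majority over three runs smooths out occasional inconsistent verdicts from the model; for production, calling the evaluator once (`runs=1`) avoids the extra cost.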
STEP 3: Review the results
- Review the evaluator output based on the interpretation guidelines on the evaluator’s documentation page.
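When the evaluator returns a numeric score, reviewing the output usually means mapping it to the interpretation bands on the evaluator’s documentation page. A hypothetical sketch — the thresholds and labels below are made up for illustration; use the ones the evaluator actually documents:

```python
def interpret_score(score: float) -> str:
    """Map a 0-1 evaluator score to an interpretation band.

    Thresholds here are illustrative assumptions; substitute the
    interpretation guidelines from the evaluator's documentation page.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("Expected a score between 0 and 1.")
    if score >= 0.8:
        return "meets criteria"
    if score >= 0.5:
        return "borderline - review manually"
    return "does not meet criteria"
```

Keeping the thresholds in one function makes it easy to update them when the evaluator’s guidelines change, without touching the rest of your pipeline.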