# Introduction

> Learn what evaluators are and how they help measure the quality of AI-generated educational materials through pedagogical alignment assessment.

## What evaluators do

Evaluators assess the quality of AI-generated educational content by:

* Measuring key dimensions of text for pedagogical alignment
* Identifying areas for improvement

Evaluators help edtech developers reliably assess their LLM outputs and build evidence-based tools that reinforce student learning and whole-child development.

## When to use evaluators

Evaluators are useful at every stage of development, whether you're testing, refining, or scaling. Here are four ways you can use them effectively.

| Use case                    | Examples                                                                                                                                                                                                             | Implementation                                                                                                                                                                                                                                                                                                               |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Optimize your product       | You are building a vocabulary-focused feature and want higher vocabulary difficulty with simpler sentence structure.<br /><br />You are creating read-aloud support and want to deprioritize vocabulary complexity. | Set targets for vocabulary and sentence structure against grade-level appropriateness. Run the [Sentence Structure Evaluator](/evaluators/literacy-evaluators/sentence-structure) and [Vocabulary Evaluator](/evaluators/literacy-evaluators/vocabulary) on your LLM outputs to confirm that they stay in acceptable ranges. |
| Monitor consistency         | Your AI output starts to vary unexpectedly after model drift or small system updates.                                                                                                                                | Run regular regression tests on your LLM outputs and compare scores over time to ensure stable behavior.                                                                                                                                                                                                                     |
| Select the right model      | You need to compare new models on quality, speed, and cost before switching.                                                                                                                                         | Create a *gold set* with expected scores for key parameters (e.g., grade level, topic, text type). Score each candidate model's outputs with the same evaluators, using them as a standardized benchmark to see how far each model drifts from your baseline. |
| Build trust with your users | Districts and educators ask for evidence that your AI-generated content is high-quality and aligned with learning principles.                                                                                        | Share your evaluation process and results so stakeholders can see the rigor behind your system and trust that your outputs remain consistent and research-aligned.                                                                                                                                                           |
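
The gold-set and regression ideas in the table can be sketched in a few lines of Python. This is a minimal illustration, not the Learning Commons API: the numeric 1 to 5 score scale, the item IDs, the model names, the `tolerance` value, and every function name below are assumptions made for the example. In practice the scores would come from running an evaluator on your LLM outputs.

```python
from statistics import mean

# Hypothetical gold set: item id -> expected evaluator score.
# The numeric 1-5 scale is an assumption made for this sketch.
GOLD_SET = {"item-1": 3.0, "item-2": 4.0, "item-3": 2.0}


def mean_abs_deviation(expected, observed):
    """Average absolute gap between expected and observed scores."""
    missing = expected.keys() - observed.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return mean(abs(expected[k] - observed[k]) for k in expected)


def has_drifted(expected, observed, tolerance=0.5):
    """True when the average gap exceeds the chosen tolerance."""
    return mean_abs_deviation(expected, observed) > tolerance


def pick_closest_model(expected, runs):
    """Return the name of the run whose scores sit closest to the gold set."""
    return min(runs, key=lambda name: mean_abs_deviation(expected, runs[name]))


# Placeholder scores standing in for evaluator results on two candidate models.
runs = {
    "model-a": {"item-1": 3.0, "item-2": 3.5, "item-3": 2.0},
    "model-b": {"item-1": 4.5, "item-2": 2.5, "item-3": 3.0},
}

for name, scores in runs.items():
    gap = round(mean_abs_deviation(GOLD_SET, scores), 2)
    print(name, gap, has_drifted(GOLD_SET, scores))
print("closest:", pick_closest_model(GOLD_SET, runs))
```

The same comparison works for regression testing: keep the gold set fixed, rerun it after each model or prompt change, and track the deviation over time. The right tolerance depends on the evaluator's scale and on how much variation your product can accept.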

## Our approach

Learning Commons collaborates closely with pedagogical experts to define, test, and build our evaluators.

We follow a research-informed process to develop evaluators that are firmly anchored in learning science:

<Frame>
  <img
    src="https://mintcdn.com/czi-60a2a443/UtkD6p7UkcR_E40q/images/evaluators/our-approach-to-building-evaluators.svg?fit=max&auto=format&n=UtkD6p7UkcR_E40q&q=85&s=875da5467c3f0007dadd426fcd4e45be"
    alt="Diagram showing the AIDT process for designing and validating evaluators"
    width="2276"
    height="583"
    data-path="images/evaluators/our-approach-to-building-evaluators.svg"
  />
</Frame>

* We build alongside experts in learning science and rubric development (e.g., [Student Achievement Partners](https://learnwithsap.org/), [CAST](https://www.cast.org/), and [Achievement Network (ANet)](https://www.achievementnetwork.org/)).
* We translate expert insight into ground-truth datasets that reflect real teaching and learning principles.
* We develop, validate, and ship software that evaluates text the way an expert would.
