A tool that measures the quality of materials generated by AI-powered educational applications. Evaluators assess specific aspects of content for pedagogical alignment and identify areas for improvement.
A structured framework used to evaluate a concept based on learning science. It provides consistent criteria and serves as the foundation for a family of evaluators.
The degree to which an evaluator’s scores agree with the labels in curated or human-annotated validation datasets. Expressed as a percentage, it indicates how reliable the evaluator is.
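As an illustration of how such a percentage might be computed, the sketch below assumes the simplest case: the evaluator and the human annotators each assign one categorical label per item, and agreement means an exact match. The function name `agreement_percentage` and the sample labels are hypothetical; the actual evaluators may use a different agreement measure.

```python
from typing import Sequence


def agreement_percentage(evaluator_labels: Sequence[str],
                         human_labels: Sequence[str]) -> float:
    """Percent of items where the evaluator's label equals the human label.

    Assumption: exact-match agreement on categorical labels. This is one
    possible definition, not necessarily the one used by any given evaluator.
    """
    if len(evaluator_labels) != len(human_labels):
        raise ValueError("label sequences must have the same length")
    if not human_labels:
        raise ValueError("at least one labeled item is required")
    matches = sum(e == h for e, h in zip(evaluator_labels, human_labels))
    return 100.0 * matches / len(human_labels)


# Hypothetical validation set of four items; the evaluator matches three.
evaluator = ["complex", "simple", "simple", "complex"]
human = ["complex", "simple", "complex", "complex"]
print(agreement_percentage(evaluator, human))
```

Exact-match agreement is easy to interpret, but it does not correct for agreement that would occur by chance, so it is best read alongside the size and balance of the validation dataset.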
An evaluator made available early in its development because it offers useful capabilities for research and experimentation. While stable, it remains limited in scope and under active development. Early-release evaluators invite feedback to guide iterative improvement.
An objective measure of text difficulty based on features such as word length, sentence length, and syllable count (for example, the Flesch–Kincaid Grade Level).
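For a concrete sense of how such a measure works, the sketch below computes the Flesch–Kincaid Grade Level, which is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sentence splitter and the vowel-group syllable counter are rough heuristics assumed here for illustration, and the function names are hypothetical; production readability tools typically use more careful tokenization and syllable counting.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level using naive sentence and word splitting."""
    # Treat runs of ., !, or ? as sentence boundaries.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Six one-syllable words in one sentence: 0.39*6 + 11.8*1 - 15.59 = -1.45
print(round(flesch_kincaid_grade("The cat sat on the mat."), 2))
```

Very short, simple sentences can produce grade levels below zero, as in the example. The formula only reflects surface features of the text, so it says nothing about the ideas or prior knowledge a reader needs.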
The prior knowledge a student is expected to have that affects their ability to understand a text. This includes both curriculum-based knowledge and lived experience.