Early Release
This evaluator reflects early-stage work. We’re continuously improving its accuracy and reliability.
Requirements
To run this evaluator, you’ll need:
- The system prompt and user prompt.
- The formatting instructions.
- The text to evaluate.
- A model and temperature setting. We found the following combination to be the most accurate:
  - Suggested model: Gemini-2.5-pro (specifically Gemini-2.5-pro-preview-06-05)
  - Temperature: 0.25
- GOOGLE_API_KEY (see Getting set up)
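As a minimal sketch, the requirements above can be gathered in one place before any call is made. The `EvaluatorSettings` class and `load_api_key` helper are hypothetical names introduced for illustration, and the lowercase model ID is an assumed API spelling of the suggested model.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluatorSettings:
    """Hypothetical container for the evaluator inputs listed above."""

    system_prompt: str
    user_prompt: str
    formatting_instructions: str
    # Assumed API spelling of the suggested model name.
    model: str = "gemini-2.5-pro-preview-06-05"
    temperature: float = 0.25


def load_api_key(env=None):
    """Return GOOGLE_API_KEY from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set (see Getting set up).")
    return key
```

Failing early on a missing key keeps configuration errors separate from evaluation errors.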
Running the evaluator
Step 1: Choose test passages
- Personal data: Use only anonymous, general-purpose, informational text. Never use student information, real-world personal data, or any content subject to privacy regulations such as FERPA, COPPA, HIPAA, or GDPR.
- Passage length: The evaluator was built and optimized for passages of up to 1,200 words. It has not been tested on longer text.
- Text type: It was developed using the CLEAR corpus and Common Core Appendix B, and it will perform best on informational text. It may not be suitable for poetry and other similar text types.
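The passage-length guidance can be checked before a passage is sent. This is a rough sketch that counts whitespace-separated words, which is an assumption about how words are counted. The function name `check_passage_length` is hypothetical.

```python
MAX_WORDS = 1200  # upper bound the evaluator was built and optimized for


def check_passage_length(passage, max_words=MAX_WORDS):
    """Return (word_count, within_limit) using a whitespace-based word count."""
    word_count = len(passage.split())
    if word_count == 0:
        raise ValueError("The passage is empty.")
    return word_count, word_count <= max_words
```

A passage over the limit can still be submitted, but the result falls outside what the evaluator has been tested on.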
Step 2: Make a single API call, providing the text you want to evaluate within a single prompt.
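One way the single call might look, assuming the `google-genai` Python SDK is installed. The order in which the user prompt, formatting instructions, and text are joined is an assumption, as are the helper names and the lowercase model ID. The SDK is imported inside the function so the prompt-building helper can be used without it.

```python
def build_single_prompt(user_prompt, formatting_instructions, text):
    """Join the user prompt, formatting instructions, and passage into one prompt.

    The ordering and the "Text to evaluate:" label are illustrative assumptions.
    """
    parts = [
        user_prompt.strip(),
        formatting_instructions.strip(),
        "Text to evaluate:\n" + text.strip(),
    ]
    return "\n\n".join(parts)


def evaluate_once(api_key, system_prompt, prompt,
                  model="gemini-2.5-pro-preview-06-05", temperature=0.25):
    """Send one request and return the raw response text."""
    # Imported lazily; requires the google-genai package.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config=types.GenerateContentConfig(
            system_instruction=system_prompt,
            temperature=temperature,
        ),
    )
    return response.text
```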
The evaluator is designed to analyze the text’s quantitative, qualitative, and background knowledge dimensions in one step.
Step 3: Review the output.
The evaluator will return the following:
- Target recommended grade band (e.g., Grade 4-5).
- An alternate grade band and required scaffolding to make the text suitable for that level.
- Rationale that includes assessment of qualitative, quantitative, and background knowledge requirements.
Run multiple times. We recommend running the evaluator 3 times per passage and using a voting mechanism to increase consistency.
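The voting mechanism is not specified here. A simple majority vote over the target grade bands from each run is one possible approach, sketched below. The function name and the choice to return `None` on a tie are assumptions.

```python
from collections import Counter


def majority_grade_band(bands):
    """Return the most common grade band, or None if the top count is tied."""
    if not bands:
        raise ValueError("At least one result is required.")
    counts = Counter(band.strip() for band in bands)
    top_count = max(counts.values())
    winners = [band for band, count in counts.items() if count == top_count]
    # A tie (for example, three different bands) is left for manual review.
    return winners[0] if len(winners) == 1 else None
```

For example, `majority_grade_band(["Grade 4-5", "Grade 4-5", "Grade 6-8"])` returns `"Grade 4-5"`, while three different bands return `None`.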