Early Release
This evaluator reflects early-stage work. We’re continuously improving its accuracy and reliability.
- The evaluator is calibrated for a specific configuration of model settings and prompt format. Its reliability has not been established outside that configuration, and alternate model settings or prompt formats may produce inconsistent results.
- It may not perform reliably on texts aimed below or above the intended grade range (grades 3-4).
- Validation focused on informational passages of roughly 100-200 words, so performance estimates are most precise within this range. Performance has not been formally validated on shorter (under 100 words) or longer (over 200 words) texts.
- This evaluator addresses only the sentence structure dimension of text complexity.
- Some variability is inherent in LLM outputs. Occasional inconsistencies in labeling or scoring may occur across runs.
- The evaluator is intended for exploratory use only. It is not validated for formal instructional placement, assessment, or other high-stakes educational decisions.
- Results should be interpreted with human judgment, especially when informing curriculum development, educational interventions, or product design.
- The evaluator is intended for deidentified, general-purpose text inputs only. Users should not submit student information, real-world personal data, or any content subject to privacy regulations such as FERPA, COPPA, HIPAA, or GDPR.
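
Two of the limitations above (the validated 100-200 word range and run-to-run variability) lend themselves to simple checks on the caller's side. The following is a minimal sketch, not part of the evaluator itself: the helper names, the whitespace-based word count, and the example labels are all assumptions made for illustration.

```python
from collections import Counter

# Word range taken from the limitations list above. Everything else here
# (function names, whitespace tokenization) is a hypothetical illustration.
VALIDATED_MIN_WORDS = 100
VALIDATED_MAX_WORDS = 200


def word_count(text: str) -> int:
    """Count whitespace-separated tokens as a rough approximation of words."""
    return len(text.split())


def in_validated_range(text: str) -> bool:
    """Return True if the passage length falls inside the validated range."""
    return VALIDATED_MIN_WORDS <= word_count(text) <= VALIDATED_MAX_WORDS


def majority_label(labels: list[str]) -> tuple[str, float]:
    """Summarize labels from repeated runs on the same passage.

    Returns the most common label and the fraction of runs that agreed
    with it. On a tie, the label that appeared first wins.
    """
    if not labels:
        raise ValueError("at least one label is required")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)
```

A caller might flag any passage for which `in_validated_range` returns `False`, and treat a low agreement fraction from `majority_label` as a signal that the result needs closer human review. The agreement threshold is left to the caller, since none is specified here.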