Early Release
This evaluator reflects early-stage work. We’re continuously improving its accuracy and reliability.
About the dataset
This dataset provides text complexity annotations of the CLEAR corpus by literacy experts and qualified educators. It is the benchmark data our Evaluators use to assess literacy levels in AI-generated text. We are sharing it as a new resource for the learning science community to help fill the need for more high-quality text complexity datasets and to complement existing work in this area.

The CLEAR (CommonLit Ease of Readability) Corpus was produced by CommonLit in collaboration with Georgia State University and released in December 2021. It comprises nearly 5,000 publicly available excerpts, each mapped against dimensions including Flesch-Kincaid Grade Level and BT Easiness (a Bradley-Terry coefficient based on teacher ratings of the texts).

Learning Commons is expanding the dataset by scoring a subset of rows for text complexity dimensions found in the SCASS Rubric for Informational Text from Student Achievement Partners. Our initial release in September 2025 focuses on Grades 3 and 4 across sentence structure and vocabulary, but we plan to expand to all grades and all dimensions of text complexity assessed through the SCASS rubric. Thanks to Student Achievement Partners and Achievement Network for their contributions in helping us assemble this annotated data.
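Flesch-Kincaid Grade Level is a surface-level readability measure based only on sentence length and word length in syllables. As context for how the corpus was filtered, here is a minimal, illustrative sketch in Python; the vowel-group syllable counter and the grade band in the usage comment are simplifying assumptions and will not exactly reproduce the scores shipped with the CLEAR corpus.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: contiguous vowel groups (approximation only)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / max(1, len(sentences)))
            + 11.8 * (syllables / max(1, len(words)))
            - 15.59)

# Illustrative usage: keep excerpts in an approximate grade 3-4 band
# (the 2.5-4.5 thresholds are placeholders, not our exact cutoffs).
# grade_3_4 = [t for t in excerpts if 2.5 <= flesch_kincaid_grade(t) <= 4.5]
```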
Creating the annotated data

Our process for producing annotated data is as follows:

- Filter the CLEAR corpus to an approximate grade 3-4 range using Flesch-Kincaid Grade Level.
- Partner with literacy experts from SAP (Student Achievement Partners) and ANet (Achievement Network) to score against text complexity dimensions on the SCASS rubric.
- With SAP and ANet, establish a gold set of ~80 examples per grade with representation across the four tiers of text complexity: slightly, moderately, very, and exceedingly complex.
- Use the gold set to test and qualify a cohort of educators with a minimum of two years of experience teaching ELA at the corresponding grade level.
- Produce a minimum of 200 rows (50 per complexity tier), calibrating annotator scores using the Dawid-Skene method; a minimal sketch of this aggregation step follows this list.
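Dawid-Skene aggregates labels from multiple annotators with expectation-maximization: it alternates between estimating a confusion matrix for each annotator and a posterior distribution over the true tier for each text. The sketch below is a minimal illustration of that calibration step, not our production pipeline; the input format and the four-tier class encoding are assumptions.

```python
import numpy as np

TIERS = ["slightly", "moderately", "very", "exceedingly"]  # class indices 0-3

def dawid_skene(labels, n_classes=4, n_iter=50, eps=1e-8):
    """labels: dict mapping (item_id, annotator_id) -> class index."""
    items = sorted({i for i, _ in labels})
    annotators = sorted({a for _, a in labels})
    item_idx = {i: k for k, i in enumerate(items)}
    ann_idx = {a: k for k, a in enumerate(annotators)}

    # counts[i, a, o] = 1 if annotator a labeled item i with class o
    counts = np.zeros((len(items), len(annotators), n_classes))
    for (i, a), c in labels.items():
        counts[item_idx[i], ann_idx[a], c] = 1

    # Initialize the posterior over true classes with per-item label frequencies.
    T = counts.sum(axis=1)
    T /= T.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per annotator.
        priors = T.mean(axis=0)
        confusion = np.einsum("ic,iao->cao", T, counts).transpose(1, 0, 2)
        confusion /= confusion.sum(axis=2, keepdims=True) + eps

        # E-step: recompute the posterior over true classes for each item.
        log_T = np.log(priors + eps) + np.einsum(
            "iao,aco->ic", counts, np.log(confusion + eps))
        log_T -= log_T.max(axis=1, keepdims=True)
        T = np.exp(log_T)
        T /= T.sum(axis=1, keepdims=True)

    return items, T  # posterior probability of each tier for each item
```

Each text's calibrated tier is then the arg max of its posterior row.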
Our annotation process
Columns in our annotated dataset
The following columns can be found in our dataset. Note that these columns refer to the annotated dataset as of September 23, 2025; this list will be updated as additional dimensions of text complexity are incorporated.

- UID: Unique identifier for each row.
- Clear ID: Identifier for texts based on the CLEAR corpus. This is not a unique identifier, as some texts were scored for multiple grades.
- Grade: Grade level for which the text is scored. For example, if Grade = 3 and the Sentence Score is Slightly Complex, then the text is Slightly Complex for a third-grade student (see overall project documentation for specific assumptions).
- Flesch Kincaid: Flesch-Kincaid Grade Level score for the text, as provided in the CLEAR corpus.
- Text: Text from the CLEAR corpus that was annotated.
- Sentence Score: Overall annotator rating for sentence structure complexity. Takes the values slightly complex, moderately complex, very complex, and exceedingly complex. See the technical docs for additional details on these categories.
- Sentence Score Rationale: Annotators’ explanations for their sentence structure score.
- Vocabulary Score: Overall annotator rating for vocabulary complexity. Takes the values slightly complex, moderately complex, very complex, and exceedingly complex. See overall project documentation for additional details on these categories.
- Vocabulary Score Rationale: Annotators’ explanations for their vocabulary score.
- Tier 2 Words: Tier 2 words identified by annotators.
- Tier 3 Words: Tier 3 words identified by annotators.
- Archaic Words: Archaic words identified by annotators.
- Other Complex Words: Additional complex words for students of the grade level, as identified by annotators.
- Background Knowledge Assumption: LLM-generated information on the background knowledge that students of a particular grade are likely to have about a topic. Information was provided to annotators as part of the scoring process. See the overall project documentation for detailed methodology on how this was generated.
Missing data code
- Not Scored: Text was not annotated for this column.
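As a minimal sketch of working with this schema (the file name is a placeholder, and the column headers assume the September 23, 2025 release as listed above), the snippet below loads the dataset, treats Not Scored as missing, and pulls the Grade 3 rows that have a sentence structure annotation.

```python
import pandas as pd

# File name is a placeholder for wherever you store the annotated CSV.
df = pd.read_csv("clear_text_complexity_annotations.csv",
                 na_values=["Not Scored"])

# Rows with a sentence structure annotation for Grade 3.
grade3 = df[(df["Grade"] == 3) & df["Sentence Score"].notna()]

# Distribution across the four complexity tiers.
print(grade3["Sentence Score"].value_counts())
```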
Definitions
Term | Meaning | Example |
---|---|---|
Tier 2 Words | Words that are commonly used in academic settings, are more complex than colloquial (everyday) language, and often have multiple meanings | For Grade 3 text: “There are eight planets in the Solar System. From closest to farthest from the Sun, they are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.” Most planet names (Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune) are tier 2 words. |
Tier 3 Words | Words that are limited to a specific domain or that are so rare that an avid reader would likely not encounter them in a lifetime. | Domain-specific example: enzyme Rare unconventional example: abecedarian |
Archaic Words | Words, or uses of words, that are not commonly used in modern conversational language. | “The jury retired to deliberate on their verdict.” The use of “retire” to mean withdrawing to a private place is an archaic use. |