Early Release
This evaluator reflects early-stage work. We’re continuously improving its accuracy and reliability.
Original SCASS qualitative sentence structure rubric
We started with the SCASS rubric and modified it to be more usable for annotation. Then, after data collection, we performed extensive statistical analysis to identify the key sentence features that impact annotators’ scores. This resulted in a machine-compatible rubric, which we used to build our final evaluator.
Language features
Complexity level | Conventionality | Vocabulary | Sentence structure |
---|---|---|---|
Slightly complex | Explicit, literal, straightforward, easy to understand. | Contemporary, familiar, conversational language. | Mainly simple sentences. |
Moderately complex | Largely explicit and easy to understand, with some occasions for more complex meaning. | Mostly contemporary, familiar, conversational; rarely overly academic. | Primarily simple and compound sentences, with some complex structures. |
Very complex | Fairly complex; contains some abstract, ironic, and/or figurative language. | Fairly complex language that is sometimes unfamiliar, archaic, subject-specific, or overly academic. | Many complex sentences with several subordinate phrases or clauses and transition words. |
Exceedingly complex | Dense and complex; contains considerable abstract, ironic, and/or figurative language. | Complex, generally unfamiliar, archaic, subject-specific, or overly academic language; may be ambiguous or purposefully misleading. | Mainly complex sentences with several subordinate clauses or phrases and transition words; sentences often contain multiple concepts. |
Modified SCASS rubric with assumptions for annotation
With substantial input from experts, we created an adapted rubric with accompanying assumptions for annotators to use. Annotators provided their scores based on this rubric.
Modified rubric
Slightly complex | Moderately complex | Very complex | Exceedingly complex |
---|---|---|---|
Mainly simple sentences; few sentences contain multiple concepts. | Primarily simple and compound sentences, with some complex (or compound-complex) constructions; some sentences contain multiple concepts. | Many complex (or compound-complex) constructions with several subordinate phrases or clauses and transition words; many sentences contain multiple concepts. | Mainly complex (or compound-complex) constructions with several subordinate clauses, phrases, and transition words; most sentences contain multiple concepts. |
Annotation assumptions
Student assumption | Text assumption | What to score vs ignore | Sentence structure complexity is relative |
---|---|---|---|
The student is on grade level and proficient in all core content areas, including reading fluency, comprehension, science, & social studies. The student is moving through a common progression of topics. The student is fluent in English. The student is in the middle of the academic year. | The text is for independent reading/work, without direct instruction. | When scoring Sentence Structure complexity, please ignore all other text features such as vocabulary, background knowledge, topic, etc. A text may be less readable for a student because of vocabulary and background knowledge, but still be composed of mostly simple sentences. In this case, the sentence structure is still Slightly Complex. | Sentence Structure complexity is relative to the grade level of the student. |
Human annotation
We created a full, reliable benchmark dataset based on 500+ text passages from the CLEAR corpus, which were in turn annotated for sentence structure. The dataset consists of informational topics where the Flesch–Kincaid grade level is lower than 9. This allowed us to capture texts that were appropriate for students in grades 3 and 4, while also including a few more difficult texts with “Very Complex” and “Exceedingly Complex” ratings. The dataset is composed of two parts:
- Expert-annotated data
  - ~80 texts (160 rows of data) were annotated by at least two pedagogical experts from SAP and ANET.
  - If the two experts provided different scores, a third expert would also provide a score.
  - In total, we worked with eight pedagogical experts from SAP and ANET, all of whom had prior experience in literacy or curriculum development.
- Educator-annotated data
  - ~400 texts (800 rows of data) were annotated by three educators who had passed a pre-test.
  - Educators were also given some “honeypot” texts to score. These are texts with 2+ expert agreement in scores, and we used these texts to track each educator’s agreement with experts.
  - We used the Dawid-Skene model to calibrate the final score for educators.
  - In total, we worked with 21 educators.
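The honeypot check described above can be illustrated with a minimal sketch. It assumes each educator score is a `(educator_id, text_id, score)` tuple and that the expert-agreed score for each honeypot text is available in a dictionary. The function name, the data layout, and the use of a simple exact-match rate are all assumptions made for illustration, not a description of the actual pipeline.

```python
from collections import defaultdict


def honeypot_agreement(educator_scores, expert_consensus):
    """Return {educator_id: share of honeypot texts scored the same as the experts}.

    educator_scores: iterable of (educator_id, text_id, score) tuples (hypothetical layout).
    expert_consensus: {text_id: score} for honeypot texts only (hypothetical layout).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for educator_id, text_id, score in educator_scores:
        if text_id not in expert_consensus:
            continue  # not a honeypot text, so there is nothing to compare against
        totals[educator_id] += 1
        if score == expert_consensus[text_id]:
            hits[educator_id] += 1
    # Educators who saw no honeypot texts are left out rather than given a rate of 0.
    return {educator: hits[educator] / totals[educator] for educator in totals}


# Made-up example data, for illustration only.
scores = [
    ("edu_a", "t1", "Slightly complex"),
    ("edu_a", "t2", "Very complex"),
    ("edu_b", "t1", "Moderately complex"),
    ("edu_b", "t2", "Very complex"),
    ("edu_b", "t3", "Slightly complex"),  # t3 is not a honeypot text
]
consensus = {"t1": "Slightly complex", "t2": "Very complex"}
print(honeypot_agreement(scores, consensus))  # {'edu_a': 1.0, 'edu_b': 0.5}
```

An exact-match rate is the simplest possible agreement measure. An adjacent-level or chance-corrected measure could be substituted without changing the overall shape of the check.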
Machine-compatible rubric development
After we finalized the benchmark dataset, we conducted extensive data analysis:
- We calculated the F-statistic of 30+ sentence features, which allowed us to identify the most important sentence features for a text’s sentence structure complexity.
- We used tree-based models to identify the thresholds that made a text fall within a particular category (e.g., average words per sentence < 12 for “Slightly Complex”).
Grade Level | Slightly complex | Moderately complex | Very complex | Exceedingly complex |
---|---|---|---|---|
Grade 3 | The text consists of simple, straightforward language and sentence structures. The text is likely slightly complex if it meets at least two of the following criteria: Sentence type: Primarily simple sentences (typically >60% simple sentences). Sentence length: Short sentences (typically <12 average words per sentence). Subordination: Very low use of subordinate clauses (typically <25% of sentences have subordinate clauses). | The text shows a mix of simple and more complex sentences, introducing some variety in structure without being overly demanding. If the text is not slightly complex, then consider if it is moderately complex based on the following ranges: Sentence type: Balanced mix of sentence types (typically 40 to 60% simple sentences). Sentence length: Medium-length sentences (typically 12 to 16 average words per sentence). Subordination: Moderate use of subordinate clauses (typically 25 to 45% of sentences have subordinate clauses). | The text features more elaborate sentences with multiple clauses and ideas, requiring more effort from the reader to parse. Consider if a text is very complex based on the following rates: Sentence type: Most sentences are complex (<40% of sentences are simple sentences). Sentence length: Longer sentences (typically 16 to 19 average words per sentence). Subordination: High use of subordinate clauses (typically >45% of sentences have subordinate clauses). | The text is dense with very long, intricate sentences and a high degree of subordination, making it exceptionally challenging for this grade level. The text is likely exceedingly complex if it meets at least two of the following criteria, including at least one from the structural density group: Structural density: Subordination: >50% of sentences have subordinate clauses. Multiple subordination: >12% of sentences have more than one subordinate clause. Syntactic complexity: >15% of sentences are compound-complex. Length: Sentence length: Very long sentences (typically >19 average words per sentence). High concentration of very long sentences: >15% of sentences have >=30 words. |
Grade 4 | The text consists of simple, straightforward language and sentence structures. The text is likely slightly complex if it meets at least two of the following criteria: Sentence type: Primarily simple sentences (typically >55% simple sentences). Sentence length: Short to medium sentences (typically <13 average words per sentence). Subordination: Very low use of subordinate clauses (typically <30% of sentences have subordinate clauses). | The text shows a mix of simple and more complex sentences, introducing some variety in structure without being overly demanding. If the text is not slightly complex, then consider if it is moderately complex based on the following ranges: Sentence type: Balanced mix of sentence types (typically 40 to 55% simple sentences). Sentence length: Medium-length sentences (typically 13 to 17 average words per sentence). Subordination: Moderate use of subordinate clauses (typically 30 to 50% of sentences have subordinate clauses). | The text features more elaborate sentences with multiple clauses and ideas, requiring more effort from the reader to parse. Consider if a text is very complex based on the following rates: Sentence type: Most sentences are complex (<40% of sentences are simple sentences). Sentence length: Longer sentences (typically 17 to 22 average words per sentence). Subordination: High use of subordinate clauses (typically >50% of sentences have subordinate clauses). Multiple subordination: >8% of sentences have more than one subordinate clause. | The text is dense with very long, intricate sentences and a high degree of subordination, making it exceptionally challenging for this grade level. The text is likely exceedingly complex if it meets at least two of the following criteria, including at least one from the structural density group: Structural density: Subordination: >60% of sentences have subordinate clauses. Multiple subordination: >15% of sentences have more than one subordinate clause. Syntactic complexity: >20% of sentences are compound-complex. Length: Sentence length: Very long sentences (typically >22 average words per sentence). High concentration of very long sentences: >15% of sentences have >=30 words. |
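To show how the Grade 3 row of the table above might be applied mechanically, here is a minimal sketch. The `SentenceStats` fields and the `classify_grade3` function are hypothetical names. The rubric spells out a counting rule only for the "Slightly complex" and "Exceedingly complex" columns, so the order of the checks and the tie-break between "Moderately complex" and "Very complex" are assumptions of this sketch, not part of the rubric.

```python
from dataclasses import dataclass


@dataclass
class SentenceStats:
    """Per-text features (hypothetical names); percentages are on a 0-100 scale."""
    pct_simple: float             # % of sentences that are simple sentences
    avg_words: float              # average words per sentence
    pct_subordinate: float        # % of sentences with at least one subordinate clause
    pct_multi_subordinate: float  # % of sentences with more than one subordinate clause
    pct_compound_complex: float   # % of sentences that are compound-complex
    pct_long: float               # % of sentences with >= 30 words


def classify_grade3(s: SentenceStats) -> str:
    # Slightly complex: at least two of the three criteria (stated in the rubric).
    slightly = [s.pct_simple > 60, s.avg_words < 12, s.pct_subordinate < 25]
    if sum(slightly) >= 2:
        return "Slightly complex"

    # Exceedingly complex: at least two criteria, including at least one from the
    # structural density group (stated in the rubric). Checking it at this point
    # in the sequence is an assumption.
    density = [
        s.pct_subordinate > 50,
        s.pct_multi_subordinate > 12,
        s.pct_compound_complex > 15,
    ]
    length = [s.avg_words > 19, s.pct_long > 15]
    if any(density) and sum(density) + sum(length) >= 2:
        return "Exceedingly complex"

    # Assumption: choose whichever of the two middle levels matches more ranges,
    # falling back to "Moderately complex" on a tie.
    moderately = [
        40 <= s.pct_simple <= 60,
        12 <= s.avg_words <= 16,
        25 <= s.pct_subordinate <= 45,
    ]
    very = [s.pct_simple < 40, 16 < s.avg_words <= 19, s.pct_subordinate > 45]
    return "Very complex" if sum(very) > sum(moderately) else "Moderately complex"


# Made-up feature values, for illustration only.
print(classify_grade3(SentenceStats(70, 10, 20, 2, 3, 0)))  # Slightly complex
```

Because the rubric describes its thresholds as typical values, a rule-based function like this is best read as a first approximation of the rubric rather than a replacement for judgment. A Grade 4 variant would swap in the thresholds from the second row and add the multiple-subordination criterion listed for "Very complex".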