
# About this evaluator

> Reference documentation for the Sentence Structure Evaluator.

export const EvalGreenLgBadge = ({children}) => <div style={{
  marginBottom: "8px"
}}>
    <Badge color="gray" size="xs">
      {children != null && children !== "" ? children : null}
    </Badge>
  </div>;

export const EarlyAccessCallout = ({children}) => <div className="eyebrow-callout not-prose rounded-xl border border-gray-200/80 p-5 dark:border-white/10" style={{
  marginBottom: "1rem",
  borderRadius: "4px"
}}>
    <div className="mb-3">
      <Badge color="green" size="md" icon="flask">
        Early access
      </Badge>
    </div>
    <div className="callout-body text-[15px] leading-relaxed text-gray-700 dark:text-gray-300">{children}</div>
    <style>{`.callout-body a { text-decoration: underline; text-decoration-color: #178251; }`}</style>
  </div>;

[Evaluator last updated February 18, 2026.](#evaluator-release-history)

<EarlyAccessCallout>
  This functionality is actively evolving. Changes may occur as we expand capabilities and improve accuracy and reliability. Email [support@learningcommons.org](mailto:support@learningcommons.org) ↗ with your feedback or issues.
</EarlyAccessCallout>

## At a glance

|                      |                    |
| :------------------- | :----------------- |
| **Input type**       | Informational text |
| **Supported grades** | 3–12               |

The Sentence Structure Evaluator assesses the complexity of sentence structure in informational texts relative to a specified grade level. It:

* Identifies sentence features in the text, including sentence type composition, average words per sentence, subordinate clause ratios, and concepts per sentence.
* Assigns an overall complexity rating using an LLM, combined with statistical thresholds for sentence features.
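
The sentence features listed above can be approximated with standard-library Python. The sketch below is not the evaluator's implementation, which uses textstat and an LLM. The function names, the naive sentence splitter, and the `SUBORDINATORS` word list are assumptions made for illustration only.

```python
import re

# Hypothetical marker list; the evaluator's real clause detection is not documented here.
SUBORDINATORS = {"because", "although", "while", "since", "if", "when", "unless", "whereas"}


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def average_words_per_sentence(text: str) -> float:
    """Mean number of whitespace-separated tokens per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def subordinate_marker_ratio(text: str) -> float:
    """Share of sentences containing at least one marker word (a rough proxy only)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    flagged = 0
    for sentence in sentences:
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if words & SUBORDINATORS:
            flagged += 1
    return flagged / len(sentences)


sample = "Plants need light. They grow slowly because the soil is dry."
print(average_words_per_sentence(sample))  # 5.5
print(subordinate_marker_ratio(sample))    # 0.5
```

A keyword list misses many subordinate clauses and flags some non-clauses, which is one reason a production pipeline would rely on a parser or a model instead.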

## Model and prompt

For instructions on running the evaluator, see [Running an evaluator](/evaluators/using-evaluators/running-evaluators).

This evaluator runs as a two-step pipeline. Each step uses a different approach.

| Step 1: Sentence analysis |                                                                                                                            |
| ------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| Method                    | Python functions using [textstat](https://textstat.org/) ↗ + LLM for feature calculation, then deterministic normalization |

| Step 2: Complexity assignment  |                                                                                                                          |
| :----------------------------- | :----------------------------------------------------------------------------------------------------------------------- |
| Model used                     | GPT-4o                                                                                                                   |
| Temperature                    | 0                                                                                                                        |
| **Prompts**                    | [View prompts](https://github.com/learning-commons-org/evaluators/blob/main/evals/prompts/sent_str_prompts.py) ↗         |
| **Notebook**                   | [View notebook](https://github.com/learning-commons-org/evaluators/tree/main/evals/sentence_structure_evaluator.ipynb) ↗ |

<Note>
  Other configurations will produce different results and may have lower accuracy. Run the evaluator as two separate stages; combining them into a single step reduces accuracy.
</Note>
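
The two-stage shape can be sketched as follows. This is a structural illustration under stated assumptions, not the evaluator's code: `analyze_sentences`, `assign_complexity`, `evaluate`, and `stub_rater` are hypothetical names, the step 1 features are simplified placeholders, and the real step 2 prompt lives in the linked prompts file. The model call is replaced by a stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvaluationResult:
    answer: str     # complexity rating
    reasoning: str  # explanation of the rating


def analyze_sentences(text: str) -> dict[str, float]:
    """Step 1 (placeholder): compute simple sentence features."""
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    words = text.split()
    avg = len(words) / len(sentences) if sentences else 0.0
    return {"sentence_count": float(len(sentences)), "avg_words_per_sentence": avg}


def assign_complexity(
    features: dict[str, float],
    grade: int,
    rate: Callable[[str], tuple[str, str]],
) -> EvaluationResult:
    """Step 2: build a prompt from the step 1 features and ask the rater."""
    # Assumption: what the real prompt contains is not specified in this page.
    prompt = f"Target grade: {grade}\nSentence features: {features}"
    answer, reasoning = rate(prompt)
    return EvaluationResult(answer=answer, reasoning=reasoning)


def evaluate(text: str, grade: int, rate: Callable[[str], tuple[str, str]]) -> EvaluationResult:
    features = analyze_sentences(text)              # stage 1 finishes first
    return assign_complexity(features, grade, rate)  # stage 2 consumes its output


def stub_rater(prompt: str) -> tuple[str, str]:
    """Stand-in for the model call; returns a fixed response."""
    return ("Moderately complex", "Stubbed response; no model was called.")


result = evaluate("Plants need light. They grow slowly.", grade=3, rate=stub_rater)
print(result.answer)
```

Passing the rater in as a function keeps the two stages separate and makes it easy to swap the stub for a real client configured with the model and temperature listed above.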

## Inputs

| Requirement            | Details                                                                        | Required |
| :--------------------- | :----------------------------------------------------------------------------- | :------- |
| **Target grade level** | Sets the grade context for the evaluation                                      | Yes      |
| **Text type**          | Informational text<br />Optimal length: 100–200 words (max \~1,200 characters) | Yes      |
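
Before calling the evaluator, it can help to check inputs against the limits in the table. The helper below is a hypothetical pre-flight check, not part of the evaluator. It treats the word and character limits as warnings, on the assumption that they are guidance rather than hard limits.

```python
def check_inputs(text: str, grade: int) -> list[str]:
    """Return a list of warnings; an empty list means the inputs fit the documented guidance."""
    warnings = []
    if not 3 <= grade <= 12:
        warnings.append(f"grade {grade} is outside the supported range 3-12")
    word_count = len(text.split())
    if not 100 <= word_count <= 200:
        warnings.append(f"{word_count} words; optimal length is 100-200 words")
    if len(text) > 1200:
        warnings.append(f"{len(text)} characters; documented maximum is about 1,200")
    return warnings


passage = "word " * 150  # placeholder text: 150 words, 750 characters
print(check_inputs(passage, grade=5))  # []
```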

## Output

|               | **Description**                                                                   |
| :------------ | :-------------------------------------------------------------------------------- |
| **Answer**    | Sentence structure complexity rating                                              |
| **Reasoning** | Explanation of the rating based on identified sentence features and grade context |

## Interpreting results

This evaluator returns one of the following ratings, along with reasoning you can use to decide on next steps. Complexity ratings are relative to the target grade level you provide.

| Rating                  | Meaning                                                                                                                                                                                                            |
| :---------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Slightly complex**    | Mainly simple, short sentences with very low subordination. In Grades 3–4, no advanced complex sentences are present.                                                                                              |
| **Moderately complex**  | A mix of simple and compound sentences with some complex constructions. In Grades 3–4, sentence length and subordination fall within moderate ranges. In Grades 5–12, no more than two advanced complex sentences. |
| **Very complex**        | Longer, more elaborate sentences with multiple clauses and high subordination. In Grades 5–12, three or more advanced complex sentences are present.                                                               |
| **Exceedingly complex** | Dense, intricate sentences with a high degree of subordination; sentences often contain multiple concepts. In Grades 5–12, 65% or more of sentences are advanced complex sentences.                                |
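
The Grades 5–12 thresholds in the table can be written as a small decision function. This is only a reading of the table, not the evaluator's logic, which also relies on an LLM. The function name is hypothetical. Checking the 65% rule before the count rule is an assumption, and because the table gives no Grades 5–12 threshold separating the two lowest ratings, the sketch groups them.

```python
def threshold_band_grades_5_to_12(advanced_complex: int, total_sentences: int) -> str:
    """Map advanced complex sentence counts to the band suggested by the table."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    if not 0 <= advanced_complex <= total_sentences:
        raise ValueError("advanced_complex must be between 0 and total_sentences")
    # Integer arithmetic avoids floating-point edge cases at exactly 65%.
    if advanced_complex * 100 >= 65 * total_sentences:
        return "Exceedingly complex"
    if advanced_complex >= 3:
        return "Very complex"
    # Two or fewer advanced complex sentences: the table does not split this further.
    return "Moderately complex or lower"


print(threshold_band_grades_5_to_12(2, 10))  # Moderately complex or lower
print(threshold_band_grades_5_to_12(3, 10))  # Very complex
print(threshold_band_grades_5_to_12(7, 10))  # Exceedingly complex
```

With this ordering, a very short text such as two advanced complex sentences out of three lands in the highest band, which shows why the precedence between the rules matters.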

## Accuracy and validation

<Note>
  This evaluator is provided as Early access. Comprehensive accuracy measures are not yet available. Validation testing is ongoing.
</Note>

We assessed performance on an expert-annotated dataset of \~480 texts for Grade 3 and \~480 texts for Grade 4, so accuracy has been most extensively validated on Grades 3–4. Performance for Grades 5–12 is still being evaluated. For more information, see [Accuracy](/evaluators/using-evaluators/accuracy).

| Metric                                                                                                                                                                              | Result                                                                                                                          |
| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------ |
| <Tooltip tip="The percentage of evaluated examples where at least one expert agreed with the evaluator's rating during review testing.">Expert agreement </Tooltip>                 | 53% agreement for Grade 3<br />54% agreement for Grade 4                                                                        |
| <Tooltip tip="The percentage of evaluator ratings that fall within the same complexity level as, or one level away from, the expert annotators' ratings.">Accuracy within one level</Tooltip> | 94%                                                                                                                             |
| <Tooltip tip="How the evaluator's accuracy compares to a simple, unrefined prompt.">Baseline comparison</Tooltip>                                                                   | 26% more accurate                                                                                                               |
| Dataset source                                                                                                                                                                      | [CLEAR Corpus](https://www.commonlit.org/blog/introducing-the-clear-corpus-an-open-dataset-to-advance-research-28ff8cfea84a/) ↗ |
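
The first two metrics can be computed as in the sketch below. The ratings are made-up toy data, not results from the validation set, and the function names are hypothetical. Counting a text as within one level when any expert rating is at most one level away is an assumption about how multiple annotators are combined.

```python
LEVELS = ["Slightly complex", "Moderately complex", "Very complex", "Exceedingly complex"]
RANK = {name: index for index, name in enumerate(LEVELS)}


def expert_agreement(evaluator: list[str], experts: list[list[str]]) -> float:
    """Share of texts where at least one expert gave the evaluator's rating."""
    hits = sum(1 for rating, labels in zip(evaluator, experts) if rating in labels)
    return hits / len(evaluator)


def within_one_level(evaluator: list[str], experts: list[list[str]]) -> float:
    """Share of texts where the evaluator is at most one level from some expert rating."""
    hits = 0
    for rating, labels in zip(evaluator, experts):
        if any(abs(RANK[rating] - RANK[label]) <= 1 for label in labels):
            hits += 1
    return hits / len(evaluator)


# Toy data for illustration only: four texts, one or two expert ratings each.
evaluator_ratings = ["Slightly complex", "Very complex", "Moderately complex", "Exceedingly complex"]
expert_ratings = [
    ["Slightly complex", "Moderately complex"],
    ["Moderately complex"],
    ["Moderately complex", "Very complex"],
    ["Slightly complex"],
]

print(expert_agreement(evaluator_ratings, expert_ratings))  # 0.5
print(within_one_level(evaluator_ratings, expert_ratings))  # 0.75
```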

## Evaluator release history

| Date               | Changed            |
| ------------------ | ------------------ |
| February 18, 2026  | Added Grades 5–12. |
| September 23, 2025 | First release.     |
