# About this evaluator

> Reference documentation for the Conventionality Evaluator.

export const EarlyAccessCallout = ({children}) => <div className="eyebrow-callout not-prose rounded-xl border border-gray-200/80 p-5 dark:border-white/10" style={{
  marginBottom: "1rem",
  borderRadius: "4px"
}}>
    <div className="mb-3">
      <Badge color="green" size="md" icon="flask">
        Early access
      </Badge>
    </div>
    <div className="callout-body text-[15px] leading-relaxed text-gray-700 dark:text-gray-300">{children}</div>
    <style>{`.callout-body a { text-decoration: underline; text-decoration-color: #178251; }`}</style>
  </div>;

[Evaluator last updated March 20, 2026.](#evaluator-release-history)

<EarlyAccessCallout>
  This functionality is actively evolving. Changes may occur as we expand capabilities and improve accuracy and reliability. Email [support@learningcommons.org](mailto:support@learningcommons.org) ↗ with your feedback or issues.
</EarlyAccessCallout>

## At a glance

|                      |                    |
| :------------------- | :----------------- |
| **Input type**       | Informational text |
| **Supported grades** | 3–12               |

The Conventionality Evaluator assesses **how directly a text communicates its meaning**. It analyzes whether language is literal and explicit or relies on figurative, abstract, or implied meaning that requires interpretation.

## Model and prompt

For instructions on running the evaluator, see [Running an evaluator](/evaluators/using-evaluators/running-evaluators).

|                     |                                                                                                                       |
| :------------------ | :-------------------------------------------------------------------------------------------------------------------- |
| **Model used**      | gemini-3-flash-preview                                                                                                |
| **Temperature**     | 0                                                                                                                     |
| **Prompts**         | [View prompts](https://github.com/learning-commons-org/evaluators/tree/main/evals/prompts/conventionality/) ↗         |
| **Python notebook** | [View notebook](https://github.com/learning-commons-org/evaluators/blob/main/evals/conventionality_evaluator.ipynb) ↗ |

<Note>
  Using a different model, temperature, or prompt will produce different results and may reduce accuracy.
</Note>
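As a minimal sketch of how you might guard against accidental configuration drift, the snippet below compares a run's settings with the values in the table above. The dictionary keys and the `config_deviations` helper are hypothetical names chosen for illustration; they are not part of the evaluator's code.

```python
# Documented settings from the table above. The key names are an assumption.
DOCUMENTED_CONFIG = {"model": "gemini-3-flash-preview", "temperature": 0}


def config_deviations(config: dict) -> dict:
    """Return {setting: (documented, actual)} for each setting that differs."""
    return {
        key: (documented, config.get(key))
        for key, documented in DOCUMENTED_CONFIG.items()
        if config.get(key) != documented
    }


# A run with a higher temperature is flagged; a matching run returns {}.
print(config_deviations({"model": "gemini-3-flash-preview", "temperature": 0.7}))
```

An empty result means the run matches the documented model and temperature; it says nothing about whether the prompts match.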

## Inputs

| Input                  | Details                                                 | Required |
| ---------------------- | ------------------------------------------------------- | -------- |
| **Target grade level** | Enables grade context evaluation                        | Yes      |
| **Text type**          | Informational text<br />Optional length 200–1,000 words | Yes      |
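The input requirements above can be checked before a text is sent for evaluation. The sketch below is one possible pre-check, not the evaluator's own validation: the `check_inputs` name is hypothetical, words are counted by splitting on whitespace (an assumption), and the length range is treated as advisory because the table does not mark it as mandatory.

```python
SUPPORTED_GRADES = range(3, 13)   # grades 3 through 12, inclusive
DOCUMENTED_WORDS = (200, 1000)    # length range from the table above


def check_inputs(text: str, grade: int) -> list[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if grade not in SUPPORTED_GRADES:
        problems.append(f"grade {grade} is outside the supported range 3-12")

    word_count = len(text.split())  # whitespace split is an approximation
    if word_count == 0:
        problems.append("text is empty")
    elif not DOCUMENTED_WORDS[0] <= word_count <= DOCUMENTED_WORDS[1]:
        problems.append(
            f"text has {word_count} words; the documented range is 200-1,000"
        )
    return problems


print(check_inputs("A short passage about volcanoes.", grade=2))
```

Returning a list of messages rather than raising lets a caller decide whether a length warning should block the run.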

## Output

| Field                    | Description                                                                                                |
| ------------------------ | ---------------------------------------------------------------------------------------------------------- |
| Complexity rating        | Conventionality complexity level, from Slightly complex to Exceedingly complex                             |
| Reasoning                | Explanation of the rating based on language features                                                       |
| Conventionality features | Specific language features driving complexity (for example, idioms, metaphors, irony, or implicit meaning) |
| Grade context            | Comparison of conventionality demands with expectations for the provided grade                             |
| Instructional insights   | Suggestions for scaffolding or teaching unconventional language features                                   |
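If you handle results in code, a small typed container can make the five output fields easier to work with. The following is a sketch only: the class name, the snake_case keys, and the assumption that features arrive as a list of strings are all illustrative and may not match the evaluator's actual response format.

```python
from dataclasses import dataclass, field


@dataclass
class ConventionalityResult:
    """Hypothetical container mirroring the output fields in the table above."""
    complexity_rating: str
    reasoning: str
    conventionality_features: list[str] = field(default_factory=list)
    grade_context: str = ""
    instructional_insights: str = ""


def parse_result(raw: dict) -> ConventionalityResult:
    """Build a result from a dict; rating and reasoning are treated as mandatory."""
    return ConventionalityResult(
        complexity_rating=raw["complexity_rating"],
        reasoning=raw["reasoning"],
        conventionality_features=list(raw.get("conventionality_features", [])),
        grade_context=raw.get("grade_context", ""),
        instructional_insights=raw.get("instructional_insights", ""),
    )


result = parse_result({
    "complexity_rating": "Very complex",
    "reasoning": "Frequent metaphors require interpretation.",
    "conventionality_features": ["metaphor", "irony"],
})
print(result.complexity_rating, result.conventionality_features)
```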

## Interpreting results

The evaluator returns one of the following ratings, along with reasoning, to help you interpret the conventionality demands of the text.

| Rating              | Meaning                                                                                      |
| ------------------- | -------------------------------------------------------------------------------------------- |
| Slightly complex    | Language is literal and explicit. Meaning is directly stated.                                |
| Moderately complex  | Mostly literal language with occasional figurative or implicit meaning.                      |
| Very complex        | Frequent figurative language or implied meaning requires interpretation.                     |
| Exceedingly complex | Language relies heavily on abstraction, layered meaning, or sustained figurative expression. |

More complex ratings indicate texts that **require greater interpretive effort** from readers.
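Because the four ratings form an ordered scale, it can be convenient to map them to numeric levels for sorting or comparison. This sketch assumes the rating arrives as one of the exact strings in the table above; the 1 to 4 numbering and the helper names are illustrative choices, not values returned by the evaluator.

```python
# Ratings in increasing order of interpretive demand, as listed in the table.
RATING_ORDER = [
    "Slightly complex",
    "Moderately complex",
    "Very complex",
    "Exceedingly complex",
]


def rating_level(rating: str) -> int:
    """Map a rating string to 1-4, ignoring case and surrounding whitespace."""
    normalized = rating.strip().lower()
    for level, name in enumerate(RATING_ORDER, start=1):
        if name.lower() == normalized:
            return level
    raise ValueError(f"unknown rating: {rating!r}")


def most_demanding(ratings_by_title: dict[str, str]) -> str:
    """Return the title whose rating sits highest on the scale."""
    return max(ratings_by_title, key=lambda title: rating_level(ratings_by_title[title]))


print(most_demanding({"Passage A": "Moderately complex", "Passage B": "Very complex"}))
# prints Passage B
```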

## Accuracy and validation

<Note>
  This evaluator is provided as Early access.\
  Comprehensive accuracy measures are still evolving, and validation testing is ongoing.
</Note>

The evaluator was optimized using **35 annotated passages** and validated through expert review of additional samples.

| **Metric**                                                                                                                                                         | **Result**                                                                                                 |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------- |
| <Tooltip tip="How accurately the evaluator determines conventionality complexity compared to expert annotations.">Complexity score accuracy</Tooltip>              | 83% agreement with expert annotations                                                                      |
| <Tooltip tip="The percentage of evaluated examples where at least one expert agreed with the evaluator's rating during review testing.">Expert agreement</Tooltip> | 90% (9 of 10 examples approved)                                                                            |
| <Tooltip tip="Expert rating of how well the evaluator's reasoning explains the complexity decision, scored on a 1–5 scale.">Reasoning soundness</Tooltip>          | Average 4.4 / 5                                                                                            |
| Dataset source                                                                                                                                                     | [CLEAR Corpus](https://docs.google.com/spreadsheets/d/1sfsZhhP2umXXtmEP_NRErxLuwgN98TyH7LWOq3j07O0/edit) ↗ |
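To make the expert agreement figure concrete, the sketch below computes the share of examples where at least one expert approved the evaluator's rating, following the tooltip definition. The votes are made up for illustration and are not the actual review data; the function name and the choice of two experts per example are assumptions.

```python
def expert_agreement(votes_per_example: list[list[bool]]) -> float:
    """Share of examples where at least one expert approved the rating."""
    if not votes_per_example:
        raise ValueError("need at least one reviewed example")
    approved = sum(1 for votes in votes_per_example if any(votes))
    return approved / len(votes_per_example)


# Illustrative votes only: 10 examples, 2 hypothetical experts each.
example_votes = [[True, True]] * 6 + [[True, False]] * 3 + [[False, False]]
print(f"{expert_agreement(example_votes):.0%}")  # prints 90%
```

With 9 of 10 examples approved by at least one expert, the result is 9 / 10 = 90%, matching the form of the figure reported in the table.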

## Evaluator release history

| Date           | Changed        |
| -------------- | -------------- |
| March 20, 2026 | First release. |
