This tutorial shows you how to:
  • Run the evaluator in a Python environment as a function to evaluate one or many texts at once.
For comparison, we’ve also included walkthroughs for:

Requirements

  • Gemini 2.5 Pro is the recommended model. Other models have not been tested and may not be as accurate for determining the grade level.

Using the Gemini API to evaluate one text

You can automate the entire process in a Python script or notebook to analyze texts quickly and consistently. This section will guide you through setting up the environment and writing the code to perform the analysis.

Step 1: Setting up the environment

  1. First, follow the instructions in Getting set up to create a Python virtual environment.
  2. Sign up and receive a Gemini API Key from https://aistudio.google.com/.
  3. Set the key as the environment variable GOOGLE_API_KEY rather than hard-coding it in your code.
  4. Install the necessary Python libraries:
pip install langchain-google-genai pandas
  5. Open a new Jupyter notebook and add an empty code cell.
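If you prefer to set the API key outside the notebook (step 3 above), you can export it in your shell before launching Jupyter. A minimal sketch, with a placeholder key value:

```shell
# Set the Gemini API key for the current shell session (the value is a placeholder)
export GOOGLE_API_KEY="your-api-key-here"
```

Add the same line to your shell profile (for example ~/.bashrc) if you want it to persist across sessions.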

Step 2: Running your evaluation

  1. In the code cell, paste the following code snippet. Replace the placeholder lines in the prompt template (the comment "// Add System and User prompts you used earlier, here:" and the line beginning "You are an expert...") with the system and user prompts from the previous sections of this tutorial.
import getpass
import os
import json
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")

# Initialize the Gemini model
llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro", temperature=0.25)

# Define the detailed prompt template
prompt_template = """
// Add System and User prompts you used earlier, here:
You are an expert in English literature education for K-12...

**Text to Analyze:**
{text}

**Instructions:**
When providing your response, first think out loud of your reasoning and then provide your answer from one of the grade band options above. Your reasoning and answer need to be in JSON format. Strictly follow the following format for your response.

Return your complete analysis as a single JSON object with the following keys: "grade", "reasoning", "synthesis", "alternative_grade", "scaffolding_needed".
"""

# Create the prompt and parser
prompt = ChatPromptTemplate.from_template(template=prompt_template)
parser = JsonOutputParser()
chain = prompt | llm | parser

# Define the text to analyze
text_to_analyze = """
Long ago and far away in the Land of the Rising Sun, there lived together a pair of mandarin ducks. Now, the drake was a magnificent bird with plumage of colors so rich that the emperor himself would have envied it. But his mate, the duck, wore the quiet tones of the wood, blending exactly with the hole in the tree where the two had made their nest. One day while the duck was sitting on her eggs, the drake flew down to a nearby pond to search for food. While he was there, a hunting party entered the woods. The hunters were led by the lord of the district, a proud and cruel man who believed that everything in the district belonged to him to do with as he chose. The lord was always looking for beautiful things to adorn his manor house and garden. And when he saw the drake swimming gracefully on the surface of the pond, he determined to capture him. The lord’s chief steward, a man named Shozo, tried to discourage his master. “The drake is a wild spirit, my lord,” he said. “Surely he will die in captivity.” But the lord pretended not to hear Shozo. Secretly he despised Shozo, because although Shozo had once been his mightiest samurai, the warrior had lost an eye in battle and was no longer handsome to look upon. The lord ordered his servants to clear a narrow way through the undergrowth and place acorns along the path. When the drake came out of the water he saw the acorns. How pleased he was! He forgot to be cautious, thinking only of what a feast they would be to take home to his mate. Just as he was bending to pick up an acorn in his scarlet beak, a net fell over him, and the frightened bird was carried back to the lord’s manor and placed in a small bamboo cage.
"""

# Invoke the chain with the text
response = chain.invoke({"text": text_to_analyze})

# Print the formatted JSON response
print(json.dumps(response, indent=2))
  2. Now, run the cell by pressing Ctrl-Enter. It may take a while for the results to appear. They will look like this:
{
  "alternative_grade": "2-3",
  "grade": "4-5",
  "reasoning": "1. **Quantitative Analysis**: The text has a word count of 288 words, ...",
  "scaffolding_needed": "This text is suitable for a 2-3 grade read-aloud with significant scaffolding. ...",
  "synthesis": "The quantitative measures are contradictory. The low word count points to lower elementary, while the high Flesch-Kincaid score points to middle school. The qualitative analysis ..."
}
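Because the model occasionally omits fields from its JSON response, it can be worth checking the parsed dictionary before relying on it. A minimal sketch; the response dict below is hard-coded for illustration (in the notebook it would come from chain.invoke):

```python
# The keys required by the prompt template above
REQUIRED_KEYS = {"grade", "reasoning", "synthesis", "alternative_grade", "scaffolding_needed"}

def validate_response(response: dict) -> list:
    """Return a sorted list of required keys missing from the model's response."""
    return sorted(REQUIRED_KEYS - response.keys())

# Hard-coded example response; normally this is the output of chain.invoke
response = {"grade": "4-5", "reasoning": "...", "alternative_grade": "2-3"}
missing = validate_response(response)
if missing:
    print(f"Warning: response is missing keys: {missing}")
```

If keys are missing, re-running the cell usually produces a complete response.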

Step 3: Evaluating multiple texts at once

If you have several texts to analyze, you can save them as rows of a CSV file, then run the evaluator for each row automatically.
  1. Combine your texts into a CSV file, one text per row, escaping quotes where necessary.
  2. Add the column label text in the first row. You can use the following script to combine all the text files in the current directory into a CSV file:
import os
import csv
import glob

def combine_text_files_to_csv(output_csv_path):
    # Find all .txt files in the current directory
    txt_files = glob.glob("*.txt")
    txt_files.sort()
    with open(output_csv_path, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
        writer.writerow(['filename', 'text'])
        for txt_file in txt_files:
            try:
                with open(txt_file, 'r', encoding='utf-8') as file:
                    content = file.read()
                    filename = os.path.basename(txt_file)
                    # csv.writer automatically handles escaping of quotes, newlines, etc.
                    writer.writerow([filename, content])
            except Exception as e:
                print(f"Error reading {txt_file}: {e}")

if __name__ == "__main__":
    output_file = "texts.csv"
    combine_text_files_to_csv(output_file)
    print(f"Combined text files into '{output_file}'")
  3. Save the script above as a file named texts_to_csv.py.
  4. Run the script with python3 texts_to_csv.py in the directory containing the texts you want to evaluate, saved as plain text files (*.txt).
  5. Place the resulting CSV file texts.csv in the same directory as your notebook from Step 2 above.
  6. Add another empty code cell to the bottom of the notebook, then paste the following code block:
import pandas as pd

df = pd.read_csv('texts.csv')
results = []

print(f"Starting analysis, it may take up to {len(df)} minutes...")
for index, row in df.iterrows():
    try:
        text_to_analyze = row['text']
        # Invoke the chain for the text in the current row
        response = chain.invoke({"text": text_to_analyze})
        results.append(response)
        print(f"Successfully analyzed row {index + 1}")
    except Exception as e:
        print(f"An error occurred while processing row {index + 1}: {e}")
        # Append an error message or empty dict to maintain row alignment
        results.append({"error": str(e)})

# Convert the list of result dictionaries into a DataFrame
results_df = pd.DataFrame(results)

# Concatenate the original DataFrame with the new results DataFrame
final_df = pd.concat([df, results_df], axis=1)

# Save the combined DataFrame to a new CSV file
final_df.to_csv('texts_with_analysis.csv', index=False)

print("\nAnalysis complete. Results saved to 'texts_with_analysis.csv'")
  7. Run your combined text evaluator by pressing Ctrl-Enter. Evaluating a moderately long text can take up to a minute, so the total waiting time, in minutes, will be roughly the number of texts in the CSV file.
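Once texts_with_analysis.csv is written, you can summarize the results with pandas, for example to count how many texts fall into each grade band. A quick sketch; the inline sample data below stands in for the saved file, and the grade column name assumes the JSON keys from the prompt above:

```python
import io
import pandas as pd

# In the notebook you would read the saved file instead:
# final_df = pd.read_csv('texts_with_analysis.csv')
sample_csv = io.StringIO(
    "filename,text,grade\n"
    "a.txt,Some text,4-5\n"
    "b.txt,Other text,4-5\n"
    "c.txt,Third text,2-3\n"
)
final_df = pd.read_csv(sample_csv)

# Count how many texts fall into each grade band
grade_counts = final_df["grade"].value_counts()
print(grade_counts)
```

The same pattern works for any of the result columns, such as alternative_grade.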

Conclusion

By combining these code blocks, you can create a Python notebook that takes any number of texts and returns a detailed, structured, expert-level analysis of their grade-level appropriateness, fully automating the evaluation. The code examples above were simplified for the purposes of this tutorial; for better-structured evaluator code, see the Python code section of the documentation for each evaluator. Gemini can make mistakes, so always double-check the results.