v0.4.0

What you’ll do

Evaluate a batch of texts from a CSV file using all literacy evaluators. Results are written in both CSV and HTML formats.

What you’ll need

  • Install the SDK globally
      npm install -g @learning-commons/evaluators
    
  • Create a CSV file with the text you want to evaluate
    • Must contain 50 or fewer input rows (unless you use the --bypass-row-limit option)
    • Must have text and grade columns
    • May include additional columns (will be preserved as-is in the output)
example.csv
text,grade
"The cat sat on the mat.",3
"Photosynthesis is the process by which plants convert sunlight into energy.",5
"The mitochondria are the powerhouse of the cell.",8

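Before starting a run, it can help to check the input against the requirements above. The following is a minimal sketch of such a pre-flight check, not part of the SDK: the names parseCsv and checkBatchCsv are hypothetical, and the batch evaluator's own validation may behave differently.

```typescript
// Hypothetical pre-flight check; NOT part of @learning-commons/evaluators.
const ROW_LIMIT = 50; // documented limit on input rows

// Minimal CSV parser: handles quoted fields, escaped quotes ("") and
// line breaks inside quoted fields. Blank lines are dropped.
function parseCsv(input: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    const next = input.charAt(i + 1);
    if (inQuotes) {
      if (ch === '"' && next === '"') { field += '"'; i++; } // escaped quote
      else if (ch === '"') { inQuotes = false; }             // closing quote
      else { field += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field); field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && next === "\n") i++;                 // treat CRLF as one break
      row.push(field); field = "";
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  // Flush the final row when the input does not end with a line break.
  if (field !== "" || row.length > 0) { row.push(field); rows.push(row); }
  return rows.filter(r => r.some(cell => cell.trim() !== ""));
}

// Returns a list of problems; an empty list means the file looks acceptable.
function checkBatchCsv(input: string, bypassRowLimit = false): string[] {
  const problems: string[] = [];
  const [header = [], ...dataRows] = parseCsv(input);
  const names = header.map(h => h.trim());
  for (const required of ["text", "grade"]) {
    if (names.indexOf(required) === -1) {
      problems.push("missing required column: " + required);
    }
  }
  if (!bypassRowLimit && dataRows.length > ROW_LIMIT) {
    problems.push(dataRows.length + " input rows exceeds the limit of " + ROW_LIMIT);
  }
  return problems;
}
```

Passing the contents of example.csv to checkBatchCsv should return an empty list. The second argument mirrors the --bypass-row-limit option and only relaxes the row-count check.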
Running the batch evaluator

Run the batch evaluator using npx from any directory:
npx evaluators-batch
You will be prompted for the following information:
  • CSV file path
  • Google and OpenAI API keys
    • Copy and paste directly in terminal window
    • Alternatively, provide as environment variables (GOOGLE_API_KEY and OPENAI_API_KEY, by default)
  • Output directory
    • Defaults to a folder in the current directory with a human-readable timestamp (e.g. batch-results-2024-02-07_14-30-22/)
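The default folder name follows a sortable timestamp pattern. As an illustration only, this sketch reproduces the pattern from the example above; defaultOutputDir is a hypothetical helper, and using local time rather than UTC is an assumption.

```typescript
// Hypothetical helper that mirrors the documented naming pattern.
// Assumption: the timestamp uses local time; the real tool may use UTC.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

function defaultOutputDir(now: Date): string {
  const date = [String(now.getFullYear()), pad2(now.getMonth() + 1), pad2(now.getDate())].join("-");
  const time = [pad2(now.getHours()), pad2(now.getMinutes()), pad2(now.getSeconds())].join("-");
  return "batch-results-" + date + "_" + time;
}

// Months are zero-based in the Date constructor, so 1 means February.
const exampleDir = defaultOutputDir(new Date(2024, 1, 7, 14, 30, 22));
// exampleDir === "batch-results-2024-02-07_14-30-22"
```

Because every component is zero-padded, folders from successive runs sort chronologically in a directory listing.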

Options

Pass in options to override the batch evaluator’s defaults:
evaluators-batch --concurrency 5 --max-retries 3 --no-telemetry
Option | Default | Description
--concurrency <n> | 3 | Number of evaluations to run in parallel. If you have higher rate limits with your provider and model, you can raise this value for faster execution
--max-retries <n> | 2 | Number of times to retry a failed evaluation
--no-telemetry | Telemetry is enabled | Disable telemetry data collection
--bypass-row-limit (v0.6.0) | true | Evaluates a CSV file with more than 50 rows
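To make the interaction between --concurrency and --max-retries concrete, here is a minimal sketch of a worker pool with per-task retries. It is not the batch evaluator's actual scheduler: runPool and runWithRetries are hypothetical names, and it assumes that "retry" means additional attempts after the first failure, with no delay between attempts.

```typescript
// Sketch only: the real scheduler inside evaluators-batch may differ.
type Task<T> = () => Promise<T>;
type Outcome<T> =
  | { status: "success"; value: T }
  | { status: "error"; error: unknown };

// Run one task. Assumption: maxRetries = 2 allows up to 3 attempts in total.
async function runWithRetries<T>(task: Task<T>, maxRetries: number): Promise<Outcome<T>> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { status: "success", value: await task() };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  return { status: "error", error: lastError };
}

// Run all tasks with at most `concurrency` of them in flight at once.
// Defaults mirror the documented defaults (3 and 2).
async function runPool<T>(
  tasks: Task<T>[],
  concurrency = 3,
  maxRetries = 2
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0; // index of the next unclaimed task, shared by all workers

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      const task = tasks[index];
      if (task === undefined) break;
      results[index] = await runWithRetries(task, maxRetries);
    }
  }

  const workers: Promise<void>[] = [];
  const workerCount = Math.min(concurrency, tasks.length);
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return results; // same order as the input tasks
}
```

In this model, raising the concurrency shortens wall-clock time only while the provider's rate limits allow it. Beyond that point, extra parallelism tends to show up as more failed attempts that the retry budget then has to absorb.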

Results

You’ll see a real-time display of the batch evaluator’s progress:
Processing evaluations...
████████████░░░░░░░░ 60% (30/50)
  ✓ grade-level-appropriateness: 6/10 successful
  ✓ subject-matter-knowledge: 6/10 successful
  ✓ vocabulary: 6/10 successful
  ✓ sentence-structure: 6/10 successful
  ⏳ conventionality: 6/10 successful

⏱  Elapsed: 2m 15s | Estimated remaining: 1m 30s
The batch evaluator will generate 2 files in your output directory:
batch-results-2024-02-07_14-30-22/
├── results.csv
└── results.html
results.csv
  • Spreadsheet-compatible format
  • Original CSV columns preserved
  • New CSV columns for each evaluator
    • {evaluator}_score
    • {evaluator}_reasoning
    • {evaluator}_status
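The column pattern above can be expanded mechanically. This sketch derives the expected header row for a given set of evaluators; outputColumns is a hypothetical helper, the evaluator names are taken from the progress display, and both the column order and the use of the evaluator name verbatim are assumptions.

```typescript
// Hypothetical helper: expands the {evaluator}_* pattern into column names.
// Assumption: original columns come first, then three columns per evaluator.
const RESULT_SUFFIXES = ["score", "reasoning", "status"];

function outputColumns(inputColumns: string[], evaluators: string[]): string[] {
  const added: string[] = [];
  for (const evaluator of evaluators) {
    for (const suffix of RESULT_SUFFIXES) {
      added.push(evaluator + "_" + suffix);
    }
  }
  return inputColumns.concat(added);
}

const exampleHeader = outputColumns(["text", "grade"], ["vocabulary", "sentence-structure"]);
// exampleHeader contains text, grade, vocabulary_score, vocabulary_reasoning,
// vocabulary_status, sentence-structure_score, and so on.
```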
results.html
  • Summary dashboard with grade-level distribution and text complexity charts
  • Scores and reasoning for each evaluator
If an evaluation still fails after all retries, only that row is affected. The batch evaluator skips it, continues with the remaining rows, and reports the failure in the results with an error status.
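When post-processing results.csv, failed evaluations can be found by scanning the {evaluator}_status columns. The sketch below assumes each row has already been parsed into an object keyed by column name, and that a failure is marked with the literal value "error"; the exact status strings are not documented here, so adjust errorStatus to match your output. The helper name failedEvaluators is hypothetical.

```typescript
// One parsed row of results.csv, keyed by column name.
type ResultRow = { [column: string]: string };

const STATUS_SUFFIX = "_status";

// Returns the evaluator names whose status column equals errorStatus.
// Assumption: failures are marked with the literal string "error".
function failedEvaluators(row: ResultRow, errorStatus = "error"): string[] {
  const failed: string[] = [];
  for (const column of Object.keys(row)) {
    const isStatusColumn =
      column.length > STATUS_SUFFIX.length &&
      column.slice(-STATUS_SUFFIX.length) === STATUS_SUFFIX;
    if (isStatusColumn && row[column] === errorStatus) {
      // Strip the suffix to recover the evaluator name.
      failed.push(column.slice(0, -STATUS_SUFFIX.length));
    }
  }
  return failed;
}
```

Rows for which this returns a non-empty list are candidates for a follow-up run that contains only the failed texts.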

Graceful shutdown

If you press Ctrl+C during evaluation:
  • In-flight evaluations finish processing
  • Pending tasks are cancelled
  • Completed results are saved to results-partial.* files to preserve progress
⚠️  Shutdown requested. Saving partial results...
   (Press Ctrl+C again to force quit)

 Saved 15 results to:
  ./batch-results-2024-02-07_14-30-22/
    ├── results-partial.csv
    └── results-partial.html
If you press Ctrl+C twice to force quit immediately, you may lose in-flight results.
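The two-stage Ctrl+C behavior can be modeled as a small state machine. This is an illustrative sketch, not the tool's implementation: ShutdownController is a hypothetical class, and the comment shows one way it could be wired to Node.js SIGINT handling.

```typescript
type InterruptAction = "graceful" | "force";

// Tracks how many times the user has pressed Ctrl+C.
class ShutdownController {
  private interrupts = 0;

  // Call once per interrupt. The first call requests a graceful shutdown,
  // any later call requests an immediate exit.
  onInterrupt(): InterruptAction {
    this.interrupts++;
    return this.interrupts === 1 ? "graceful" : "force";
  }

  // Workers would check this before claiming a pending task, so that
  // in-flight work finishes but nothing new starts after the first Ctrl+C.
  shouldStartNewTask(): boolean {
    return this.interrupts === 0;
  }
}

// Possible wiring in a Node.js program (savePartialResults is hypothetical):
//   const controller = new ShutdownController();
//   process.on("SIGINT", () => {
//     if (controller.onInterrupt() === "force") process.exit(130);
//     else savePartialResults();
//   });
```

Separating the decision logic from the signal handler keeps it easy to test without sending real signals to the process.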