Overview
This tutorial is going to walk you through using Knowledge Graph to support an advanced and highly desired use case: “For a given standard, I want to know what the pre-req(s) are, to create a differentiated content or product experience.” When it comes to this use case of working with learning progressions data, there are a few key things to know in advance:- Our current learning progressions dataset, coming from Student Achievement Partners, map Common Core State Standards for Mathematics into logical sequences.
- The sequences do not name definitive pre-reqs. In other words, it is not necessarily true that students must master an earlier standard before they will be ready for standards it supports.
- Instead, the relationships shown indicate what might be helpful in a given circumstance.
Key demonstrated capabilities
- Navigating learning progressions relationships
- Unpacking standards into learning components
- Inserting Knowledge Graph data into LLM context for content generation
Prerequisites
- This tutorial assumes you’ve downloaded Knowledge Graph already. If you haven’t please see download instructions.
- You can use either Node or Python to go through this tutorial
- If using Node
node 14+
openai
dotenv
arquero
csv-parse
- If using Python
python 3.9+
openai
pandas
python-dotenv
- OpenAI API key
Step 1: setup
Load dependencies and env variables
First you’ll need to set up your environment variables for an LLM (this tutorial assumes OpenAI) and either the data files you have downloaded or a PostgreSQL database. This tutorial assumes your variable names look like the following:Read and filter data files
Let’s load in the relevant files and specific data now that will be used to explore prerequisite standards, learning components, and ultimately generate practice content with OpenAI. If you completed the previous tutorial, this one will feel a bit more advanced. The main difference is that instead of loading all the standards data, we’ll focus only on a specific subset of standards. Here’s what we’ll do:- Load only the
StandardsFrameworks
andStandardsFrameworkItems
that are part of the Common Core Math Standards. - Pull in the
hasChild
relationships that connect those standards. - Include the
LearningComponents
that support theStandardsFrameworkItems
we just loaded.
Step 2: find prerequisites
Get prerequisite standards for 6.NS.B.4
Now that the data is loaded, let’s get a target standard and filter through the relationshipbuildTowards
to find prerequisite standards.
Get the learning components that support the prerequisite standards
Now, we’re going to filter the relationships table to find the learning components thatsupports
the prerequisite standards that we found in the previous step. Please note, description_2
is auto-generated by the arquero library with _n
being appended when multiple tables contain the same column name.
Packaging the prerequisite functions
Step 3: generate practice
Now that you’ve identified learning components and prerequisite standards, you can use those for downstream applications, such as generating practice problems. But, remember the caveats discussed above and apply your judgement to create appropriate learning experiences.Package the data
Let’s create clean JSON to package the data so it will be easily parsable by the LLM and maintain the relationship structure of the data.Generate practice questions
Finally, we’ll inject that JSON into a prompt so the LLM has full context of what the user is looking to create practice problems for.Step 4: pulling it all together
Now let’s create a final function to run everything.Conclusion
In this tutorial you started with standards data, followed thebuildsTowards
relationships to see what comes before a target standard, joined in the learning components that support
those prerequisites, and then structured that context to generate new practice problems. Everything here was scoped to a single standard for clarity, but the same steps work across grade levels, subject areas, or even larger parts of the dataset. As you keep experimenting, try extending the queries to include lessons, assessments, or instructional routines to create more complete learning experiences.