KT Learning Lab 1: A Conceptual Overview
The classic approach for measuring tightly defined skills in online learning
First proposed by Richard Atkinson
Most thoroughly articulated and studied by Albert Corbett and John Anderson: Corbett and Anderson (1995)
Been around a long time
Still the most widely used knowledge tracing algorithm deployed at scale today
Interpretable
Predictable
Decent performance
Measuring how well a student knows a specific skill/knowledge component at a specific time
Based on their past history of performance with that skill/KC
Based on a sequence of items that are scored as 0 or 1 (incorrect or correct)
Only the first attempt on each item matters
Help use is usually treated the same as an incorrect response
Each skill has four parameters
From these parameters, and the pattern of successes and failures the student has had on each relevant skill so far
We can compute
Latent knowledge P(Ln)
The probability P(CORR) that the learner will get the item correct
In problem-solving, the student can learn a skill at each opportunity to apply the skill
Each problem (opportunity) has the same chance of learning.
Model Parameters & Predicting Correctness
Two Learning Parameters
P(L0): the probability the student already knows the skill at the start
P(T): the probability the student learns the skill at each opportunity
Two Performance Parameters
P(G): the probability of guessing correctly despite not knowing the skill
P(S): the probability of slipping (answering incorrectly despite knowing the skill)
P(CORR) = P(Ln)(1 - P(S)) + P(~Ln)P(G)
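For example, with the parameter values used in the worked example below, a student with P(Ln) = 0.4 has:

\[ P(CORR) = (0.4)(1-0.3) + (1-0.4)(0.2) = 0.28 + 0.12 = 0.40 \]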
Whenever the student has an opportunity to use a skill
The probability that the student knows the skill is updated
Using formulas derived from Bayes’ Theorem
\[ P(L_{n-1}|Correct_{n}) = \frac{P(L_{n-1})(1-P(S))}{P(L_{n-1})(1-P(S))+(1-P(L_{n-1}))P(G)} \]
\[ P(L_{n-1}|Incorrect_{n}) = \frac{P(L_{n-1})P(S)}{P(L_{n-1})P(S)+(1-P(L_{n-1}))(1-P(G))} \]
\[ P(L_{n}|Action_{n}) = P(L_{n-1}|Action_{n}) + (1 - P(L_{n-1}|Action_{n}))P(T) \]
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
| 0.4 | | |
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | \[ \frac{(0.4)(0.3)}{(0.4)(0.3)+(0.6)(0.8)} \] | |
. | |||
. | |||
. |
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | \[ \frac{(0.12)}{(0.12)+(0.48)} \] | |
. | |||
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | |
Before we saw them get it wrong, we thought they had a 40% chance of knowing it. But after they get it wrong, we estimate only a 20% chance that they know it; they still might have slipped.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.2+(0.8)(0.1) |
The probability that they know it afterward is the probability that they knew it plus the probability they didn’t know it times the probability they learned it.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
And that is 0.28
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
| 0.28 | | |
The probability that they know it after the first action becomes the probability that they knew it before the second action.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | ||
If the student gets the action right.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | \[ \frac{(0.28)(0.7)}{(0.28)(0.7)+(0.72)(0.2)} \] | |
If the student gets the action right. In that case, the probability they knew it before they got it right is the probability they knew it times the probability they hadn't slipped, divided by that same quantity plus the probability that they didn't know it times the probability they guessed.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | \[ \frac{(0.196)}{(0.196)+(0.144)} \] | |
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | 0.58 | |
That turns into .58.
When they got it wrong, they went down from .4 to .2, which came back up to .28; then that .28, after they got it right, was reassessed to .58. These are pretty big changes, and that's because the probabilities of slip and guess are pretty low in this model.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | 0.58 | (0.58) + (0.42)(0.1) |
So then the probability that they knew it afterward is the probability that they knew it beforehand, plus the probability that they didn’t know it times the probability they learned it,
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | 0.58 | 0.62 |
which comes out to .62.
P(L0) = 0.4, P(T) = 0.1, P(S) = 0.3, P(G) = 0.2
Actual | P(Ln-1) | P(Ln-1|actual) | P(Ln) |
---|---|---|---|
0 | 0.4 | 0.2 | 0.28 |
1 | 0.28 | 0.58 | 0.62 |
1 | 0.62 | | |
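To make the table above concrete, here is a minimal Python sketch of the classical BKT update (function and variable names are ours, not from any particular package); it reproduces the numbers in the worked example:

```python
# Minimal sketch of the classical BKT update, reproducing the worked example.
# Parameters match the slides: P(L0)=0.4, P(T)=0.1, P(S)=0.3, P(G)=0.2.

def bkt_update(p_ln, correct, p_t=0.1, p_s=0.3, p_g=0.2):
    """One BKT step: condition P(Ln-1) on the observed response, then add learning."""
    if correct:
        cond = p_ln * (1 - p_s) / (p_ln * (1 - p_s) + (1 - p_ln) * p_g)
    else:
        cond = p_ln * p_s / (p_ln * p_s + (1 - p_ln) * (1 - p_g))
    return cond + (1 - cond) * p_t

p_l = 0.4                      # P(L0)
for actual in [0, 1]:          # the sequence from the table: wrong, then right
    p_l = bkt_update(p_l, actual)
    print(f"actual={actual}  P(Ln)={p_l:.2f}")   # prints 0.28, then 0.62
```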
Typically, the potential values of BKT parameters are constrained
To avoid model degeneracy
Typically, the potential values of BKT parameters are constrained. The guess and slip values we had were fairly low, so the model is fairly dynamic. However, we want to constrain those values somewhat to avoid what's called model degeneracy, which occurs when a model violates the conceptual idea behind knowledge tracing.
Knowing a skill generally leads to correct performance
Correct performance implies that a student knows the relevant skill
Hence, by looking at whether a student’s performance is correct, we can infer whether they know the skill
That conceptual idea is that knowing a skill generally leads to correct performance, and correct performance implies that a student knows the relevant skill. So, the idea is that by looking at whether a student’s performance is correct, we can infer whether they know the skill.
Essentially, a knowledge model is degenerate when it violates this idea: in a degenerate model, knowing a skill leads to worse performance, and getting an item wrong implies that you know the skill. It's weird, right? That's why it's called a degenerate model. Different people have proposed constraints to prevent this.
Beck
R. S. d. Baker, Corbett, and Aleven (2008):
Corbett and Anderson (1995)
Joe Beck has proposed that the probability of guessing plus the probability of slip must be less than 1.
Baker, Corbett, and Aleven have proposed that guess and slip each have to be less than 0.5.
Corbett and Anderson originally proposed, not entirely for model degeneracy but also based on some theorizing about what was plausible, that P(G) has to be less than 0.3 and P(S) has to be less than 0.1.
Baker would say that when either guess or slip gets above 0.5, you’re in a situation where the behavior doesn’t mean what it looks like.
Beck would say that for some cases where the modeling can get difficult, specifically with automated speech responses where you're making inferences, there might be enough error that you get a guess or slip above 0.5. But as long as their sum is under 1.0, it's still okay.
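As a quick illustration, here is a hypothetical helper (the function name is ours) that flags a fitted guess/slip pair against the three proposed constraints just described:

```python
# Sketch: flag a fitted (guess, slip) pair against the three degeneracy
# constraints described above. Function and key names are ours.

def degeneracy_flags(p_g, p_s):
    return {
        "beck":             p_g + p_s >= 1.0,            # Beck: P(G) + P(S) < 1
        "baker_et_al_2008": p_g >= 0.5 or p_s >= 0.5,    # each must be < 0.5
        "corbett_anderson": p_g >= 0.3 or p_s >= 0.1,    # P(G) < 0.3, P(S) < 0.1
    }

print(degeneracy_flags(p_g=0.2, p_s=0.3))
# {'beck': False, 'baker_et_al_2008': False, 'corbett_anderson': True}
```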
How do we know if a knowledge tracing model is any good?
Our primary goal is to predict knowledge
How do we know if a knowledge tracing model is any good beyond whether it is degenerate or not? The primary goal is to predict knowledge,
but knowledge is a latent trait. So, instead, we check our knowledge predictions by checking how well the models predict performance.
In principle, any set of four parameters can be used by knowledge-tracing
But parameters that predict student performance better are preferred
In principle, any set of four parameters can be used by knowledge tracing. However, parameters that predict student performance better are preferred.
So, we pick the knowledge tracing parameters that best predict performance
Defined as whether a student’s action will be correct or wrong at a given time
So we try to pick the knowledge-tracing parameters that best predict performance; that is, whether a student's action will actually be correct or wrong at a given time when knowledge tracing predicts it will be.
Predicting performance on next attempt
Inferring latent knowledge
Whether a model is successful at inferring latent knowledge
Why aren’t those approaches used more often?
Questions?
I could spend an hour talking about the ways to fit Bayesian Knowledge Tracing models.
hmmsclbl
BNT-SM: Bayes Net Toolkit – Student Modeling
BKT-BF: BKT-Brute Force (Grid Search)
https://learninganalytics.upenn.edu/ryanbaker/BKT-BruteForce.zip
Python Grid Search (slower than BKT-BF)
There are three public tools you can use to fit BKT parameters:
HMMSCLBL, by Michael Yudelson.
BNT-SM, the Bayes Net Toolkit Student Modeling, which does expectation maximization
BKT-BF, BKT Brute Force, which does a grid search.
All three of these are open on the web, and you can use them, they’re all fine really and work approximately equally well.
They’re all fine – they work approximately equally well
My group uses BKT-BF to fit Classical BKT and BNT-SM to fit variant models
But some commercial colleagues use hmmsclbl to fit BKT at scale
The one thing you shouldn't do is use the Excel equation solver. That replicably does worse on this problem than these packages. So use an existing package; they're out there.
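For intuition, here is a minimal brute-force (grid search) sketch in the spirit of BKT-BF; all names and the toy data are ours, and the real packages above are far more efficient and robust:

```python
# Sketch of brute-force BKT fitting: try every parameter combination on a grid
# and keep the one with the lowest squared error on first-attempt correctness.
import itertools

def predict_correct(p_ln, p_s, p_g):
    return p_ln * (1 - p_s) + (1 - p_ln) * p_g   # P(CORR) from earlier

def sse(sequences, p_l0, p_t, p_s, p_g):
    """Squared error between predicted P(CORR) and actual 0/1 first attempts."""
    total = 0.0
    for seq in sequences:                  # one 0/1 sequence per student
        p_l = p_l0
        for correct in seq:
            total += (predict_correct(p_l, p_s, p_g) - correct) ** 2
            if correct:
                cond = p_l * (1 - p_s) / (p_l * (1 - p_s) + (1 - p_l) * p_g)
            else:
                cond = p_l * p_s / (p_l * p_s + (1 - p_l) * (1 - p_g))
            p_l = cond + (1 - cond) * p_t
    return total

grid = [i / 100 for i in range(1, 100, 5)]         # coarse grid over (0, 1)
data = [[0, 1, 1, 1], [0, 0, 1, 0, 1, 1]]          # toy first-attempt data
best = min(
    (p for p in itertools.product(grid, repeat=4)
     if p[2] < 0.5 and p[3] < 0.5),                # Baker et al. bounds on S, G
    key=lambda p: sse(data, *p),
)
print("best (P(L0), P(T), P(S), P(G)):", best)
```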
Predict student mastery, if ok to be off by 2-3 problems:
Predict student mastery, if higher precision desired:
Make inferences about model parameter values (for example, to identify skills that need to be fixed)
One common practical question that people ask is:
how much data do you need to fit BKT?
The answer depends on your goal. As Slater and Baker's large-scale simulation study showed, if you intend to predict student mastery, and it's OK for the system to decide the student has reached mastery two or three problems too early or too late, you can get away with as few as 25 students and 3 problems per skill, as long as P(T) values are low. If you do this and see high P(T) values, you might need more data. If you want high precision on exactly when the student reached mastery, you might want to go as high as 250 students, but 3 problems per student is still generally OK. A harder task is making inferences about model parameter values. You might do this if, for example, you want to find skills with really high slip or guess rates in order to fix them. In this case, you need around 250 students and 6 problems per student.
Mastery learning
Reports to teachers on student skill
The two core uses of BKT are, of course:
supporting mastery learning – determining when a student has mastered a skill in order to advance them
and providing reports to teachers (or the student themself) on what skills the student has and hasn’t yet mastered.
Use in behavior detectors (such as gaming the system)
Use to identify problematic skills for re-design (with very high slip or guess or initial knowledge)
Use in discovery with models analyses (such as correlating student in-platform learning to test scores)
But there are several other common uses as well, including using BKT estimates as components in behavior detectors, like detectors of gaming the system; identifying skills with very high slip or guess or initial knowledge, to drive iterative improvement of the learning system; or serving as variables in various kinds of analyses, like correlating students' performance and learning within a learning platform to external measures like their test scores.
Conditionalizing P(T)
There have been a bunch of extensions to BKT. It’s not just a good model, it’s also been the basis for a lot of other interesting work.
Moment-by-moment learning estimation
(calculating P(T) in specific step)
Which moment-by-moment learning curves are associated with more robust learning? (Ryan S. Baker et al. (2013))
What behaviors predict “eureka” moments (Moore, Baker, and Gowda (2015))
Which types of content are associated with more learning? (Slater et al. (2016))
Some examples are shown here; they will be introduced later, in the Advanced BKT part.
Detecting carelessness (contextual slip)
(calculating P(S) in specific step)
Predicts test score (Pardos et al. (2014)), college enrollment (M. O. Pedro et al. (2013)), job several years later (Almeda and Baker (2020))
Some examples are shown here; they will be introduced later, in the Advanced BKT part.
BKT: Extended Uses
Transfer assessment
(adding P(T) from other skills)
Used to study relationship between skills (M. S. Pedro et al. (2014))
Including in graduate students learning research skills across several years (Kang et al. (2022))
Some examples are shown here; they will be introduced later, in the Advanced BKT part.
BKT: Extended Uses
How can you apply these methods to your own research or practice?
DISCUSSION:
Complete the ASSISTments activity: [insert link here]
Complete the badge requirement document: [insert link here]
Thank you! Any questions?
What’s next?
BKT has strong assumptions
One of the key assumptions is that parameters vary by skill, but are constant for all other factors
What happens if we remove this assumption?
Modifying the assumptions of Bayesian knowledge tracing:
Conditionalizing Help or Learning
Contextual Guess and Slip
Moment by Moment Learning
Modeling Transfer Between Skills
If we remove any of those assumptions, we get different kinds of algorithms that can be used for different purposes, like algorithms for conditionalizing help or learning, algorithms for contextual guess and slip, moment-by-moment learning models, and algorithms that can model the transfer between skills.
Beck, J.E., Chang, K-m., Mostow, J., Corbett, A. (2008) Does Help Help? Introducing the Bayesian Evaluation and Assessment Methodology. Proceedings of the International Conference on Intelligent Tutoring Systems.
Now let’s first discuss Beck’s help model.
In this model, help use is not treated as direct evidence of not knowing the skill
Instead, it is used to choose between parameters
Makes two variants of each parameter
One assuming help was requested
One assuming that help was not requested
In this model, help use is not treated as direct evidence of not knowing the skill, unlike classical BKT, but instead, it’s used to choose between parameters. This model makes two variants of each parameter, one of them assuming help was requested and one of them assuming that help was not requested.
This model otherwise looks just like BKT, but for every single parameter there are two variants: one given help (given H) and one given no help (given ~H). This gives us eight parameters per skill: the four classical parameters times two.
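A minimal sketch of how a help-conditioned update step might look; the parameter organization here is ours, for illustration only, and is not Beck et al.'s exact implementation:

```python
# Sketch: help use selects which variant of each parameter applies, instead of
# being scored as an incorrect response. Names and values are ours.

def help_bkt_update(p_ln, correct, used_help, params):
    # params holds one (P(T|.), P(S|.), P(G|.)) triple per help condition
    p_t, p_s, p_g = params["help" if used_help else "no_help"]
    if correct:
        cond = p_ln * (1 - p_s) / (p_ln * (1 - p_s) + (1 - p_ln) * p_g)
    else:
        cond = p_ln * p_s / (p_ln * p_s + (1 - p_ln) * (1 - p_g))
    return cond + (1 - cond) * p_t

params = {"help": (0.15, 0.2, 0.35), "no_help": (0.1, 0.3, 0.2)}  # made-up values
print(help_bkt_update(0.4, correct=0, used_help=True, params=params))
```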
- Takes too long to fit using Grid Search
It fits using expectation maximization because it takes too long to fit using brute force. Brute force works when we've got an N^4 problem, but it doesn't work so well for an N^8 problem.
In his original paper, he found that there were fairly different parameter values for guess and slip based on whether or not the person asked for help.
This model did not lead to better prediction of student performance
But useful for understanding effects of help
It’s worth noting that:
this model didn’t lead to a better prediction of student performance, but it was useful for understanding the effects of help. One thing about BKT that’s noteworthy is that a lot of the modern extensions of BKT actually aren’t about fitting the data better.
A lot of the things that BKT can do for you have to do not with getting slightly better prediction of next problem correctness, but with being able to infer other things.
Conditionalizing Help or Learning
Contextual Guess and Slip
Moment by Moment Learning
Modeling Transfer Between Skills
The second modification of the assumptions of BKT is contextual guess and slip.
Baker, R.S.J.d., Corbett, A.T., Aleven, V. (2008) More Accurate Student Modeling Through Contextual Estimation of Slip and Guess Probabilities in Bayesian Knowledge Tracing. Proceedings of the 9th International Conference on Intelligent Tutoring Systems, 406-415.
In this model, we have the exact same L0, T, G, and S, but the G and S aren’t parameters per skill. They’re models.
Why one parameter for slip
- For each skill
When we can have a different prediction for slip
For each situation
Across all skills
The big idea: why have one parameter for slip for all situations for each skill, when we can have a different prediction for slip for each situation, across all skills?
For example
Perhaps very quick actions are more likely to be slips
Perhaps errors on actions which you’ve gotten right several times in a row are more likely to be slips
In other words, P of S varies according to context.
Guess and slip fit using contextual models across all skills
Parameters per skill: 2 + (P(S) model size)/skills + (P(G) model size)/skills
Guess and slip are therefore fit using contextual models across all skills.
In this case, the number of parameters per skill is 2, plus the P(S) model size divided by the number of skills, plus the P(G) model size divided by the number of skills. This amortizes to a good bit fewer than four parameters per skill.
Take an existing skill model
Label a set of actions with the probability that each action is a guess or slip, using data about the future
How are these models developed?
We take an existing skill model, and label a set of actions with the probability that each action is a guess or a slip using data about the future. We’re going to use the data about the future, but we’re not actually going to use it in the running model.
We use these labels to machine-learn models that can predict the probability that an action is a guess or a slip without using data about the future.
We then use these machine learning models to compute the probability that an action is a guess or a slip in real-time knowledge tracing.
Again, we’re using the future to build the models, but we’re not using the future to actually run the models.
2. Label a set of actions with the probability that each action is a guess or slip, using data about the future
Predict whether action at time N is guess/slip
Using data about actions at time N+1, N+2
This is only for labeling data!
Not for use in the guess/slip models
More specifically, we label a set of actions with the probability that each action is a guess or a slip, using data about the future: we predict whether an action at time N is a guess or a slip using data about the actions at times N+1 and N+2.
2. Label a set of actions with the probability that each action is a guess or slip, using data about the future
The intuition:
If action N is right
And actions N+1, N+2 are also right
If actions N+1, N+2 were wrong
I’ll give an example of this math in a few minutes…
The intuition is that if action N is right and actions N+1 and N+2 are also right, it's probabilistically unlikely that action N was a guess, because the probability that you'd get three things right in a row by guessing is really small. Similarly, if actions N+1 and N+2 were wrong, it becomes more likely that action N was a guess.
3. Use these labels to machine-learn models that can predict the probability that an action is a guess or slip
Features distilled from logs of student interactions with tutor software
Broadly capture behavior indicative of learning
Having gotten these labels, we then use them to machine-learn models that can predict the probability that an action is a guess or a slip. We take features distilled from logs of student interactions with the tutor software that broadly capture behavior indicative of learning.
3. Use these labels to machine-learn models that can predict the probability that an action is a guess or slip
Linear regression
One guess model
One slip model
In the first work, we used linear regression, which, in this early work, did better under cross-validation than fancier algorithms: one guess model and one slip model.
4. Use these machine-learned models to compute the probability that an action is a guess or slip, in knowledge tracing
Within Bayesian Knowledge Tracing
Exact same formulas
Just substitute a contextual prediction about guessing and slipping for the prediction-for-each-skill
Having these probabilities, we just plug the estimates, the numbers, into Bayesian knowledge tracing, the exact same formulas as before. We’re just substituting a contextual prediction about guessing and slipping for the prediction for each skill.
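A sketch of that substitution, with two hypothetical stand-in linear models for guess and slip (real models use many distilled features; these feature names and coefficients are made up):

```python
# Sketch: classical BKT update, but guess and slip come from contextual models
# evaluated on this specific action, not fixed per-skill parameters.

def predict_guess(f):   # hypothetical linear guess model on illustrative features
    g = 0.15 + 0.10 * f["recent_errors"] - 0.05 * f["fast_response"]
    return min(max(g, 0.001), 0.999)   # clamp to a valid probability

def predict_slip(f):    # hypothetical linear slip model
    s = 0.10 + 0.08 * f["fast_response"]
    return min(max(s, 0.001), 0.999)

def contextual_bkt_update(p_ln, correct, features, p_t):
    g, s = predict_guess(features), predict_slip(features)
    if correct:                       # exact same formulas as classical BKT
        cond = p_ln * (1 - s) / (p_ln * (1 - s) + (1 - p_ln) * g)
    else:
        cond = p_ln * s / (p_ln * s + (1 - p_ln) * (1 - g))
    return cond + (1 - cond) * p_t

print(contextual_bkt_update(0.4, correct=1,
                            features={"recent_errors": 1, "fast_response": 1},
                            p_t=0.1))
```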
Conditionalizing Help or Learning
Contextual Guess and Slip
Moment by Moment Learning
Modeling Transfer Between Skills
A third way to modify BKT’s assumptions:
Baker, R.S.J.d., Goldstein, A.B., Heffernan, N.T. (2011) Detecting Learning Moment-by-Moment. International Journal of Artificial Intelligence in Education, 21 (1-2), 5-25.
The moment-by-moment learning model, as published, uses Bayesian knowledge tracing and doesn't try to substitute any of the four parameters. It just adds a variant of P(T): P(J), the probability you Just Learned, which is not quite the same as P(T).
P(T) = chance you will learn if you didn’t know it
P(J) = probability you Just Learned
P(T) is the chance you're going to learn the skill if you didn't know it already. P(J) is the probability you just learned it; in other words, the probability that you didn't know it and then you learned it. Not quite the same thing, although the two are closely related.
When we say P(J) is distinct from P(T), we mean they can have values that are quite different.
Example: consider P(T) = 0.6 in a context where the initial probability of knowing the skill is 0.1, versus 0.96. If you had a 10% chance of knowing it and a 60% chance of learning it if you didn't know it, the probability you just learned it is 0.9 × 0.6 = 54%.
That’s a lot of learning
But if your probability of knowing it to begin with was 96% and you had a 60% chance of learning it, there's only about a 2% chance (0.04 × 0.6 = 0.024) that you just learned it. That's not very much learning.
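In symbols, reading P(J) in this simple example as "didn't know it, then learned it":

\[ P(J) = P({\sim}L)\,P(T): \qquad (1-0.1)(0.6) = 0.54 \quad \text{vs.} \quad (1-0.96)(0.6) = 0.024 \]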
Based on this concept:
P(J) = P(~Ln ^ T | A+1+2 )
*For full list of equations, see
Ryan SJD Baker, Goldstein, and Heffernan (2011)
P(J) is labeled based on this concept, which is very similar to the contextual guess and slip model: the probability a student doesn't know a skill but then learns it by doing the current problem, given their performance on the next two problems.
We can calculate the probability that they didn't know it and then learned it from those next two actions with an application of Bayes' theorem. The probability they didn't know it and learned it, given the next two actions, is the probability of those next two actions given that they didn't know it and learned it, times the prior probability (regardless of those two actions) that they didn't know it and learned it, divided by the probability of those next two actions across all contexts.
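In symbols, that application of Bayes' theorem is:

\[ P({\sim}L_n \wedge T \mid A_{+1+2}) = \frac{P(A_{+1+2} \mid {\sim}L_n \wedge T)\; P({\sim}L_n \wedge T)}{P(A_{+1+2})} \]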
P(A+1+2 = C, C | Ln ) = P(~S)P(~S)
P(A+1+2 = C, ~C | Ln ) = P(~S)P(S)
P(A+1+2 = ~C, C | Ln ) = P(S)P(~S)
P(A+1+2 = ~C, ~C | Ln ) = P(S)P(S)
Example:
What about the probabilities of the next two actions, given that they knew it, times the probability that they knew it? Well, there are only 4 possibilities for those two actions.
Correct, correct.
Correct, not correct.
Not correct, correct.
And not correct, not correct.
And those turn out to be: the probability of correct, correct, given that you knew it, is the probability you didn't slip times the probability you didn't slip.
The probability that you got it wrong and then right, given that you knew it, is the probability you slipped times the probability you didn't slip, and so on.
The case where you didn't know it and didn't learn it is more complicated, because there we have to account for the possibility that you learned it between the first and second of the next two attempts. So the equations get fairly complicated at this point.
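Here is a sketch of the full labeling computation under the assumptions just described (names are ours; for the exact equations, see Baker, Goldstein, and Heffernan (2011)):

```python
# Sketch: label P(J) = P(~Ln ^ T | A+1+2) by Bayes' theorem over three cases,
# allowing learning between the two future attempts in the ~Ln ^ ~T case.

def seq_given_known(a1, a2, p_s):
    # If the skill is known, each action is correct unless the student slips.
    return ((1 - p_s) if a1 else p_s) * ((1 - p_s) if a2 else p_s)

def seq_given_unknown(a1, a2, p_t, p_s, p_g):
    # Skill unknown at action N+1; the student may learn it during that action.
    p1 = p_g if a1 else (1 - p_g)
    p2 = p_t * ((1 - p_s) if a2 else p_s) + (1 - p_t) * (p_g if a2 else (1 - p_g))
    return p1 * p2

def p_just_learned(p_ln, a1, a2, p_t, p_s, p_g):
    """P(~Ln ^ T | A+1+2) for next-action correctness a1, a2 (True/False)."""
    knew = seq_given_known(a1, a2, p_s) * p_ln
    just_learned = seq_given_known(a1, a2, p_s) * (1 - p_ln) * p_t
    never_learned = seq_given_unknown(a1, a2, p_t, p_s, p_g) * (1 - p_ln) * (1 - p_t)
    return just_learned / (knew + just_learned + never_learned)

# Using this lab's running parameter values:
print(p_just_learned(0.4, True, True, p_t=0.1, p_s=0.3, p_g=0.2))  # ~0.12
```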
Distilled from logs of student interactions with tutor software
Broadly capture behavior indicative of learning
For P(J), as with contextual P(S) and P(G), we took features distilled from logs of student interactions with tutor software that broadly capture behavior indicative of learning. These features were selected from the same initial set of features previously used in detectors of gaming the system, off-task behavior, and carelessness, a.k.a. the probability of contextual slip.
All the features in the original work used only first-response data. Later extensions to include subsequent responses increased the model correlation only very slightly, not statistically significantly.
Patterns in P(J) over time can be used to predict whether a student will be prepared for future learning (Hershkovitz et al. (2013), Ryan S. Baker et al. (2013)) and standardized exam scores (Jiang et al. (2015))
P(J) can be used as a proxy for Eureka moments in Cognitive Science research (Moore, Baker, and Gowda (2015))
We then had a model that we could use for a few things.
We're not using this model at any point to try to improve our prediction of student performance in the system. Instead, we're looking at how we can use it in analysis. It turns out that patterns in P(J) over time can be used to predict whether the student is prepared for future learning:
when they encounter the first piece of curriculum material beyond the current system, can they actually learn from it and do well on a test? Patterns of P(J) during use of the system turn out to be predictive of this. It also turns out that P(J) over time can be used to predict standardized exam scores.
In a third, more recent use, P(J) can be used as a proxy for eureka moments in cognitive science research. We can look for moments where students had spectacularly high learning, say higher than 99 percent of all learning episodes, and ask what distinguishes the behavior that precedes them.
Assume at most one moment of learning
Try to infer when that single moment occurred, across entire sequence of student behavior
Some good theoretical arguments for this – more closely matches assumptions of BKT
Has not yet been studied whether this approach has same predictive power as P(~Ln ^ T | A+1+2 ) method
There is an alternate method for calculating P of J:
we assume at most one moment of learning, and then we try to infer when that single moment occurred across the entire sequence of student behavior. This was done both by Van de Sande (2013) and by Pardos & Yudelson (2013). There are some good theoretical reasons why you might want to do this. It actually more closely matches the assumptions of BKT.
Is it better? We don’t know yet.
Although there have been a couple of papers on it, we haven't yet seen anyone study whether this approach has the same predictive power as the Baker-Goldstein-Heffernan method for things like predicting preparation for future learning, standardized exam scores, or eureka moments.
Conditionalizing Help or Learning
Contextual Guess and Slip
Moment by Moment Learning
Modeling Transfer Between Skills
A fourth way of modifying the assumptions of BKT is modeling transfer between skills.
This is unlike the other three, where we're still essentially looking within a skill, even though some of the variants in contextual guess and slip vary by skill.
This one actually says what if we relax the assumption that each skill is independent from every other skill?
Sao Pedro, M., Jiang, Y., Paquette, L., Baker, R.S., Gobert, J. (2014) Identifying Transfer of Inquiry Skills across Physical Science Simulations using Educational Data Mining. Proceedings of the 11th International Conference of the Learning Sciences.
The first paper on this comes from Sao Pedro and colleagues in 2014, and what this model said was as follows:
BKT-PST (Partial Skill Transfer) M. S. Pedro et al. (2014): Each skill’s model can transfer in information from other skills
BKT-PST: One time (when switching skill)
BKT-PSTC Kang et al. (2022): At each time step
Classic BKT says that there will be a separate BKT model for each skill. BKT-PST (partial skill transfer) says instead that each skill's model can transfer in information from other skills. In its original 2014 formulation, this happened exactly once, when switching between skills in a system. But Kang et al. (2022) said: wait a minute, sometimes you switch back and forth between skills. So, in that case, the transfer happens at each time step.
The BKT-PST or BKT-PSTC model is basically the original BKT graph, but with a transfer-in from some other skill, governed by a parameter K. This method of transferring information between skills hasn't been used a ton, but it has been used by Sao Pedro and colleagues to study the relationship between skills in a science simulation.
Used to study relationship between skills in science simulation
( M. S. Pedro et al. (2014))
Used to study which research skills help graduate students learn other research skills, across several years (Kang et al. (2022))
Specifically, if you master skill A, do you start off better in skill B? Kang and her colleagues used it to study which research skills help graduate students learn other research skills across several years. In that case, every year the grad student might be getting better at each of several research skills, and the question is: if they acquire skill A in year N, are they better at skill B in year N+1?
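As a structural illustration only (the actual functional form in Sao Pedro et al. (2014) and Kang et al. (2022) differs), the transfer-in idea might be sketched as adjusting the target skill's knowledge estimate using the source skill's estimate, scaled by the transfer parameter K:

```python
# Heavily hedged sketch of the transfer-in structure: skill B's knowledge
# estimate is nudged by skill A's current estimate, scaled by parameter k.
# This illustrates the graph structure only, not the published equations.

def transfer_in(p_l_b, p_ln_a, k):
    return min(1.0, p_l_b + k * p_ln_a * (1 - p_l_b))

print(transfer_in(p_l_b=0.3, p_ln_a=0.8, k=0.25))  # 0.3 + 0.25*0.8*0.7 = 0.44
```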
Contextualization approaches do not appear to lead to overall improvement on predicting within-tutor performance
But they can be useful for other purposes
Predicting robust learning
Understanding learning better
Understanding relationships between skills
Overall, across all four of these types of examples, contextualization approaches generally don't appear to lead to big overall improvements in predicting within-tutor performance (see Gonzales-Brenes et al., 2014). But they're really good for other purposes, like predicting robust learning, understanding learning better, and understanding the relationships between skills.
There is a lot of utility in modifying the assumptions of BKT in ways that are harder with some of the more contemporary algorithms, which is one of the reasons why researchers still sometimes use BKT.
Thank you!
Comments or Questions?