Explainable AI (xAI) and Interpretable AI in Education

Artificial Intelligence (AI) is reshaping various fields, including education, but its complexity often leads to questions about trust, fairness, and transparency. In response, two crucial concepts have emerged: Explainable AI (xAI) and Interpretable AI. These terms may sound similar, but they offer distinct affordances and are critical in educational contexts where understanding predictions can drive better learning outcomes.

What is Explainable AI (xAI)?

Explainable AI refers to models where the specific reasons behind a prediction can be explained to a human. With xAI, you can use any algorithm—no matter how complex—as long as you can explain why the model made a specific decision afterward.

Example in Education:
Imagine an AI system that predicts whether a student will pass a standardized exam. While the algorithm might involve complex techniques like deep learning, xAI allows educators to understand why the model predicted that a particular student is at risk. Perhaps it was due to lower homework completion rates or poor performance in specific topics. The model doesn’t need to be simple, but the reasoning behind its prediction must be made understandable.
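
As a rough illustration, the sketch below trains a relatively opaque gradient-boosted model on a hypothetical gradebook and then uses the SHAP package to attribute one student’s predicted outcome to individual predictors. The column names, data, and choice of SHAP here are illustrative assumptions rather than a prescribed recipe.

```python
# A minimal sketch of post-hoc (xAI-style) explanation, assuming the
# `shap` and `scikit-learn` packages; columns and data are hypothetical.
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical gradebook: predictors and a pass/fail label.
X = pd.DataFrame({
    "homework_completion": [0.95, 0.40, 0.80, 0.30, 0.65, 0.20],
    "quiz_average":        [0.88, 0.55, 0.72, 0.48, 0.60, 0.35],
    "attendance_rate":     [0.97, 0.60, 0.90, 0.52, 0.75, 0.45],
})
y = [1, 0, 1, 0, 1, 0]  # 1 = passed the exam, 0 = did not

# A relatively opaque ensemble model...
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# ...explained after the fact: SHAP attributes one student's
# prediction to each predictor.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[[1]])  # the second student
print(dict(zip(X.columns, shap_values[0])))
```

The model itself stays a black box; the explanation is generated afterward, which is exactly the xAI pattern described above.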

What is Interpretable AI?

Interpretable AI, on the other hand, focuses on models where the entire process leading to the prediction can be understood by humans. Interpretable models are transparent from the start, meaning you can see how inputs are transformed into outputs.

Example in Education:
Consider a simpler linear regression model that predicts student performance based on variables like study time, attendance, and participation. With Interpretable AI, educators not only get predictions but also understand the entire mechanism of how these variables contribute to the outcome. This understanding helps in identifying instructional improvements and adjusting teaching strategies.
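
A minimal sketch of such a model, assuming scikit-learn and hypothetical data, might look like the following; because the model is just a weighted sum, educators can read the mechanism directly off the coefficients.

```python
# A minimal sketch of an inherently interpretable model, assuming
# scikit-learn; the variables and data are hypothetical.
import pandas as pd
from sklearn.linear_model import LinearRegression

X = pd.DataFrame({
    "study_hours_per_week": [2, 5, 8, 3, 10, 6],
    "attendance_rate":      [0.70, 0.85, 0.95, 0.75, 0.98, 0.90],
    "participation_score":  [1, 3, 4, 2, 5, 4],
})
y = [58, 71, 88, 63, 93, 80]  # final exam scores

model = LinearRegression().fit(X, y)

# The whole mechanism is visible: one weight per variable plus an
# intercept, so each variable's contribution can be read off directly.
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:+.2f} points per unit")
print(f"intercept: {model.intercept_:.2f}")
```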

Key Differences Between Explainable and Interpretable AI

Both xAI and Interpretable AI play vital roles but serve different needs. With xAI, any algorithm can be used (including black-box models like deep neural networks) as long as you can explain the final prediction. In contrast, Interpretable AI often relies on models that are inherently transparent but may limit the complexity of the algorithms used.

Some algorithms, such as deep learning models, can be difficult to interpret even for experts, while simpler models like decision trees are naturally more interpretable.

What Makes a Model Interpretable?

For a model to be considered interpretable, humans need to know how it works, and they need to understand why the model is better than simpler alternatives (Liu & Koedinger, 2017). A few factors facilitate interpretability:

  • Interpretable and meaningful predictors: Predictors should be variables that can be understood easily and grounded in theory.
  • Well-defined output: The predicted outcome must be clear, allowing for actionable insights.
  • Advances in understanding: Interpretable models should help advance our understanding of how learners acquire knowledge or provide insights that can improve instruction.

Interpretable models tend to be parsimonious—they balance complexity and simplicity, using just enough variables to make accurate predictions without being overly complicated.

When Models Become Uninterpretable

Some models are so complex that they become uninterpretable, even though they may predict better. These uninterpretable models do not help us understand how learners learn or provide meaningful insights for improving instruction.

For example, deep learning models with thousands of parameters can make highly accurate predictions but offer little transparency into how the variables work together. Even if the individual features are meaningful, the combinatorial complexity of how they interact can make the model difficult for humans to understand.

What Makes a Predictor Interpretable and Meaningful?

One way to judge whether a predictor is interpretable is by determining if we can tell a plausible narrative about how that predictor influences the outcome. If you can tell a coherent story—often referred to as a “just-so story”—about the relationship between a predictor and the predicted variable, the predictor is relatively interpretable.

Methods of Explainable AI (xAI)

There are numerous methods used in xAI to explain model predictions. Some of the most common include:

  1. Contribution of a Predictor (see the permutation-importance sketch after this list):
    • Q: How much worse does the model perform without a specific predictor?
    • Q: What proportion of models in an ensemble use a specific predictor? (This is used in Random Forests.)
    • SHAP values: this method calculates the average contribution of a predictor across all possible combinations of the other predictors.
  2. Sensitivity Analysis (see the LIME sketch after this list):
    • Change each predictor slightly and check which changes most affect the prediction – LIME (Local Interpretable Model-agnostic Explanations)
    • Identify which predictor values cannot change without changing the prediction (i.e., which specific predictor values must stay the same as in the original data) – CEM (Contrastive Explanation Method)
    • Find the smallest changes to the predictors that would alter the prediction – DiCE (Diverse Counterfactual Explanations)
  3. Layer-wise Relevance Propagation (LRP):
    • This method is specific to neural networks: it runs the network backward from the output to determine which inputs were most relevant to the final prediction.
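
For the contribution-of-a-predictor questions in item 1, one common approximation is permutation importance: shuffle one predictor at a time and measure how much performance drops. A minimal sketch, assuming scikit-learn and hypothetical data:

```python
# A minimal sketch of the "how much worse without this predictor?"
# question, approximated with permutation importance (scikit-learn;
# the data are hypothetical).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X = pd.DataFrame({
    "homework_completion": [0.95, 0.40, 0.80, 0.30, 0.65, 0.20],
    "quiz_average":        [0.88, 0.55, 0.72, 0.48, 0.60, 0.35],
    "attendance_rate":     [0.97, 0.60, 0.90, 0.52, 0.75, 0.45],
})
y = [1, 0, 1, 0, 1, 0]  # 1 = passed, 0 = did not

model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each predictor in turn and measure how much accuracy drops;
# larger drops mean the model relies more on that predictor.
result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
for name, drop in zip(X.columns, result.importances_mean):
    print(f"{name}: mean accuracy drop {drop:.3f}")
```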
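
For the sensitivity-analysis family in item 2, LIME perturbs a single student’s predictor values and fits a small local linear model around them. A minimal sketch, assuming the lime package and the same kind of hypothetical data:

```python
# A minimal sketch of LIME's perturbation-based explanation for one
# student (assumes the `lime` and `scikit-learn` packages; the data
# are hypothetical).
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

feature_names = ["homework_completion", "quiz_average", "attendance_rate"]
X = np.array([
    [0.95, 0.88, 0.97],
    [0.40, 0.55, 0.60],
    [0.80, 0.72, 0.90],
    [0.30, 0.48, 0.52],
    [0.65, 0.60, 0.75],
    [0.20, 0.35, 0.45],
])
y = [1, 0, 1, 0, 1, 0]

model = RandomForestClassifier(random_state=0).fit(X, y)

# LIME perturbs the student's values and fits a small local linear
# model to see which changes move the prediction the most.
explainer = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["fail", "pass"],
    mode="classification"
)
explanation = explainer.explain_instance(X[1], model.predict_proba,
                                         num_features=3)
print(explanation.as_list())
```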

Variability in xAI Methods

It’s important to note that different xAI methods don’t always produce the same answers (Swamy et al., 2022). In their study, for example, LIME produced the most outlying explanations of the methods compared. This highlights the importance of choosing the right explainability method, a choice that should be driven largely by the insights you want to extract.

Interaction Effects in xAI

One limitation of traditional xAI methods is that they don’t always account for interaction effects—how different predictors combine to influence an outcome. However, extensions like TreeSHAP and Non-tree SHAP have been developed to handle interaction effects.
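
As one illustration, the shap package’s TreeExplainer exposes pairwise interaction values for tree ensembles. The sketch below (hypothetical data, scikit-learn model) shows how part of one student’s prediction can be attributed to two predictors acting together; treat it as an assumed setup rather than the method’s canonical usage.

```python
# A minimal sketch of pairwise interaction effects via TreeSHAP,
# assuming the `shap` and `scikit-learn` packages; data hypothetical.
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

X = pd.DataFrame({
    "homework_completion": [0.95, 0.40, 0.80, 0.30, 0.65, 0.20],
    "quiz_average":        [0.88, 0.55, 0.72, 0.48, 0.60, 0.35],
    "attendance_rate":     [0.97, 0.60, 0.90, 0.52, 0.75, 0.45],
})
y = [1, 0, 1, 0, 1, 0]

model = GradientBoostingClassifier(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)

# Each student gets a features-by-features matrix; off-diagonal entries
# attribute part of the prediction to pairs of predictors acting together.
interactions = explainer.shap_interaction_values(X)
one_student = interactions[1]
print("homework x attendance interaction:",
      one_student[0, 2] + one_student[2, 0])
```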

Note: If you supply interaction terms to LIME as potential features, it will incorporate them into the local linear model it builds. You can therefore hack interaction effects out of LIME.
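
A minimal sketch of that trick, assuming the lime and scikit-learn packages and hypothetical column names, is shown below: the product of two predictors is added as an explicit feature before training and explaining.

```python
# A minimal sketch of the LIME "hack": add an explicit interaction
# column before training and explaining (assumes the `lime` and
# `scikit-learn` packages; data and names are hypothetical).
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

base = np.array([
    [0.95, 0.97],
    [0.40, 0.60],
    [0.80, 0.90],
    [0.30, 0.52],
    [0.65, 0.75],
    [0.20, 0.45],
])  # columns: homework_completion, attendance_rate
y = [1, 0, 1, 0, 1, 0]

# Append the product term as an explicit third feature.
X_aug = np.column_stack([base, base[:, 0] * base[:, 1]])
feature_names = ["homework_completion", "attendance_rate",
                 "homework_x_attendance"]

model = RandomForestClassifier(random_state=0).fit(X_aug, y)
explainer = LimeTabularExplainer(X_aug, feature_names=feature_names,
                                 class_names=["fail", "pass"],
                                 mode="classification")

# The interaction term now competes with the original predictors in
# the local linear model LIME fits.
explanation = explainer.explain_instance(X_aug[1], model.predict_proba,
                                         num_features=3)
print(explanation.as_list())
```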

Finally

Both Explainable AI and Interpretable AI are essential for educational applications, offering different benefits depending on the use case. When implementing AI in educational settings, it’s critical to strike a balance between predictive power and transparency, ensuring that models not only perform well but also offer actionable insights to improve learning outcomes.

Reference:

Baker, R. S. (2024). Big Data and Education (8th ed.). Philadelphia, PA: University of Pennsylvania.