Teaching

Fall 2025: Explainable AI (CS 485/698)

This course introduces technical methods for making machine learning models more transparent and understandable. Topics include intrinsically interpretable models, post hoc explanations (e.g., Shapley values, saliency maps), visualization of model internals (e.g., attention maps, neuron activations), surrogate modeling, mechanistic interpretability, and communication bottlenecks. Visualization is a central theme throughout the course, both as a practical tool and as a key research frontier.
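To give a flavor of the post hoc methods covered in weeks 3–5, here is a minimal sketch of permutation feature importance: shuffle one feature at a time and measure how much the model's error grows. The synthetic data, the linear stand-in for a black-box model, and all variable names below are illustrative assumptions, not course materials.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y depends strongly on x0, weakly on x1, not at all on x2.
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit a linear model by least squares (stand-in for any black-box predictor).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda X: X @ w

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

baseline = mse(y, predict(X))

# Permutation importance: break the association between one feature and the
# target by shuffling that column; a larger error increase means the model
# relied more on that feature.
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(mse(y, predict(Xp)) - baseline)

print([round(v, 3) for v in importances])
```

As expected, the shuffled-x0 error increase dwarfs the others, while the irrelevant x2 scores near zero. This model-agnostic recipe works unchanged for any predictor, which is what makes it a standard baseline before moving to Shapley-value methods.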

Course Schedule

Week  Topic
 1    Introduction; linear models
 2    Decision trees, ensembles
 3    Feature importance: partial dependence, permutation, Shapley values
 4    Following information usage (SAGE, information bottleneck)
 5    Surrogate modeling, exemplars, counterfactuals
 6    Gradient-based methods for saliency, adversarial examples
 7    CNNs and feature visualization
 8    CLIP + Concept Bottleneck Models; midterm
 9    Attention: the good and the bad
10    Mechanistic interpretability
11    Jailbreaking, intervening, chain of thought
12    Automated interpretability, evaluation of explanations
13    Zoom out: current trends in research and policy
14    Student presentations
15    Final exam


The syllabus can be found here.