Debugging and Interpreting Machine Learning Models

Machine learning models are increasingly used in critical real-world applications such as self-driving cars, loan processing, and fake news detection. However, these models are highly complex and have a reputation for being “black boxes” – when a model makes a prediction, it is unclear how the decision was made, what information the model relied on, and how changes in the data would affect its predictions.

To this end, our lab develops algorithms to interpret complex machine learning models (e.g., deep neural networks and random forests) by identifying the training data that affected a prediction, describing what parts of the model are learning, and explaining how user-generated inputs can be improved to better serve the model.

Deep Neural Inspection

Deep neural networks are revolutionizing many domains and are increasingly deployed in production and real-world environments. Yet how do we ensure that learned models behave reliably and as intended? Software engineering principles such as abstraction and modularity help us build and understand reliable systems by principled construction. A neural network, however, is a black box akin to an opaque block of assembly code.

The Deep Neural Inspection (DNI) project aims to develop software primitives that identify whether subsets of a neural network have learned developer-understandable logic. This serves as the basis for introducing software engineering concepts such as abstraction, modularity, and assertions to the development and understanding of neural network models.
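As a rough illustration of the idea (a hypothetical sketch, not the DeepBase API), one DNI-style primitive scores how well a single hidden unit's activations track a developer-specified hypothesis function over the inputs, e.g., "this token is a noun":

```python
# Hypothetical DNI-style analysis: score the affinity between one hidden
# unit's activations and a developer-specified 0/1 hypothesis over inputs.

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def inspect_unit(activations, hypothesis_labels):
    """Affinity score in [0, 1]: |correlation| between a unit's activations
    and a binary hypothesis (e.g., "the input token is a noun")."""
    return abs(pearson(activations, hypothesis_labels))

# Toy example: a unit that fires mostly when the hypothesis holds.
acts = [0.9, 0.1, 0.8, 0.2, 0.95, 0.05]
hyp = [1, 0, 1, 0, 1, 0]
score = inspect_unit(acts, hyp)  # high score: the unit tracks the hypothesis
```

A real inspection system would run such tests over many units, layers, and hypotheses at once; the point here is only the shape of the primitive: (activations, hypothesis) in, affinity score out.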


When a deep neural network makes a misprediction, it can be challenging for a developer to understand why. While many interpretability methods explain predictions in terms of predictive features, it may be more natural to isolate a small set of training examples that have the greatest influence on the prediction. However, it is often the case that every training example contributes to a prediction in some way, but with varying degrees of responsibility.

Partition Aware Local Models (PALM) is a tool that learns and summarizes this responsibility structure to aid machine learning debugging. PALM approximates a complex model (e.g., a deep neural network) using a two-part surrogate model: a meta-model that partitions the training data, and a set of sub-models that approximate the patterns within each partition. The sub-models can be arbitrarily complex to capture intricate local patterns, but the meta-model is constrained to be a decision tree. This way, the user can examine the structure of the meta-model, determine whether its rules match intuition, and efficiently link problematic test examples to the responsible training data. Queries to PALM are nearly 30x faster than nearest neighbor queries for identifying relevant data, which is a key property for interactive applications.
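The two-part surrogate structure can be sketched as follows (a minimal, hypothetical illustration, not the PALM tool: the meta-model is reduced to a single threshold rule and each sub-model to a per-partition mean predictor):

```python
# Minimal sketch of a PALM-style two-part surrogate: a one-rule "meta-model"
# partitions the training data, and a simple sub-model summarizes each
# partition. Explaining a test point returns both the surrogate prediction
# and the training examples in its partition -- the "responsible" data.

def fit_surrogate(xs, ys, split):
    """Meta-model: the rule x < split. Sub-models: per-partition means."""
    left = [y for x, y in zip(xs, ys) if x < split]
    right = [y for x, y in zip(xs, ys) if x >= split]
    return {
        "split": split,
        "left_mean": sum(left) / len(left),
        "right_mean": sum(right) / len(right),
    }

def explain(model, x, xs):
    """Return (prediction, indices of training examples in x's partition)."""
    if x < model["split"]:
        return model["left_mean"], [i for i, v in enumerate(xs) if v < model["split"]]
    return model["right_mean"], [i for i, v in enumerate(xs) if v >= model["split"]]

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0.1, 0.2, 0.1, 0.9, 1.0, 0.95]
model = fit_surrogate(xs, ys, split=5.0)
pred, responsible = explain(model, 10.5, xs)  # links 10.5 to examples 3, 4, 5
```

In PALM proper, the meta-model is a learned decision tree and the sub-models can be arbitrary learners, but the debugging workflow is the same: inspect the partition rules, then follow a bad test example back to the training data in its partition.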


Segment-Predict-Explain is a pattern for generating content-specific feedback for users writing text content such as product reviews, housing listings, and posts. It uses a novel perturbation-based technique to generate prescriptive explanations: given a quality prediction model and the features of the user’s input text, it assigns responsibility to each feature in proportion to how much improving that feature would increase the model’s predicted quality. This can be used to generate feedback that explains why the user’s writing is predicted to be low quality, along with specific suggestions on how to improve it.
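The responsibility assignment can be sketched as follows (a hypothetical illustration: the feature names and the linear quality predictor are assumptions for the example, not the paper's model):

```python
# Sketch of perturbation-based responsibility assignment: perturb each text
# feature toward an improved value, measure the gain in predicted quality,
# and assign responsibility in proportion to that gain.

def quality(features):
    """Stand-in quality predictor: a weighted sum of text features."""
    weights = {"detail": 0.6, "politeness": 0.3, "length": 0.1}
    return sum(weights[k] * v for k, v in features.items())

def responsibilities(features, improved):
    """Gain in predicted quality from improving each feature alone,
    normalized so the responsibilities sum to 1."""
    base = quality(features)
    gains = {}
    for k in features:
        perturbed = dict(features, **{k: improved[k]})
        gains[k] = max(quality(perturbed) - base, 0.0)
    total = sum(gains.values()) or 1.0
    return {k: g / total for k, g in gains.items()}

current = {"detail": 0.2, "politeness": 0.9, "length": 0.5}
improved = {"detail": 0.9, "politeness": 0.9, "length": 0.6}
resp = responsibilities(current, improved)
# "detail" carries most of the responsibility, so the feedback would focus
# on adding detail rather than on politeness or length.
```

The feature with the largest responsibility becomes the basis of the feedback shown to the writer.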


  1. AlphaClean: Automatic Generation of Data Cleaning Pipelines
    Sanjay Krishnan, Eugene Wu
  2. DeepBase: Deep Inspection of Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick, Eugene Wu
    SIGMOD 2019
  3. Deep Neural Inspection Using DeepBase
    Yiru Chen, Yiliang Shi, Boyuan Chen, Thibault Sellam, Carl Vondrick, Eugene Wu
    LearnSys 2018 Workshop at NIPS
  4. CIDR2: Crazier Innovations in Databases JOIN Reinforcement-learning Research
    Eugene Wu
    CIDR 2019 Abstract
  5. Leveraging Quality Prediction Models for Automatic Writing Feedback
    Hamed Nilforoshan, Eugene Wu
    ICWSM 2018
  6. "I Like the Way You Think!" Inspecting the Internal Logic of Recurrent Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
    SysML 2018
  7. PALM: Machine Learning Explanations For Iterative Debugging
    Sanjay Krishnan, Eugene Wu
    HILDA 2017
  8. Segment-Predict-Explain for Automatic Writing Feedback
    Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
    Collective Intelligence 2017
  9. Indexing Cost Sensitive Prediction
    Leilani Battle, Edward Benson, Aditya Parameswaran, Eugene Wu
    Technical Report 2016