Debugging and Interpreting Machine Learning Models

Machine learning models are increasingly used in critical real-world applications such as self-driving cars, loan processing, and fake news detection. However, these models are highly complex and have a reputation for being “black boxes”: when they make a prediction, it is unclear how the decision was made, what parts of the input the model relied on, or how changes in the data would affect its predictions.

To this end, our lab develops algorithms to interpret complex machine learning models (e.g., deep neural networks, random forests) by identifying the training data that affected a prediction, describing what different parts of the model have learned, and explaining how user-generated inputs can be improved to better serve the model.

Deep Neural Inspection

Deep neural networks are revolutionizing many domains and are increasingly employed in production and real-world environments. How do we ensure that learned models behave reliably and as intended? Software engineering principles such as abstraction and modularity help us build and understand reliable systems through principled construction, yet neural networks remain black boxes akin to a block of assembly code.

The Deep Neural Inspection (DNI) project aims to develop software primitives to identify whether subsets of a neural network have learned developer-understandable logic. This serves as a basis for introducing software engineering concepts such as abstraction, modularity, and assertions into the development and understanding of neural network models.
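
As a concrete illustration, the sketch below (hypothetical data and function names, not DeepBase's actual API) scores how strongly each hidden unit's activations track a developer-supplied "hypothesis" signal, which is the core idea behind testing whether parts of a network encode developer-understandable logic.

import numpy as np

def inspect_units(activations, hypothesis):
    # activations: (n_examples, n_units) hidden activations extracted from the model.
    # hypothesis: (n_examples,) developer-defined signal encoding the logic of interest.
    # Returns one affinity score per hidden unit (absolute Pearson correlation here).
    acts = (activations - activations.mean(axis=0)) / (activations.std(axis=0) + 1e-8)
    hyp = (hypothesis - hypothesis.mean()) / (hypothesis.std() + 1e-8)
    return np.abs(acts.T @ hyp) / len(hyp)

# Toy usage: 1,000 examples, 64 hidden units, a binary hypothesis signal.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 64))
hyp = rng.integers(0, 2, size=1000).astype(float)
acts[:, 7] += 2 * hyp                             # plant one unit that encodes the hypothesis
scores = inspect_units(acts, hyp)
print("most aligned unit:", int(scores.argmax()))  # expected: 7

Units (or groups of units) with high affinity scores are candidates for the developer-understandable logic being tested.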

FACE PALM

When a deep neural network makes a misprediction, it can be challenging for a developer to understand why. While many interpretability methods explain predictions in terms of predictive features, it can be more natural to isolate a small set of training examples that have the greatest influence on the prediction. In practice, however, every training example contributes to a prediction in some way, with varying degrees of responsibility.

Partition Aware Local Models (PALM) is a tool that learns and summarizes this responsibility structure to aid machine learning debugging. PALM approximates a complex model (e.g., a deep neural network) using a two-part surrogate model: a meta-model that partitions the training data, and a set of sub-models that approximate the patterns within each partition. The sub-models can be arbitrarily complex to capture intricate local patterns, but the meta-model is constrained to be a decision tree. This way the user can examine the structure of the meta-model, determine whether its rules match intuition, and efficiently link problematic test examples to the responsible training data. Queries to PALM are nearly 30x faster than nearest neighbor queries for identifying relevant data, which is a key property for interactive applications.
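
The sketch below illustrates the two-part surrogate idea under simplified assumptions (scikit-learn models, NumPy inputs, and hypothetical function names; this is not the PALM implementation): a shallow decision tree is trained to reproduce the black-box model's predictions and thereby partition the training data, a sub-model is fit per partition, and a test example is linked back to the training rows in its partition.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

def fit_palm_surrogate(X_train, blackbox_preds, max_partitions=8):
    # Meta-model: a shallow decision tree trained to reproduce the black-box
    # model's predictions; its leaves partition the training data.
    meta = DecisionTreeClassifier(max_leaf_nodes=max_partitions, random_state=0)
    meta.fit(X_train, blackbox_preds)
    leaf_ids = meta.apply(X_train)
    # Sub-models: one local approximation per partition (logistic regression here).
    sub_models = {}
    for leaf in np.unique(leaf_ids):
        mask = leaf_ids == leaf
        if len(np.unique(blackbox_preds[mask])) > 1:
            sub_models[leaf] = LogisticRegression(max_iter=1000).fit(X_train[mask], blackbox_preds[mask])
        else:
            sub_models[leaf] = None   # degenerate partition: a single constant label
    return meta, sub_models

def responsible_training_rows(meta, X_train, x_test):
    # Link a (possibly mispredicted) test example to the training rows that
    # fall in the same meta-model partition.
    leaf = meta.apply(x_test.reshape(1, -1))[0]
    return np.where(meta.apply(X_train) == leaf)[0]

Because the meta-model is a small tree, the partition containing a problematic test example can be located with a handful of comparisons rather than a scan over the training set, which is what makes this kind of lookup fast enough for interactive debugging.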

Segment-Predict-Explain

Segment-Predict-Explain is a pattern for generating content-specific feedback for users writing text content such as product reviews, housing listings, and posts. It uses a novel perturbation-based technique to generate Prescriptive Explanations. The technique uses a quality prediction model and the features of the user’s input text, and assigns responsibility to each feature in proportion to how much improving it would raise the model’s predicted quality. This can be used to generate feedback that explains why the user’s writing is predicted to be low quality, along with specific suggestions for how to improve it.
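
The sketch below illustrates the perturbation idea under simplified assumptions (a hypothetical feature representation and function names, not the paper's implementation): each feature of the user's text is perturbed toward a "good" target value, and responsibility is assigned in proportion to how much the perturbation raises the model's predicted quality.

import numpy as np

def prescriptive_responsibility(quality_model, x, targets):
    # quality_model: any fitted model whose predict() returns a quality score.
    # x: 1-D feature vector extracted from the user's text (e.g., length, readability, detail).
    # targets: per-feature values considered "good" (e.g., medians among high-quality examples).
    base = quality_model.predict(x.reshape(1, -1))[0]
    gains = np.zeros(len(x))
    for i, target in enumerate(targets):
        x_pert = x.copy()
        x_pert[i] = target                   # perturb one feature at a time
        gains[i] = quality_model.predict(x_pert.reshape(1, -1))[0] - base
    gains = np.clip(gains, 0, None)          # only credit perturbations that improve quality
    total = gains.sum()
    return gains / total if total > 0 else gains

The highest-responsibility features can then be mapped back to the corresponding segments of the user's text and phrased as concrete writing suggestions.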

Publications

  1. Kitana: A Data-as-a-Service Platform
    Zachary Huang, Pranav Subramaniam, Raul Fernandez, Eugene Wu
    In Review 2023
  2. Calibration: A Simple Trick for Fast Interactive Join Analytics
    Zachary Huang, Eugene Wu
    arXiv 2022
  3. How I Stopped Worrying About Training Data Bugs and Started Complaining
    Lampros Flokas, Weiyuan Wu, Jiannan Wang, Nakul Verma, Eugene Wu
    DEEM Workshop 2022
  4. A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More
    Iddo Drori, Sunny Tran, Roman Wang, Newman Cheng, Kevin Liu, Leonard Tang, Elizabeth Ke, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, Gilbert Strang
    PNAS 2022 (in review)
  5. Complaint-Driven Training Data Debugging at Interactive Speeds
    Lampros Flokas, Young Wu, Jiannan Wang, Nakul Verma, Eugene Wu
    SIGMOD 2022
  6. Enabling SQL-based training data debugging for federated learning
    Young Wu, Yejia Liu, Lampros Flokas, Jiannan Wang, Eugene Wu
    VLDB 2022
  7. Explaining SQL-ML Queries with Bayesian Optimization
    Brandon Lockhart, Jiannan Wang, Eugene Wu
    VLDB 2021
  8. From Cleaning Before ML to Cleaning For ML
    Felix Neutatz, Binger Chen, Ziawasch Abedjan, Eugene Wu
    Invited, IEEE Data Engineering Bulletin 2021
  9. Complaint-driven Training Data Debugging for Query 2.0
    Young Wu, Lampros Flokas, Jiannan Wang, Eugene Wu
    SIGMOD 2020
  10. Towards Complaint-driven ML Workflow Debugging
    Lampros Flokas, Young Wu, Jiannan Wang, Eugene Wu
    MLOps 2020
  11. AlphaClean: Automatic Generation of Data Cleaning Pipelines
    Sanjay Krishnan, Eugene Wu
    arXiv 2019
  12. DeepBase: Deep Inspection of Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick, Eugene Wu
    SIGMOD 2019
  13. Deep Neural Inspection Using DeepBase
    Yiru Chen, Yiliang Shi, Boyuan Chen, Thibault Sellam, Carl Vondrick, Eugene Wu
    LearnSys 2018 Workshop at NIPS
  14. CIDR2: Crazier Innovations in Databases JOIN Reinforcement-learning Research
    Eugene Wu
    CIDR 2019 Abstract
  15. Leveraging Quality Prediction Models for Automatic Writing Feedback
    Hamed Nilforoshan, Eugene Wu
    ICWSM 2018
  16. "I Like the Way You Think!" Inspecting the Internal Logic of Recurrent Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
    SysML 2018
  17. PALM: Machine Learning Explanations For Iterative Debugging
    Sanjay Krishnan, Eugene Wu
    HILDA 2017
  18. Segment-Predict-Explain for Automatic Writing Feedback
    Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
    Collective Intelligence 2017
  19. Indexing Cost Sensitive Prediction
    Leilani Battle, Edward Benson, Aditya Parameswaran, Eugene Wu
    Technical Report 2016