Columbia University


The future of industry relies on the ability to make data-driven decisions; however, that ability is currently accessible only to technical and statistical experts who can program, clean and combine data, visualize large datasets, and debug complex analysis pipelines.

The goal of the WuLab is to dramatically accelerate the democratization of data, and to train high-quality, world-class researchers.


Our research focuses on three areas that are critical bottlenecks to the future of data analysis: data cleaning, creating interactive data exploration and visualization interfaces, and understanding analysis results. The following describes several of our main projects in these areas.

Data Cleaning
Data analysis and machine learning increasingly depend on the quality of the input data: spurious errors and systematic corruptions can produce misleading or incorrect results. We work on automated data cleaning algorithms tailored to data science applications, as well as crowdsourcing systems for collecting high-quality new data.

Explanation & Interpretation
Data analysis is never one-shot; it is an iterative process in which analysis results spur new analyses or ways to debug the analysis. We work on data explanation systems that let analysts highlight anomalies in analysis results and that suggest potential reasons to investigate, as well as machine learning explanation techniques that explain what machine learning models (e.g., deep neural networks) learn and how they use it to make their predictions.

Interactive Data Analysis Systems
The current interface for data analysis is predominantly code. We are studying techniques that improve how scalable interactive visual analysis applications are designed, architected, and built. The Data Visualization Management System makes it significantly easier to build and scale interactive data visualization systems. Precision Interfaces extends this technology to automatically generate new visual exploration interfaces tailored to a long tail of data analysis tasks.


We are always looking for hard-working, smart, driven students who are excited about pushing forward how humans interact with data. If you are a prospective graduate student or postdoc, read our application document. If you are an undergraduate, a master's student, or a potential intern, please fill out our questionnaire.


Email us at ewu@cs.columbia.edu

Fotis Psallidas Grad Student
Thibault Sellam Postdoc
Xiaolan Wang Collab (UMass)
Yifan Wu Collab (Cal)
Sanjay Krishnan Collab (Cal)
Tejas Dharamsi Masters
Conder Shou
Salim M'jahad
Robbie Netzorg
Kevin Lin
Hamed Nilforoshan Undergrad
Lily-Xiaoxuan Liu
HaoCi Zhang Intern
Mengyang Lyu
Ziyun Wei
Alex Studer High School
Alumni and Past Collaborators
Daniel Haas Collab (Cal)
Lilong Jiang Collab (OSU)
Daniel Alabi Masters
Zhengjie Miao Masters
Larry Xu Undergrad
James Sands
Naina Sahrawat
Rahul Khanna Undergrad