Selected Publications

  1. JoinBoost: Grow Trees Over Normalized Data Using Only SQL
    Zezhou Huang, Rathijit Sen, Jiaxiang Liu, Eugene Wu
    VLDB 2023
  2. Saibot: A Differentially Private Data Search Platform
    Zezhou Huang, Jiaxiang Liu, Daniel Alabi, Raul Castro Fernandez, Eugene Wu
    VLDB 2023
  3. GAMUT: Matrix Multiplication-Like Tasks on GPUs
    Xincheng Xie, Junyoung Kim, Kenneth Ross
    ADMS 2023
  4. Amulet: Adaptive matrix-multiplication-like tasks
    Junyoung Kim, Kenneth A Ross, Eric Sedlar, Lukas Stadler
    DaMoN 2023
  5. Interactive Interface Generation in Notebooks
    Jeffrey Tao, Yiru Chen, Eugene Wu
    SIGMOD (demo) 2022
  6. PI2: Generating Visual Analysis Interfaces From Queries
    Yiru Chen, Eugene Wu
    SIGMOD 2022
  7. Reptile: Aggregation-level Explanations for Hierarchical Data
    Zachary Huang, Eugene Wu
    SIGMOD 2022
  8. Enabling SQL-based training data debugging for federated learning
    Young Wu, Yejia Liu, Lampros Flokas, Jiannan Wang, Eugene Wu
    VLDB 2022
  9. Complaint-Driven Training Data Debugging at Interactive Speeds
    Lampros Flokas, Young Wu, Jiannan Wang, Nakul Verma, Eugene Wu
    SIGMOD 2022
  10. Adaptive Code Generation for Data-Intensive Analytics
    Wangda Zhang, Junyoung Kim, Kenneth A. Ross, Eric Sedlar, Lukas Stadler
    VLDB 2021
  11. Quantifying the effects of COVID-19 on restaurant reviews
    Ivy Cao, Zizhou Liu, Giannis Karamanolakis, Daniel Hsu, Luis Gravano
    SocialNLP 2021
  12. Physical Visualization Design
    Lana Ramjit, Zhaoning Kong, Ravi Netravali, Eugene Wu
    SIGMOD (demo) 2020
  13. VIP: A SIMD Vectorized Analytical Query Engine
    Orestis Polychroniou, Kenneth A. Ross
    VLDB Journal 2020
  14. Parallel Prefix Sum with SIMD
    Wangda Zhang, Yanbin Wang, Kenneth A. Ross
    ADMS 2020
  15. Permutation Index: Exploiting Data Skew for Improved Query Performance
    Wangda Zhang, Kenneth A. Ross
    ICDE 2020
  16. Exploiting Data Skew for Improved Query Performance
    Wangda Zhang, Kenneth A. Ross
    IEEE TKDE 2020
  17. Efficient Search over Genomic Short Read Data
    Wangda Zhang, Mengdi Lin, Kenneth A. Ross
    SSDBM 2020
  18. Towards Complaint-driven ML Workflow Debugging
    Lampros Flokas, Young Wu, Jiannan Wang, Eugene Wu
    MLOps 2020
  19. Monte Carlo Tree Search for Generating Interactive Data Analysis Interfaces
    Yiru Chen, Eugene Wu
    Intelligent Process Automation (IPA) 2020
  20. Towards Practical Vectorized Analytical Query Engines
    Orestis Polychroniou, Kenneth A. Ross
    DaMoN 2019
  21. Master of None Acceleration: A Comparison of Accelerator Architectures for Analytical Query Processing
    Andrea Lottarini, João Pedro Cerqueira, Thomas J. Repetti, Stephen A. Edwards, Kenneth A. Ross, Mingoo Seok, Martha A. Kim
    ISCA 2019
  22. Precision Interfaces
    Qianrui Zhang, Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
    SIGMOD 2019
  23. DeepBase: Deep Inspection of Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick, Eugene Wu
    SIGMOD 2019
  24. Distributed Joins and Data Placement for Minimal Network Traffic
    Orestis Polychroniou, Wangda Zhang, Kenneth A. Ross
    TODS 2018
  25. Ten Years of Web Tables
    Michael Cafarella, Alon Halevy, Daisy Zhe Wang, Hongrae Lee, Jayant Madhavan, Cong Yu, Eugene Wu
    PVLDB 2018 Invited Paper
  26. At a Glance: Approximate Entropy as a Measure of Line Chart Visualization Complexity
    Gabriel Ryan, Abigail Mosca, Remco Chang, Eugene Wu
    InfoVis 2018
  27. Provenance in Interactive Visualizations
    Fotis Psallidas, Eugene Wu
    HILDA 2018
  28. Leveraging Quality Prediction Models for Automatic Writing Feedback
    Hamed Nilforoshan, Eugene Wu
    ICWSM 2018
  29. Precision Interfaces for Different Modalities
    Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
    SIGMOD (demo) 2018
  30. “I Like the Way You Think!” Inspecting the Internal Logic of Recurrent Neural Networks
    Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
    SysML 2018
  31. Smoke: Fine-grained Lineage at Interactive Speeds
    Fotis Psallidas, Eugene Wu
    VLDB 2018
  32. BoostClean: Automated Error Detection and Repair for Machine Learning
    Sanjay Krishnan, Michael J. Franklin, Ken Goldberg, Eugene Wu
    Tech Report 2017
  33. Network Synthesis for Database Processing Units
    Andrea Lottarini, Stephen A. Edwards, Kenneth A. Ross, Martha A. Kim
    DAC 2017
  34. Deadlock-free joins in DB-mesh, an asynchronous systolic array accelerator
    Bingyi Cao, Kenneth A. Ross, Stephen A. Edwards, Martha A. Kim
    DaMoN 2017
  35. Combining Design and Performance in a Data Visualization Management System
    Eugene Wu, Fotis Psallidas, Zhengjie Miao, Haoci Zhang, Laura Rettig, Yifan Wu, Thibault Sellam
    CIDR 2017
  36. A DeVIL-ish Approach to Inconsistency in Interactive Visualizations
    Yifan Wu, Joe Hellerstein, Eugene Wu
    HILDA 2016
  37. PFunk-H: Approximate Query Processing using Perceptual Models
    Daniel Alabi, Eugene Wu
    HILDA 2016
  38. Towards Reliable Interactive Data Cleaning: A User Survey and Recommendations
    Sanjay Krishnan, Daniel Haas, Michael J. Franklin, Eugene Wu
    HILDA 2016
  39. ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning
    Sanjay Krishnan, Michael Franklin, Ken Goldberg, Jiannan Wang, Eugene Wu
    SIGMOD 2016 Demo
  40. SIMD-accelerated regular expression matching
    E. A. Sitaridi, O. Polychroniou, K. A. Ross
    DaMoN 2016
  41. k-Shape: Efficient and Accurate Clustering of Time Series
    J. Paparrizos and L. Gravano
    SIGMOD Record 2016
  42. Detecting Devastating Diseases in Search Logs
    J. Paparrizos, R. W. White, and E. Horvitz
    SIGKDD 2016
  43. Screening for Pancreatic Adenocarcinoma Using Signals From Web Search Logs: Feasibility Study and Results
    J. Paparrizos, R. W. White, and E. Horvitz
    Journal of Oncology Practice
  44. CLAMShell: Speeding up Crowds for Low-latency Data Labeling
    D. Haas, J. Wang, E. Wu, and M. J. Franklin
    VLDB 2016
  45. Massively-Parallel Lossless Data Decompression
    Evangelia A. Sitaridi, René Müller, Tim Kaldewey, Guy M. Lohman, Kenneth A. Ross
    ICPP 2016
  46. A Course on Programming and Problem Solving
    S. Sheth, C. Murphy, K. A. Ross, D. E. Shasha
    SIGCSE 2016
  47. GPU-accelerated string matching for database applications
    E. Sitaridi and K. A. Ross
    VLDB Journal 2016
  48. Exploiting SSDs in operational multiversion databases
    M. Sadoghi, K. A. Ross, M. Canim, B. Bhattacharjee
    VLDB Journal 2016
  49. Towards Perception-aware Interactive Data Visualization Systems
    E. Wu and A. Nandi
    DSIA 2015
  50. SampleClean: Fast and Reliable Analytics on Dirty Data
    S. Krishnan, J. Wang, M. J. Franklin, K. Goldberg, T. Kraska, T. Milo, and E. Wu
    Overview paper
  51. The Q100 Database Processing Unit
    L. Wu, A. Lottarini, T. K. Paine, M. A. Kim, K. A. Ross
    IEEE MICRO 2015
  52. Efficient Lightweight Compression Alongside Fast Scans
    O. Polychroniou and K. A. Ross
    DaMoN 2015
  53. Implementing Latency-Insensitive Dataflow Blocks
    B. Cao, K. A. Ross, M. A. Kim, and S. A. Edwards
    MEMOCODE 2015
  54. Wisteria: Nurturing Scalable Data Cleaning Infrastructure (Demo)
    D. Haas, S. Krishnan, J. Wang, M. J. Franklin, and E. Wu
    VLDB 2015
  55. Collaborative Data Analytics with Datahub (Demo)
    A. Bhardwaj, A. Deshpande, A. Elmore, D. Karger, S. Madden, A. Parameswaran, H. Subramanyam, E. Wu, and R. Zhang
    VLDB 2015
  56. Ranking Deep Web Text Collections for Scalable Information Extraction
    P. Barrio, L. Gravano, and C. Develder
    CIKM 2015
  57. k-Shape: Efficient and Accurate Clustering of Time Series
    J. Paparrizos and L. Gravano
    SIGMOD 2015
  58. Learning to Rank Adaptively for Scalable Information Extraction
    P. Barrio, G. Simões, H. Galhardas, and L. Gravano
    EDBT 2015
  59. Rethinking SIMD Vectorization for In-Memory Databases
    O. Polychroniou, A. Raghavan, K. A. Ross
    SIGMOD 2015
  60. The Case for Data Visualization Management Systems
    E. Wu, L. Battle, and S. Madden
    VLDB 2014
  61. Hardware Partitioning for Big Data Analytics
    L. Wu, R. J. Barker, M. A. Kim, K. A. Ross
    IEEE MICRO 2014
  62. Reducing Database Locking Contention Through Multi-version Concurrency
    M. Sadoghi, M. Canim, B. Bhattacharjee, F. Nagel, K. A. Ross
    PVLDB 2014
  63. Energy Analysis of Hardware and Software Range Partitioning
    L. Wu, O. Polychroniou, R. J. Barker, M. A. Kim, and K. A. Ross
    TOCS 2014
  64. Coherent Somatic Mutation in Autoimmune Disease
    K. A. Ross
    PLoS One 2014
  65. Vectorized Bloom Filters for Advanced SIMD Processors
    O. Polychroniou and K. A. Ross
    DaMoN 2014
  66. Q100: The Architecture and Design of a Database Processing Unit
    L. Wu, A. Lottarini, T. K. Paine, M. A. Kim, and K. A. Ross
    ASPLOS 2014
  67. A Comprehensive Study of Main-memory Partitioning and its Application to Large-scale Comparison- and Radix-sort
    O. Polychroniou and K. A. Ross
    SIGMOD 2014
  68. Track Join: Distributed Joins with Minimal Network Traffic
    O. Polychroniou, R. Sen, and K. A. Ross
    SIGMOD 2014
  69. Detecting Foodborne Disease Outbreaks Using Social Media (demonstration)
    F. Psallidas, L. Gravano, and many others
    NYC Media Lab's Annual Summit, 2014
  70. Information Extraction from Social Media for Public Health
    N. Elhadad, L. Gravano, D. Hsu, S. Balter, V. Reddy, and H. Waechter
    KDD at Bloomberg Workshop, Data Frameworks Track (KDD 2014), 2014
  71. REEL: A Relation Extraction Learning Framework (poster)
    P. Barrio, G. Simões, H. Galhardas, and L. Gravano
    JCDL 2014
  72. Using Online Reviews by Restaurant Patrons to Identify Unreported Cases of Foodborne Illness - New York City, 2012–2013
    C. Harrison, M. Jorder, H. Stern, F. Stavinsky, V. Reddy, H. Hanson, H. Waechter, L. Lowe, L. Gravano, and S. Balter
    Centers for Disease Control and Prevention Morbidity and Mortality Weekly Report 2014