Minimal Repairs for Learning Over Incomplete Data
Published in the NeurIPS Workshop on Reliable ML from Unreliable Data, 2025
Recommended citation: Zhen, C., Aryal, N., Termehchy, A., & Biwer, G. (2025). Minimal Repairs for Learning Over Incomplete Data. NeurIPS Workshop on Reliable ML from Unreliable Data. https://openreview.net/pdf?id=ehpewotzMI
Overview
Incomplete data can make model training unreliable, but repairing every missing value is often wasteful. This work studies minimal repairs: targeted data preparation actions that address only the missing values that actually affect model accuracy.
The project continues my broader work on learning over incomplete and dirty data, where the goal is to reduce unnecessary human and computational effort while preserving downstream model quality.
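To make the idea of a minimal repair concrete, here is a small illustrative sketch. It is not the algorithm from the paper: it assumes a toy 1-nearest-neighbor classifier, a hand-picked list of candidate fill values, and brute-force enumeration, and every name in it (`predict_1nn`, `cells_needing_repair`, `candidates`) is hypothetical. A missing cell is flagged only if some choice of fill value can change a prediction on the query points; all other missing cells can be left unrepaired.

```python
from itertools import product


def predict_1nn(rows, labels, query):
    """Return the label of the training row closest to `query` (squared Euclidean)."""
    best = min(
        range(len(rows)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(rows[i], query)),
    )
    return labels[best]


def fill(rows, cells, values):
    """Copy `rows` and write `values` into the missing `cells`, in order."""
    filled = [list(r) for r in rows]
    for (i, j), v in zip(cells, values):
        filled[i][j] = v
    return filled


def cells_needing_repair(rows, labels, queries, candidates):
    """Flag missing cells whose fill value can change some query prediction.

    Brute force over all candidate fills, so only suitable for toy inputs.
    """
    cells = [(i, j) for i, r in enumerate(rows) for j, v in enumerate(r) if v is None]
    needed = set()
    for k, cell in enumerate(cells):
        # Fix every other missing cell, then vary only this one.
        for other_vals in product(candidates, repeat=len(cells) - 1):
            outcomes = set()
            for v in candidates:
                vals = list(other_vals[:k]) + [v] + list(other_vals[k:])
                filled = fill(rows, cells, vals)
                outcomes.add(tuple(predict_1nn(filled, labels, q) for q in queries))
            if len(outcomes) > 1:
                needed.add(cell)
                break
    return sorted(needed)


# Toy data (assumed for illustration): None marks a missing value.
rows = [[0, 0], [10, 10], [None, 1], [20, None]]
labels = ["a", "b", "b", "b"]
queries = [[1, 1], [4, 5]]
candidates = [0, 5, 10]  # assumed domain of plausible fill values

needed = cells_needing_repair(rows, labels, queries, candidates)
print(needed)  # [(2, 0)]
```

In this toy setup the row `[20, None]` is never the nearest neighbor of either query, so its missing value has no effect on predictions and needs no repair. Only the cell at row 2, column 0 is flagged. The enumeration is exponential in the number of missing cells, so this sketch only illustrates the definition, not a practical way to compute it.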
Research Themes
- Reliable machine learning from incomplete data.
- Minimal data repairs for downstream model accuracy.
- Human effort reduction in data preparation.
- Data-centric ML pipelines.