Minimal Repairs for Learning Over Incomplete Data

Published in NeurIPS Reliable ML from Unreliable Data Workshop, 2025

Recommended citation: Zhen, C., Aryal, N., Termehchy, A., & Biwer, G. (2025). Minimal Repairs for Learning Over Incomplete Data. NeurIPS Workshop on Reliable ML from Unreliable Data. https://openreview.net/pdf?id=ehpewotzMI

Figure: Minimal repairs for incomplete data (diagram).

Overview

Incomplete data can make model training unreliable, but repairing every missing value is often wasteful. This work studies minimal repairs: targeted data preparation actions that focus on the missing values that actually affect model accuracy.
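As a toy illustration of the idea (not the paper's algorithm), the sketch below assumes a linear model whose weights are already known and flags only the missing cells that sit in features the model actually uses. The function name `cells_worth_repairing`, the `tol` threshold, and the example data are all hypothetical, and taking the weights as given sidesteps the harder question of deciding which repairs matter before a model is trained.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]


def cells_worth_repairing(
    rows: Sequence[Row],
    weights: Sequence[float],
    tol: float = 1e-9,
) -> List[Tuple[int, int]]:
    """Return (row, column) positions of missing cells in features with nonzero weight.

    Assumption: missing values are encoded as None, and a feature whose weight
    is (near) zero cannot change a linear model's prediction, so its missing
    cells can be left unrepaired.
    """
    return [
        (i, j)
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value is None and abs(weights[j]) > tol
    ]


# Hypothetical data: three rows, three features, three missing cells.
rows = [
    [1.0, None, 3.0],
    [None, 2.0, None],
    [4.0, 5.0, 6.0],
]
# Hypothetical weights: the second feature is unused by the model.
weights = [0.8, 0.0, -1.2]

to_repair = cells_worth_repairing(rows, weights)
total_missing = sum(value is None for row in rows for value in row)
print(f"{len(to_repair)} of {total_missing} missing cells flagged: {to_repair}")
```

In this example the missing cell in the unused second feature is skipped, so only two of the three missing cells would be handed to a human or an imputation step.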

The project continues my broader work on learning over incomplete and dirty data, where the goal is to reduce unnecessary human and computational effort while preserving downstream model quality.

Research Themes

  • Reliable machine learning from incomplete data.
  • Minimal data repairs for downstream model accuracy.
  • Human effort reduction in data preparation.
  • Data-centric ML pipelines.