Minimal Data Cleaning for Model Training by MinPrep

Published in PVLDB 2026 (to appear)

Recommended citation: Zhen, C., Prayoga, Aryal, N., Termehchy, A., & Aghasi, A. (2026). Minimal Data Cleaning for Model Training by MinPrep. PVLDB, to appear. https://research.engr.oregonstate.edu/idea/sites/research.engr.oregonstate.edu.idea/files/vldb_demo_minprep.pdf

[Figure: MinPrep minimal data cleaning workflow]

Overview

MinPrep is a system for minimal data cleaning before supervised model training. Instead of asking users to clean all missing or dirty values, it helps identify the cleaning actions that are most relevant to downstream model quality.

The accepted PVLDB 2026 demo paper presents MinPrep as a practical system for targeted data cleaning in supervised learning workflows.
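
As a rough illustration of what prioritizing cleaning actions can look like, the sketch below ranks features that contain missing values by how strongly their observed values correlate with the label. This is not MinPrep's algorithm, which this page does not describe. The correlation heuristic, the function names (`pearson`, `rank_columns_to_clean`), and the toy data are all assumptions made for illustration only.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if undefined)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_columns_to_clean(rows, label):
    """Hypothetical heuristic: rank features that have missing values (None)
    by |correlation with the label|, computed on the observed rows only."""
    features = [name for name in rows[0] if name != label]
    scores = {}
    for f in features:
        if all(r[f] is not None for r in rows):
            continue  # fully observed, so there is nothing to clean
        observed = [(r[f], r[label]) for r in rows if r[f] is not None]
        if len(observed) < 2:
            scores[f] = 0.0
            continue
        xs, ys = zip(*observed)
        scores[f] = abs(pearson(xs, ys))
    # Features most associated with the label come first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: "income" tracks the label closely, "noise" does not,
# and "age" has no missing values.
rows = [
    {"income": 10.0, "noise": 3.0, "age": 20.0, "y": 1.0},
    {"income": 20.0, "noise": None, "age": 30.0, "y": 2.0},
    {"income": None, "noise": 1.0, "age": 40.0, "y": 3.0},
    {"income": 40.0, "noise": 3.0, "age": 50.0, "y": 4.0},
    {"income": 50.0, "noise": 1.0, "age": 60.0, "y": 5.0},
]
priority = rank_columns_to_clean(rows, "y")
print(priority)
```

On this toy data the missing `income` value would be suggested for cleaning before the missing `noise` value, and `age` is skipped because it is complete. A real system would presumably use a criterion tied to the trained model rather than a simple correlation.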

Authors

Cheng Zhen, Prayoga, Nischal Aryal, Arash Termehchy, and Alireza Aghasi.
