Robust learning with progressive data expansion against spurious correlation

Jan 1, 2024
Yihe Deng*, Yu Yang*, Baharan Mirzasoleiman, Quanquan Gu
Abstract
Going beyond existing analyses of linear models, we theoretically examine the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can cause spurious features to dominate the learning process. In light of this, we propose a new training algorithm called PDE that efficiently enhances the model’s robustness and improves worst-group performance. PDE begins with a group-balanced subset of the training data and progressively expands it to facilitate the learning of the core features.
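The data schedule described in the abstract (start from a group-balanced subset, then progressively expand it) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the warm-up length, the expansion interval and size, and the choice to add uniformly random samples from the remaining data are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import defaultdict


def balanced_subset(group_labels, rng):
    """Pick the same number of indices from every group.

    Assumption: each group contributes as many samples as the smallest group.
    """
    by_group = defaultdict(list)
    for idx, g in enumerate(group_labels):
        by_group[g].append(idx)
    n_min = min(len(v) for v in by_group.values())
    subset = []
    for g in sorted(by_group):
        subset.extend(rng.sample(by_group[g], n_min))
    return subset


def progressive_expansion(group_labels, warmup_epochs, expand_every,
                          expand_size, total_epochs, seed=0):
    """Yield (epoch, active_indices) for a hypothetical expansion schedule.

    Epochs before `warmup_epochs` use only the balanced subset. Afterwards,
    every `expand_every` epochs, `expand_size` not-yet-used indices are added.
    """
    rng = random.Random(seed)
    active = balanced_subset(group_labels, rng)
    active_set = set(active)
    remaining = [i for i in range(len(group_labels)) if i not in active_set]
    rng.shuffle(remaining)  # assumption: new samples are drawn at random

    for epoch in range(total_epochs):
        past_warmup = epoch >= warmup_epochs
        if past_warmup and (epoch - warmup_epochs) % expand_every == 0 and remaining:
            new, remaining = remaining[:expand_size], remaining[expand_size:]
            active.extend(new)
        # A training loop would run one epoch on `active` here.
        yield epoch, list(active)


if __name__ == "__main__":
    # Toy data: group 0 is the majority (8 samples), group 1 the minority (2).
    labels = [0] * 8 + [1] * 2
    for epoch, idx in progressive_expansion(labels, warmup_epochs=2,
                                            expand_every=1, expand_size=3,
                                            total_epochs=5):
        print(epoch, len(idx))
```

In this toy run the model would first see two samples from each group, so neither group dominates early training, and the majority-group samples are only introduced gradually afterwards.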
Publication
Advances in Neural Information Processing Systems (NeurIPS)