Paper details
Number 3 - September 2025
Volume 35 - 2025
Accounting for label shift of positive unlabeled data under selection bias
Jan Mielniczuk, Adam Wawrzeńczyk
Abstract
We consider the scenario in which two samples of positive unlabeled (PU) data are available and, for the second sample,
the prior probabilities of the classes change while the distributions of predictors within classes remain the same (the label
shift setting). The selection of positive elements may be object-dependent. We study the properties of the underlying probabilistic
structure under this novel augmented PU scenario, proving in particular that label shift also occurs for the unlabeled populations.
We introduce and investigate an estimator of the prior probability for the label-shifted population. Furthermore, we
construct the Bayes classifier for this setting and analyze its behavior. It turns out to be a Bayes classifier for the unlabeled
class with a modified threshold. This gives rise to three empirical counterparts, which are compared on benchmark data sets.
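
To give some intuition for what a "modified threshold" means under label shift, the sketch below shows the classical correction for fully labeled binary data, where only the class prior changes between a source and a target population. It is not the estimator or classifier proposed in the paper, which works with PU data and the posterior of the unlabeled class. The function names and the symbols `pi_source` and `pi_target` are hypothetical, introduced only for this illustration.

```python
def adjust_posterior(p_source, pi_source, pi_target):
    """Re-weight a source posterior P(Y=1|x) to the target class prior.

    Assumes classical label shift: the class-conditional distributions
    of x are unchanged and only the prior P(Y=1) moves from pi_source
    to pi_target. (Illustrative helper, not taken from the paper.)
    """
    w_pos = pi_target / pi_source                # weight for class 1
    w_neg = (1.0 - pi_target) / (1.0 - pi_source)  # weight for class 0
    num = w_pos * p_source
    return num / (num + w_neg * (1.0 - p_source))


def shifted_threshold(pi_source, pi_target):
    """Threshold on the source posterior that is equivalent to
    thresholding the target posterior at 1/2."""
    num = pi_source * (1.0 - pi_target)
    return num / (num + pi_target * (1.0 - pi_source))


if __name__ == "__main__":
    t = shifted_threshold(0.3, 0.6)
    print(f"threshold on source posterior: {t:.4f}")
    print(f"target posterior at that point: {adjust_posterior(t, 0.3, 0.6):.4f}")
```

When the prior does not change, the threshold stays at 1/2. When the positive class becomes more frequent in the target population, the threshold on the source posterior drops, so more points are classified as positive.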
Keywords
positive unlabeled learning, label shift, augmented positive unlabeled data, selection bias, Bayes classifier, accuracy.