For instance, when predicting the salary to offer given the descriptions of professional experience, the risk is to capture indirectly a gender bias present in the distribution of salaries. Another example is found in biomedical applications, where for an automated radiology diagnostic system to be useful, it should use more than socio-demographic information to build its prediction.
Here I will talk about confounds in predictive models. I will review classic deconfounding techniques developed in a well-established statistical literature, and how they can be adapted to predictive modeling settings. Departing from deconfounding, I will introduce a non-parametric approach –that we named “confound-isolating cross-validation”– adapting cross-validation experiments to measure the performance of a model independently of the confounding effect.
The examples are mentioned in this work are related to the common issues in neuroimage analysis, although the approach is not limited to neuroscience and can be useful in another domains.
Confounding effects are often present in observational data: the effect or association studied is observed jointly with other effects that are not desired.