Idiosyncratic Feature Overfitting of EEG data by Deep Learning
Danielle Steinbach
The Harker School, San Jose, USA
Publication date: November 20, 2025
DOI: http://doi.org/10.34614/JIYRC2025II56
ABSTRACT
In medical deep learning (DL), data taken from the same patient must never cross over between the training and test sets. However, some papers in the EEG-DL literature do not follow this practice, and as a result report high-accuracy disease classification that will likely not generalize. We demonstrate this by training a DL model on Parkinson’s Disease EEG data with and without patient crossover: binary classification reaches a >99% F1-score with crossover, which drops to a ~60% F1-score when no patient appears in both the training and test sets. We hypothesize that this effect (sometimes referred to as “data leakage”) becomes pronounced because the EEG model can learn idiosyncratic features specific to each patient. To confirm this hypothesis, we show that our base model can be trained to predict patient ID with ~99% classification accuracy, and we use t-SNE cluster visualization to show that it segregates data instances clearly by patient rather than by a common set of pathology-related features.
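The distinction between splitting with and without patient crossover can be made concrete with a minimal sketch. This is an illustration only, not the pipeline used in this work: the function names (`split_by_patient`, `split_by_segment`, `shared_patients`), the 20% test fraction, and the toy patient/segment counts are all assumptions chosen for the example. It assumes each EEG segment carries a patient identifier.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.2, seed=0):
    """Patient-wise split: every segment of a given patient goes to one side only.

    patient_ids[i] is the (hypothetical) patient label of segment i.
    Returns (train_idx, test_idx) as lists of segment indices.
    """
    unique_patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_patients)
    n_test = max(1, round(len(unique_patients) * test_fraction))
    test_patients = set(unique_patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


def split_by_segment(n_segments, test_fraction=0.2, seed=0):
    """Segment-wise split: shuffles segments with no regard to patient (leaky)."""
    idx = list(range(n_segments))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_segments * test_fraction))
    return idx[n_test:], idx[:n_test]


def shared_patients(patient_ids, train_idx, test_idx):
    """Return the set of patients that appear in both train and test."""
    train_p = {patient_ids[i] for i in train_idx}
    test_p = {patient_ids[i] for i in test_idx}
    return train_p & test_p


if __name__ == "__main__":
    # Toy data (assumed): 10 patients, 50 EEG segments each.
    ids = [p for p in range(10) for _ in range(50)]

    tr, te = split_by_patient(ids)
    print("patient-wise overlap:", len(shared_patients(ids, tr, te)))

    tr, te = split_by_segment(len(ids))
    print("segment-wise overlap:", len(shared_patients(ids, tr, te)))
```

Under the patient-wise split the overlap is empty by construction, whereas the segment-wise split places segments from the same patient on both sides, which is the condition under which a model can score well by recognizing the patient rather than the pathology. In practice, a grouped splitter such as scikit-learn's `GroupShuffleSplit` or `GroupKFold` serves the same purpose.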