IYRC Journal
  • Information
    • Editorial Board
    • Reviewer Board
  • Articles
    • 2024
    • 2025 - 1st Issue
    • 2025 - 2nd Issue
  • Guide for authors
  • SUBMISSION
    • Submission system
  • Become a reviewer

Idiosyncratic Feature Overfitting of EEG data by Deep Learning

Danielle Steinbach
The Harker School, San Jose, USA
Publication date: November 20, 2025
DOI: http://doi.org/10.34614/JIYRC2025II56
ABSTRACT
In medical deep learning (DL), data taken from the same patient must never cross over between the training and test sets. However, some papers in the EEG-DL literature do not follow this practice, and the high disease-classification accuracy they report will likely not generalize. We demonstrate this by training a DL model on Parkinson’s Disease EEG data with and without patient crossover: the F1-score for binary classification exceeds 99% with crossover and drops to ~60% when no patient appears in both the train and test sets. We hypothesize that this effect (sometimes referred to as “data leakage”) becomes pronounced because the EEG model can learn idiosyncratic features specific to each patient. To test this hypothesis, we show that our base model can be trained to predict patient ID with ~99% classification accuracy, and we use t-SNE cluster visualization to show that it segregates data instances clearly by patient rather than by a common set of pathology-related features.
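
The difference between a patient-wise split and a leaky segment-wise split can be illustrated with a minimal sketch. This is not the paper's code: the function names, the toy patient IDs, and the 10-patient, 50-segment layout are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no patient appears in both sets."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    # Hold out whole patients, not individual segments.
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


def split_by_segment(n_segments, test_fraction=0.2, seed=0):
    """Leaky baseline: shuffle individual segments, ignoring patient identity."""
    idx = list(range(n_segments))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_segments * test_fraction))
    return idx[n_test:], idx[:n_test]


def shared_patients(patient_ids, train_idx, test_idx):
    """Patients whose segments occur in both the train and the test set."""
    train = {patient_ids[i] for i in train_idx}
    test = {patient_ids[i] for i in test_idx}
    return train & test


# Hypothetical toy data: 10 patients with 50 EEG segments each.
patient_ids = [f"P{p:02d}" for p in range(10) for _ in range(50)]

train_idx, test_idx = split_by_patient(patient_ids)
leaky_train, leaky_test = split_by_segment(len(patient_ids))

print("patient-wise split, shared patients:",
      len(shared_patients(patient_ids, train_idx, test_idx)))
print("segment-wise split, shared patients:",
      len(shared_patients(patient_ids, leaky_train, leaky_test)))
```

The patient-wise split reports no shared patients by construction, whereas the segment-wise split scatters each patient's segments across both sets, which is the condition under which a model can score well by recognizing patients instead of pathology. In practice, scikit-learn's `GroupShuffleSplit` or `GroupKFold` offer the same guarantee when patient IDs are passed as the `groups` argument.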

THE INTERNATIONAL YOUNG
RESEARCHERS' CONFERENCE


Columbia University Vagelos College of Physicians and Surgeons
104 Haven Ave, New York, NY 10032

IYRC General Incorporated Association
   7-2-28-605 Roppongi, Minato-ku, Tokyo 106-0032, Japan
    Phone: 03-3527-9323
