Subarachnoid hemorrhage (SAH) and its major complication, cerebral vasospasm (CVS),
present significant challenges for early diagnosis and risk stratification. In this
study, we developed interpretable decision tree models to differentiate between healthy
controls, SAH patients, and SAH patients with vasospasm using serum N-glycomic data.
Building on previously published glycomic profiles, we introduced a refined modeling
approach combining systematic preprocessing, feature selection, and interpretable
machine learning. Our methodology included outlier removal, standard scaling, and
a novel correlation-based feature reduction guided by feature importance scores derived
from preliminary decision trees. Binary classification tasks (Control vs. SAH,
Control vs. CVS, and SAH vs. CVS) were evaluated through stratified repeated cross-validation
and hyperparameter optimization. Models achieved high accuracy (up to 0.91) and stable
F1-scores across configurations. Key glycans such as FA2(6)G1 (bi-antennary, fucosylated,
monogalactosylated), A4G4S3(2) (tetra-antennary, tetra-galactosylated, tri-sialylated),
and A3G3S3(5) (tri-antennary, tri-galactosylated, tri-sialylated) emerged as the most
discriminative. Visualizations that combine joint feature distributions and decision
boundaries provided intuitive insight into the classifier’s logic. These findings
support the integration of interpretable glycomics-based models into clinical workflows.
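
The feature reduction and evaluation steps summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the synthetic data, the feature names, the 0.9 correlation threshold, the `max_depth` grid, and the helper `reduce_correlated_features` are all assumptions made for illustration, and scikit-learn is assumed as the modeling library. The idea shown is that a preliminary decision tree ranks the features, and within each highly correlated group only the most important feature is kept before a tuned tree is scored with stratified repeated cross-validation.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold
from sklearn.tree import DecisionTreeClassifier


def reduce_correlated_features(X, y, names, corr_threshold=0.9, random_state=0):
    """Keep one feature per highly correlated group, guided by tree importance.

    Hypothetical helper: the threshold and tie-breaking rule are assumptions.
    """
    # A preliminary tree supplies the importance scores that guide the pruning.
    tree = DecisionTreeClassifier(random_state=random_state).fit(X, y)
    importance = tree.feature_importances_
    corr = np.abs(np.corrcoef(X, rowvar=False))

    # Visit features from most to least important; keep a feature only if it
    # is not strongly correlated with one that has already been kept.
    kept = []
    for j in np.argsort(-importance, kind="stable"):
        if all(corr[j, k] < corr_threshold for k in kept):
            kept.append(int(j))
    kept.sort()
    return [names[j] for j in kept]


# Synthetic stand-in for a binary task (e.g. Control vs. SAH); not real data.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 60)
signal = y + 0.3 * rng.standard_normal(y.size)        # informative feature
near_copy = signal + 0.01 * rng.standard_normal(y.size)  # redundant feature
noise = rng.standard_normal(y.size)                   # uninformative feature
X = np.column_stack([signal, near_copy, noise])
names = ["glycan_a", "glycan_a_copy", "glycan_noise"]  # hypothetical names

kept_names = reduce_correlated_features(X, y, names)
X_reduced = X[:, [names.index(n) for n in kept_names]]

# Stratified repeated cross-validation with a small hyperparameter grid.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3]},
    scoring="accuracy",
    cv=cv,
)
search.fit(X_reduced, y)
print(kept_names, search.best_params_, round(search.best_score_, 3))
```

For brevity, this sketch selects features on the full data set before cross-validation. A stricter evaluation would repeat the selection inside each training fold so that held-out samples do not influence which features are kept.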