Analyzing the Performance of TabTransformer in Brain Stroke Prediction
DOI: https://doi.org/10.47611/jsrhs.v12i1.3935
Keywords: Tabular Data Analysis, Machine Learning, Transformer Models, TabTransformer, Electronic Health Records, Brain Stroke Prediction
Abstract
The adoption of electronic patient health records has paved the way for machine learning and
deep learning in disease diagnostics and prediction. Although tree-based algorithms have
traditionally performed well on structured data, neural networks are known to perform well on
unstructured data and data with a large number of input features. Furthermore, transformer-
based models such as TabTransformer have been shown to perform competitively with tree-based
algorithms (Huang et al. 2020). In this paper, we compare TabTransformer’s performance with
other state-of-the-art machine learning algorithms such as XGBoost, RandomForest, DecisionTree, and
a feed-forward Multilayer Perceptron (MLP). We found that TabTransformer shows no significant
improvement over MLP and performs worse in certain metrics. Neither TabTransformer nor MLP
performed better than XGBoost, the best-performing algorithm for brain stroke prediction in
Kaggle competitions.
References or Bibliography
Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. https://doi.org/10.1145/2939672.2939785.
Dev, Soumyabrata, Hewei Wang, Chidozie Shamrock Nwosu, Nishtha Jain, Bharadwaj Veeravalli, and Deepu John. 2022. “A Predictive Analytics Approach for Stroke Prediction Using Machine Learning and Neural Networks.” arXiv. https://doi.org/10.48550/ARXIV.2203.00497.
Huang, Xin, Ashish Khetan, Milan Cvitkovic, and Zohar Karnin. 2020. “TabTransformer: Tabular Data Modeling Using Contextual Embeddings.” arXiv. https://doi.org/10.48550/ARXIV.2012.06678.
Nwosu, Chidozie Shamrock, Soumyabrata Dev, Peru Bhardwaj, Bharadwaj Veeravalli, and Deepu John. 2019. “Predicting Stroke from Electronic Health Records.” arXiv. https://doi.org/10.48550/ARXIV.1904.11280.
Stekhoven, D. J., and P. Buhlmann. 2011. “MissForest–Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1): 112–18. https://doi.org/10.1093/bioinformatics/btr597.
Copyright (c) 2023 Hao Ming Xia; Ramin Ramezani
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.