Preprint / Version 1

Using Data-Efficient Image Transformers for Diabetic Retinopathy Severity Classification

Authors

  • Veda Fernandes

Keywords:

Image Transformers, Diabetic Retinopathy Severity Classification

Abstract

Roughly 10% of the global adult population has diabetes, a metabolic condition that results in chronically high blood sugar levels. Patients with diabetes are at substantially higher risk for several serious health conditions, including diabetic retinopathy (DR), a vision-threatening disease that affects 35% of diabetic patients and is projected to affect 160 million people by 2045. Diabetic patients should be screened for retinopathy every one to two years; however, in many countries patients are not regularly screened and therefore not treated. Globally, the lack of rapid, cost-effective screening strategies for DR leads to underdiagnosis and loss of vision. Machine learning offers a solution: automated models that diagnose DR from eye fundus images. In the published literature, convolutional neural networks (CNNs) are the state-of-the-art models for DR classification. More recently, transformer models have been applied and have shown superior performance. Text transformer models have driven the proliferation of tools such as ChatGPT, which provide contextual understanding and the ability to identify dependencies. In this study, we perform a head-to-head comparison between CNN and vision transformer models for classifying DR. We demonstrate that transformer models diagnose DR with substantially higher accuracy, with gains of up to 13% as measured by the F1 performance metric. Furthermore, we identify optimal training parameters for DR diagnosis, training a total of 19 machine learning models and reaching a test-set F1 score of 90% on a dataset of 35,130 fundus images, 20% of which were withheld for independent testing.
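The abstract reports model performance as an F1 score on a held-out test set. As a minimal sketch, the snippet below computes a macro-averaged F1 over the five standard DR severity grades (0 = no DR through 4 = proliferative DR); the averaging scheme and the toy labels are illustrative assumptions, not details taken from the study.

```python
# Hedged sketch: macro-averaged F1 for 5-class DR severity grading.
# The label vectors below are illustrative, not from the paper's dataset.

def f1_per_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class (one-vs-rest)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, n_classes=5):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_per_class(y_true, y_pred, c) for c in range(n_classes)) / n_classes

# Toy ground-truth grades and predictions for eight fundus images.
y_true = [0, 0, 1, 2, 2, 3, 4, 4]
y_pred = [0, 1, 1, 2, 2, 3, 4, 3]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.733
```

In practice a library implementation such as `sklearn.metrics.f1_score` would be used; the hand-rolled version here only makes the per-class precision/recall arithmetic explicit.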

Posted

10-25-2023