Interpretable Skin Cancer Diagnosis with Contrastive Language-Image Pre-training
DOI: https://doi.org/10.47611/jsrhs.v13i3.7262

Keywords: Melanoma, Machine learning, Skin cancer, Contrastive Language-Image Pre-training, Dermatology, Skin lesion, AI

Abstract
Recent advances in machine learning and computer vision have significantly improved the performance of skin cancer diagnostic models. However, their lack of interpretability poses a challenge for clinical adoption, as physicians may find it difficult to trust a diagnosis made by a “black box” system. We propose a novel methodology for skin cancer diagnosis using Contrastive Language-Image Pre-training (CLIP), allowing physicians to provide a set of features in natural language and then determine the weight our model gave each feature in its diagnosis. This approach aims to bridge the communication gap between physicians and machine learning models. We show that the CLIP model is able to diagnose skin cancer in a zero-shot setting and provide insight into how each provided feature contributes to its diagnosis.
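The core of the feature-weighting idea described above can be sketched as follows. In CLIP, an image and a set of text prompts are each mapped to embeddings, and the cosine similarity between the image embedding and each prompt embedding (scaled by a temperature) is passed through a softmax to yield a normalized weight per prompt. The sketch below illustrates only that scoring step: the random vectors stand in for real CLIP encoder outputs, the example feature names are hypothetical, and the temperature of 100 mirrors CLIP's typical logit scale; it is a minimal illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def feature_weights(image_emb, text_embs, temperature=100.0):
    """Weight each text feature by its cosine similarity to the image.

    image_emb: (d,) image embedding; text_embs: (n, d), one row per
    natural-language feature prompt. Returns (n,) weights summing to 1.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img  # cosine similarity of the image to each prompt
    return softmax(temperature * sims)

# Toy embeddings standing in for CLIP encoder outputs; in practice these
# would come from CLIP's image and text encoders for prompts such as
# "asymmetric lesion", "irregular border", "color variation" (hypothetical).
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(3, 512))
w = feature_weights(image_emb, text_embs)
print(w)
```

Because the weights are a softmax over per-prompt similarities, they are directly comparable across features, which is what lets a physician see how much each stated feature contributed to the diagnosis.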
References or Bibliography
World Health Organization (WHO), Radiation: Ultraviolet (UV) radiation and skin cancer, Accessed: 2024-5-20.
https://www.who.int/news-room/q-a-detail/radiation-ultraviolet-(uv)-radiation-and-skin-cancer.
Jerant, A. F., Johnson, J. T., Sheridan, C. D., & Caffrey, T. J. (2000). Early detection and treatment of skin cancer. American family physician, 62(2), 357–382.
https://pubmed.ncbi.nlm.nih.gov/10929700/
Celebi, M. E., Kingravi, H. A., Uddin, B., Iyatomi, H., Aslandogan, Y. A., Stoecker, W. V., & Moss, R. H. (2007). A methodological approach to the classification of dermoscopy images. Computerized Medical Imaging and Graphics, 31(6), 362–373.
https://doi.org/10.1016/j.compmedimag.2007.01.003
Ganster, H., Pinz, P., Rohrer, R., Wildling, E., Binder, M., & Kittler, H. (2001). Automated melanoma recognition. IEEE Transactions on Medical Imaging, 20(3), 233-239.
https://doi.org/10.1109/42.918473
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118.
https://doi.org/10.1038/nature21056
Mahmud, F., Mahfiz, M. M., Kabir, M. Z. I., & Abdullah, Y. (2023). An Interpretable Deep Learning Approach for Skin Cancer Categorization. arXiv preprint arXiv:2312.10696.
https://doi.org/10.48550/arXiv.2312.10696
Mridha, K., Uddin, M., Shin, J., Khadka, S., & Mridha, M. (2023). An Interpretable Skin Cancer Classification Using Optimized Convolutional Neural Network for a Smart Healthcare System. IEEE Access, 11, 41003-41018.
https://doi.org/10.1109/ACCESS.2023.3269694
Alfi, I. A., Rahman, M. M., Shorfuzzaman, M., & Nazir, A. (2022). A non-invasive interpretable diagnosis of melanoma skin cancer using deep learning and ensemble stacking of machine learning models. Diagnostics, 12(3), 726.
https://www.mdpi.com/2075-4418/12/3/726
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In International conference on machine learning (pp. 8748-8763). PMLR.
https://doi.org/10.48550/arXiv.2103.00020
Zhao, Z., Liu, Y., Wu, H., Li, Y., Wang, S., Teng, L., ... & Shen, D. (2023). Clip in medical imaging: A comprehensive survey. arXiv preprint arXiv:2312.07353.
https://doi.org/10.48550/arXiv.2312.07353
Li, X., Wu, J., Chen, E. Z., & Jiang, H. (2019, July). From deep learning towards finding skin lesion biomarkers. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 2797-2800). IEEE.
https://doi.org/10.1109/EMBC.2019.8857334
Barata, C., Celebi, M. E., & Marques, J. S. (2021). Explainable skin lesion diagnosis using taxonomies. Pattern Recognition, 110, 107413.
https://doi.org/10.1016/j.patcog.2020.107413
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
https://doi.org/10.48550/arXiv.1512.03385
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
https://doi.org/10.48550/arXiv.2010.11929
Devlin, J., Chang, M.W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Association for Computational Linguistics.
https://doi.org/10.18653/v1/N19-1423
Tsao, H., Olazagasti, J. M., Cordoro, K. M., Brewer, J. D., Taylor, S. C., Bordeaux, J. S., ... & Begolka, W. S. (2015). Early detection of melanoma: reviewing the ABCDEs. Journal of the American Academy of Dermatology, 72(4), 717-723.
https://doi.org/10.1016/j.jaad.2015.01.025
Tan, M., & Le, Q. (2019, May). Efficientnet: Rethinking model scaling for convolutional neural networks. In International conference on machine learning (pp. 6105-6114). PMLR.
https://doi.org/10.48550/arXiv.1905.11946
Fanconi, C. (2019) Skin Cancer: Malignant vs. Benign, Version 1. Retrieved March 20, 2024 from: https://www.kaggle.com/datasets/fanconic/skin-cancer-malignant-vs-benign/data
Copyright (c) 2024 Genebelle Mynn; Ms. Greenberg

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.