Denoising AutoEncoder-based Representation Learning for Multi-Task Whole Slide Image Analysis
DOI:
https://doi.org/10.47611/jsrhs.v13i1.6140
Keywords:
Pathology, Convolutional Neural Network, Representation Learning
Abstract
With the advancement of artificial intelligence, the analysis of whole slide images (WSI) has improved significantly. In machine learning, WSI are used mainly for pathological analysis, which comprises diverse tasks such as classifying normal versus tumor patches, segmenting the precise areas of a potential tumor, and detecting objects that indicate tumor sites. However, training a separate model for each task is time-intensive and inefficient, so there is high demand for a unified learning algorithm that can handle multiple WSI tasks concurrently. To address this problem, I propose a transfer learning method based on representation learning that serves multiple downstream tasks, including classification, segmentation, and object detection. The method combines two stages, image reconstruction in the representation learning stage and reuse of the pretrained parameters in the transfer learning stage, to create a more accurate and time-efficient model for analyzing WSI. The resulting representation of WSI is better, which leads to more accurate analysis and interpretation. In extensive experiments, the proposed method outperformed previous state-of-the-art networks on classification, segmentation, and object detection. I expect the proposed method to be applicable in real-world scenarios with increased practicality and accuracy.
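The two stages described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a tiny dense denoising autoencoder written in NumPy, synthetic low-rank vectors standing in for flattened WSI patches, and hypothetical names (`add_noise`, `pretrain_denoising_autoencoder`, `encode`) and hyperparameters. The paper's keywords suggest a convolutional network, which this sketch does not attempt to reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x, std=0.1):
    # Corrupt the input; the autoencoder must recover the clean version.
    return x + rng.normal(0.0, std, size=x.shape)

def pretrain_denoising_autoencoder(patches, hidden=8, epochs=500, lr=0.02):
    # Stage 1 (assumed form): learn a representation by reconstructing
    # clean patches from noisy ones with a one-hidden-layer autoencoder.
    n, d = patches.shape
    W_enc = rng.normal(0.0, 0.1, (d, hidden))
    W_dec = rng.normal(0.0, 0.1, (hidden, d))
    losses = []
    for _ in range(epochs):
        noisy = add_noise(patches)
        z = np.tanh(noisy @ W_enc)           # encoder
        recon = z @ W_dec                    # linear decoder
        err = recon - patches                # compare against CLEAN input
        losses.append(float(np.sum(err ** 2) / n))
        # Backpropagate the per-sample squared reconstruction error.
        g_recon = 2.0 * err / n
        g_W_dec = z.T @ g_recon
        g_pre = (g_recon @ W_dec.T) * (1.0 - z ** 2)
        g_W_enc = noisy.T @ g_pre
        W_enc -= lr * g_W_enc
        W_dec -= lr * g_W_dec
    return W_enc, losses

def encode(x, W_enc):
    # Stage 2 (assumed form): reuse the pretrained encoder as a shared
    # feature extractor for downstream task heads.
    return np.tanh(x @ W_enc)

# Synthetic stand-in for 64 flattened patches of 16 values each.
latent = rng.normal(0.0, 1.0, (64, 4))
mix = rng.normal(0.0, 0.25, (4, 16))
patches = latent @ mix

W_enc, losses = pretrain_denoising_autoencoder(patches)
features = encode(patches, W_enc)
print(f"reconstruction loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("shared feature shape:", features.shape)
```

In this sketch, a classification, segmentation, or detection head would each start from `features` (or from `W_enc` as initial weights) instead of training a separate model from scratch, which is the efficiency argument the abstract makes.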
Copyright (c) 2024 Jooha Lee; Sherrie Lah
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.