Denoising AutoEncoder-based Representation Learning for Multi-Task Whole Slide Image Analysis

Authors

  • Jooha Lee, Crean Lutheran High School
  • Sherrie Lah, Crean Lutheran High School

DOI:

https://doi.org/10.47611/jsrhs.v13i1.6140

Keywords:

Pathology, Convolutional Neural Network, Representation Learning

Abstract

Advances in artificial intelligence have brought significant progress in the analysis of whole slide images (WSI). In machine learning, WSI are mainly used for pathological analysis, which comprises diverse tasks such as classifying normal versus tumor patches, segmenting the precise areas of potential tumors, and detecting objects that indicate tumor sites. However, training a distinct model for each individual task is both time-intensive and inefficient. Therefore, there is high demand for a unified learning algorithm capable of handling multiple WSI tasks concurrently. To address this problem, a representation learning-based transfer learning method is proposed to process multiple downstream tasks, including classification, segmentation, and object detection. The proposed method combines two stages: image reconstruction in the representation learning stage, and reuse of the resulting pretrained parameters in the transfer learning stage, to create a more accurate and time-efficient model for analyzing WSI. Overall, the proposed method offers a better representation of WSI, which leads to enhanced accuracy in analysis and interpretation. Through extensive experiments, I have found that the proposed method outperforms previous state-of-the-art networks in these downstream tasks. I expect the proposed method to be applied in real-world scenarios with increased practicality and accuracy.
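
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a tiny fully connected autoencoder in place of a convolutional network, synthetic 64-dimensional vectors in place of real WSI patches, and a hypothetical binary label. Names such as `patches`, `encode`, and `labels` are illustrative only. Stage 1 trains the network to reconstruct clean inputs from noise-corrupted ones; stage 2 keeps the pretrained encoder frozen and trains only a small classification head on its features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened 8x8 patches (assumption: not real WSI data).
n, n_in, n_hidden = 256, 64, 16
z = rng.normal(size=(n, 4))                       # hidden factors behind each "patch"
patches = z @ rng.normal(0.0, 0.3, size=(4, n_in))
labels = (z[:, 0] > 0).astype(float)              # hypothetical normal/tumor label

# Stage 1: denoising autoencoder parameters (encoder W1, b1; decoder W2, b2).
W1 = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_in))
b2 = np.zeros(n_in)

def encode(x):
    """Map inputs to the learned representation."""
    return np.tanh(x @ W1 + b1)

def reconstruction_error(x):
    """Mean squared error between clean inputs and their reconstructions."""
    return float(np.mean((encode(x) @ W2 + b2 - x) ** 2))

err_before = reconstruction_error(patches)
lr = 0.1
for _ in range(500):
    noisy = patches + rng.normal(0.0, 0.3, size=patches.shape)  # corrupt the input
    h = np.tanh(noisy @ W1 + b1)
    grad_out = 2.0 * (h @ W2 + b2 - patches) / patches.size     # target is the clean patch
    grad_pre = (grad_out @ W2.T) * (1.0 - h ** 2)               # backprop through tanh
    W2 -= lr * (h.T @ grad_out)
    b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * (noisy.T @ grad_pre)
    b1 -= lr * grad_pre.sum(axis=0)
err_after = reconstruction_error(patches)

# Stage 2: reuse the pretrained encoder (frozen) and train only a small task head.
features = encode(patches)
w, b = np.zeros(n_hidden), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # logistic classification head
    w -= 0.5 * (features.T @ (p - labels)) / n
    b -= 0.5 * float(np.mean(p - labels))
accuracy = float(np.mean((features @ w + b > 0) == (labels > 0.5)))

print(f"reconstruction MSE {err_before:.3f} -> {err_after:.3f}, head accuracy {accuracy:.2f}")
```

In a multi-task setting, the same frozen or fine-tuned encoder would presumably be shared across separate heads for classification, segmentation, and object detection; only the classification case is sketched here.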

Published

02-29-2024

How to Cite

Lee, J., & Lah, S. (2024). Denoising AutoEncoder-based Representation Learning for Multi-Task Whole Slide Image Analysis. Journal of Student Research, 13(1). https://doi.org/10.47611/jsrhs.v13i1.6140

Section

HS Research Articles