Deep Clustering with Robust Autoencoder (DCRA)
DOI:
https://doi.org/10.47611/jsrhs.v11i2.2722
Keywords:
Machine learning, Deep learning, Clustering
Abstract
According to Science Daily, 90 percent of all the data in the world has been generated in the last two years, yet less than 1 percent of that data has been analyzed so far. With advances in high-performance computing, deep learning methods are now readily applied to large-scale, high-dimensional datasets. These methods train and run inference efficiently and produce more accurate predictions. Clustering is an unsupervised machine learning method that identifies similar data points and groups them into the same cluster, and it plays a fundamental role in data mining and machine learning as a way of organizing data into meaningful structures. To process large volumes of high-dimensional data, deep learning has become a key technique for learning feature representations of the data in a latent space, which supports many real-world applications. In this paper, we propose deep clustering with robust autoencoder (DCRA), which combines a robust autoencoder with deep clustering to learn feature representations and cluster assignments simultaneously. We conducted multiple experiments on open public datasets to evaluate the model's performance. Our results show that DCRA generates high-quality clusters, with clustering accuracy above 90% on high-dimensional datasets. The training and test losses both decrease as the number of epochs increases, which further supports these results.
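The abstract reports clustering accuracy but does not define it. As a minimal sketch, assuming the common convention in the deep clustering literature, accuracy can be computed by finding the one-to-one mapping between cluster IDs and ground-truth labels that maximizes agreement. The function name `clustering_accuracy` and the toy data are hypothetical and are not taken from the paper. The brute-force search below is practical only for a handful of clusters; in practice the same mapping is usually found with the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Fraction of points matched under the best one-to-one cluster-to-label mapping.

    Illustrative helper (not from the paper). Brute force over mappings,
    so it is only suitable for a small number of clusters.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("inputs must be non-empty")

    # Count how often each (cluster, label) pair co-occurs.
    counts = {}
    for label, cluster in zip(true_labels, cluster_ids):
        counts[(cluster, label)] = counts.get((cluster, label), 0) + 1

    labels = list(set(true_labels))
    clusters = list(set(cluster_ids))

    # If there are more clusters than labels, pad with None so that the
    # surplus clusters map to "no label" and contribute zero matches.
    padded = labels + [None] * max(0, len(clusters) - len(labels))

    # Try every one-to-one assignment of labels to clusters; keep the best.
    best = 0
    for assigned in permutations(padded, len(clusters)):
        matched = sum(counts.get((c, y), 0) for c, y in zip(clusters, assigned))
        best = max(best, matched)

    return best / len(true_labels)


# Toy example: cluster IDs are arbitrary, so only the grouping matters.
truth = [0, 0, 1, 1, 2, 2]
predicted = [2, 2, 0, 0, 1, 0]
accuracy = clustering_accuracy(truth, predicted)
```

In the toy example, the best mapping sends cluster 2 to label 0, cluster 0 to label 1, and cluster 1 to label 2, so five of the six points agree with their ground-truth label.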
Copyright (c) 2022 Connor Lee, Albert Wang; Stefano Rizzo
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.