Depth vs. Complexity: A Comparative Study of Neural Network Architectures in Image Classification
DOI: https://doi.org/10.47611/jsrhs.v13i3.7237

Keywords: Artificial Intelligence, Neural Networks, Machine Learning, Image Classification, Image Recognition, Recognition Algorithm

Abstract
Image classification algorithms are increasingly required in fields such as medical imaging, autonomous vehicles, and surveillance. To streamline the design of such algorithms, one must be aware of the strengths and drawbacks of existing models. This paper investigates the performance of various image classification algorithms, focusing on the interplay between model depth and complexity and its effect on accuracy. The study uses three datasets (MNIST, Fashion MNIST, and CIFAR-10) to conduct a comprehensive analysis of six distinct image classification architectures. A discernible accuracy gradient emerges as model complexity increases, from standard Multilayer Perceptrons (MLPs) to a Vision Transformer (ViT). Training a ViT requires substantial computational resources, yet the investment is justified by the remarkable accuracy it achieves. Even so, it is always more efficient to use a model that fits the scale of the data: no single model is best for every dataset, and data complexity plays a vital role in determining the optimal architecture.
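The paper's exact training setup is not reproduced on this page, but the kind of baseline comparison the abstract describes, fitting a simple MLP to a small image dataset and measuring test accuracy, can be sketched with scikit-learn (Pedregosa et al., cited below). The `digits` dataset here is an 8x8 stand-in for MNIST, and all hyperparameters are illustrative assumptions, not the study's settings.

```python
# Illustrative sketch only: a small MLP baseline on a toy image dataset,
# of the kind compared against deeper architectures in the study.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Load 8x8 grayscale digit images, flattened to 64-feature vectors.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One hidden layer of 64 units; hyperparameters chosen for illustration.
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
mlp.fit(X_train, y_train)
print(f"MLP test accuracy: {mlp.score(X_test, y_test):.3f}")
```

Swapping in a deeper model (or a harder dataset such as CIFAR-10) and repeating the same train/evaluate loop is the essence of the depth-versus-complexity comparison the abstract summarizes.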
References or Bibliography
Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791
Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1409.1556
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1512.03385
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2016). Densely Connected Convolutional Networks. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1608.06993
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2010.11929
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830. https://hal.inria.fr/hal-00650905
Copyright (c) 2024 Mihir Kulgod

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.