Analyzing and Improving Existing Neural Network-Based Approaches to Identify AI-Generated Images
DOI: https://doi.org/10.47611/jsrhs.v13i1.5931

Keywords: Artificial Intelligence, AI-Generated Images, AI-Generated Faces, Detecting AI-Generated Faces, MobileNet-v2, StyleGAN

Abstract
AI-generated images, particularly of human faces, have the potential to spread misinformation, so detection tools are essential for telling fact from fiction. However, current neural network-based tools misclassify images with extreme photography conditions, facial obstructions, or post-processing. This project aimed to address the limitations of existing detection models, especially for poorly photographed or edited images. A MobileNet-v2-based neural network for identifying synthetic images was analyzed for possible shortcomings. After its baseline accuracy was obtained, the model was stress-tested on validation data whose brightness, contrast, hue, and saturation were edited in set increments. The model performed poorly on processed images and on faces with glasses or shadows. Possible causes included insufficient data augmentation, the global average pooling layer, too few training epochs, and inadequate regularization strength or dropout rate. The model was then modified to address these shortcomings by altering the brightness, contrast, hue, and saturation of the training data, adding Gaussian noise to training images, replacing global average pooling with global max pooling, increasing the number of training epochs, and changing the regularization strength and dropout rate. The modified model was evaluated on both augmented and non-augmented images. Overall, the model improved when Gaussian noise with a standard deviation of 0.1 was added, it was trained for 15 epochs, and dropout layers were removed: the Gaussian noise layer approximated degraded image quality, and the model was too simple to learn image features when trained for fewer epochs or when neurons were disabled by dropout. This research could be extended by testing multiple modifications simultaneously.
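The stress test described above can be sketched with standard TensorFlow image operations. This is a minimal illustration, assuming a compiled Keras classifier, a tf.data validation set of float images in [0, 1], and accuracy as the tracked metric; the increment values and the names stress_test and EDITS are hypothetical, not taken from the paper.

    import tensorflow as tf

    # The four photometric edits, each applied at increment f.
    EDITS = {
        "brightness": lambda x, f: tf.image.adjust_brightness(x, delta=f),
        "contrast":   lambda x, f: tf.image.adjust_contrast(x, contrast_factor=1.0 + f),
        "hue":        lambda x, f: tf.image.adjust_hue(x, delta=f),
        "saturation": lambda x, f: tf.image.adjust_saturation(x, saturation_factor=1.0 + f),
    }

    def stress_test(model, val_ds, increments=(0.1, 0.2, 0.3)):
        """Evaluate the model on copies of the validation set, with one
        photometric property edited per run; returns accuracy per edit."""
        results = {}
        for f in increments:
            for name, edit in EDITS.items():
                # Clip so edited pixel values stay in the valid [0, 1] range.
                edited = val_ds.map(
                    lambda x, y, e=edit, f=f: (tf.clip_by_value(e(x, f), 0.0, 1.0), y))
                _, accuracy = model.evaluate(edited, verbose=0)
                results[(name, f)] = accuracy
        return results

Comparing these accuracies against the unedited baseline isolates which photometric property the detector is most sensitive to.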
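The best-performing combination reported above (a Gaussian noise layer with standard deviation 0.1, global max pooling in place of global average pooling, 15 training epochs, and no dropout) could be assembled in Keras roughly as follows. The 128x128 input size matches the cited Kaggle face dataset, but the preprocessing, optimizer, and classification head are assumptions rather than the paper's exact configuration.

    import tensorflow as tf
    from tensorflow.keras import layers

    # MobileNet-v2 backbone without its ImageNet classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(128, 128, 3), include_top=False, weights="imagenet")

    model = tf.keras.Sequential([
        layers.Input(shape=(128, 128, 3)),
        layers.Rescaling(1.0 / 127.5, offset=-1.0),  # MobileNet-v2 expects [-1, 1]
        layers.GaussianNoise(0.1),   # active only during training; applying the
                                     # stddev on rescaled inputs is an assumption
        base,
        layers.GlobalMaxPooling2D(),            # replaces global average pooling
        layers.Dense(1, activation="sigmoid"),  # real vs. AI-generated
    ])

    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Assuming tf.data datasets of (image, label) batches with pixel
    # values in [0, 255]:
    # model.fit(train_ds, validation_data=val_ds, epochs=15)

Because the GaussianNoise layer is applied only at training time, evaluation still runs on clean inputs, consistent with adding noise to training images while testing on both augmented and non-augmented data.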
Copyright (c) 2024 Sanika Rewatkar
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.