Audio Classification of Bird Species Using Convolutional Neural Networks
DOI:
https://doi.org/10.47611/jsrhs.v12i1.4108
Keywords:
Sound classification, CNN, Spectrogram
Abstract
Bird populations have declined by billions of individuals over the last 50 years, so an accurate method for classifying bird species is necessary for conservation efforts and population monitoring. One promising approach is to use machine learning models to classify birds by their sounds. Compared with alternatives such as image classification, audio-based classification is less affected by environmental factors (e.g., habitat, time of day) and causes less disturbance to birds during data collection. Because audio processing may eventually become the main method of classifying birds and an important conservation tool, it is imperative to understand the challenges that must be overcome before it can be applied successfully. In this work, convolutional neural networks (CNNs), implemented in Python, were used to process and classify audio recordings from over 150 different bird species. This study demonstrates that although audio classification is promising, many challenges remain in the field: the variety among the different calls of a single bird, the presence of background noise in many recordings, and the difficulty of efficiently representing an audio signal as an image. Overcoming these challenges is important for conservation efforts.
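The abstract does not describe the implementation, so the following is only a minimal, dependency-free sketch of the general idea behind the "Spectrogram" keyword: slicing an audio signal into short windowed frames and taking the magnitude of each frame's Fourier transform, which yields a time-by-frequency grid that a CNN could treat as an image. The function names `make_tone` and `spectrogram`, the parameters `frame_len` and `hop`, and the synthetic 1 kHz tone standing in for a recording are all assumptions made for illustration, not the authors' method. In practice a library FFT would likely replace the naive transform used here.

```python
import cmath
import math


def make_tone(freq_hz, sample_rate, n_samples):
    """Hypothetical stand-in for a bird recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram from a Hann-windowed short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes, one per non-negative frequency bin.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Naive DFT for bin k (illustrative; an FFT is far faster).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


sample_rate = 8000   # assumed sampling rate in Hz
frame_len = 64       # assumed frame length in samples
tone = make_tone(1000.0, sample_rate, 512)
spec = spectrogram(tone, frame_len=frame_len, hop=32)

# Locate the strongest frequency bin in the first frame and convert to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_len
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time step and each column one frequency band, so the nested list can be read as a grayscale image. Real recordings are far less clean than this tone, which is where the background noise and call variety mentioned in the abstract make the task harder.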
Copyright (c) 2023 Jocelyn Wang; Guillermo Goldsztein
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.