GuideDogNet: A Deep Learning Model for Guiding the Blind in Walking Environments
DOI: https://doi.org/10.47611/jsrhs.v10i4.2064

Keywords: artificial intelligence, guiding blind, object detection, depth prediction

Abstract
A guide dog is a critical companion that enables blind people to travel independently. However, because the training process is costly and time-consuming, only 1.7% of blind people who wish to adopt a guide dog are able to obtain one. To alleviate this social problem, previous studies have proposed blind guiding systems that rely heavily on hardware devices such as GPS (Global Positioning System), RFID (Radio-Frequency Identification), and ultrasonic sensors. However, these approaches are difficult to deploy in real-world environments, and they provide only limited obstacle warnings, which makes the resulting systems non-user-friendly. To guide the blind universally and provide accurate information about obstacles without cumbersome devices, we propose GuideDogNet, a novel deep learning-based blind guiding system. The proposed system consists of an object detection network, a depth prediction network, and a post-processing module. To provide user-friendly outputs for the blind, we propose a rule-based post-processing module that outputs the label, direction, and distance of each obstacle by combining the results of the object detection network and the depth prediction network. We achieved an mAP of 67.8 on the publicly available AI Hub Sidewalks dataset. To the best of our knowledge, this is the first attempt at a deep learning-based blind guiding system.
The code will be available at https://github.com/Yunseo-Hwang/AI-GuideDog
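The abstract describes a rule-based post-processing module that fuses detections with a predicted depth map to report each obstacle's label, direction, and distance. The paper does not give the exact rules, so the following is only an illustrative sketch under simple assumptions: direction is taken from which horizontal third of the frame the box centre falls in, and distance is the median depth inside the box (all function and field names are hypothetical).

```python
import numpy as np

def postprocess(detections, depth_map, img_width):
    """Illustrative rule-based post-processing (not the authors' exact rules):
    for each detected obstacle, derive a coarse direction from the box centre
    and a distance from the depth map, producing the label/direction/distance
    outputs described in the abstract."""
    guidance = []
    for det in detections:  # det: {"label": str, "box": (x1, y1, x2, y2)}
        x1, y1, x2, y2 = det["box"]
        cx = (x1 + x2) / 2
        # Split the frame into thirds to get a coarse direction cue.
        if cx < img_width / 3:
            direction = "left"
        elif cx < 2 * img_width / 3:
            direction = "ahead"
        else:
            direction = "right"
        # Median depth inside the box is robust to background pixels.
        region = depth_map[int(y1):int(y2), int(x1):int(x2)]
        distance_m = float(np.median(region))
        guidance.append({"label": det["label"],
                         "direction": direction,
                         "distance_m": round(distance_m, 1)})
    return guidance
```

In a full pipeline, `detections` would come from the object detection network (e.g. a Deformable DETR head) and `depth_map` from the depth prediction network (e.g. DPT), with the returned dictionaries rendered as spoken guidance for the user.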
References
Wei, Yuanlong, Xiangxin Kou, and Min Cheol Lee. "A new vision and navigation research for a guide-dog robot system in urban system." 2014 IEEE/ASME International Conference on Advanced Intelligent Mechatronics. IEEE, 2014.
Hapsari, Gita Indah, Giva Andriana Mutiara, and Dicky Tiara Kusumah. "Smart cane location guide for blind using GPS." 2017 5th International Conference on Information and Communication Technology (ICoIC7). IEEE, 2017.
Na, Jongwhoa. "The blind interactive guide system using RFID-based indoor positioning system." International Conference on Computers for Handicapped Persons. Springer, Berlin, Heidelberg, 2006.
Harsur, Anushree, and M. Chitra. "Voice based navigation system for blind people using ultrasonic sensor." IJRITCC 3 (2017): 4117-4122.
Zhu, Xizhou, et al. "Deformable DETR: Deformable transformers for end-to-end object detection." arXiv preprint arXiv:2010.04159 (2020).
Ranftl, René, Alexey Bochkovskiy, and Vladlen Koltun. "Vision transformers for dense prediction." arXiv preprint arXiv:2103.13413 (2021).
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE international conference on computer vision. 2015.
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.
Garg, Ravi, et al. "Unsupervised CNN for single view depth estimation: Geometry to the rescue." European conference on computer vision. Springer, Cham, 2016.
Liu, Fayao, Chunhua Shen, and Guosheng Lin. "Deep convolutional neural fields for depth estimation from a single image." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
Liu, Ze, et al. "Swin transformer: Hierarchical vision transformer using shifted windows." arXiv preprint arXiv:2103.14030 (2021).
Copyright (c) 2021 Yunseo Hwang, Taeseon Yoon, Kyuyong Park
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.