AiShifu: AI Karate Pose Trainer Using Human Pose Estimation
DOI: https://doi.org/10.47611/jsrhs.v12i3.5063
Keywords: AI; HPE; HRNET; Karate
Abstract
One application of artificial intelligence (AI) technology is self-guided physical activity, where a computing device acts as a trainer. A key challenge for these applications is how to measure the performance of such an AI trainer, especially when it runs on a generic PC or a mobile device. In the spirit of the Turing test, an AI trainer should mimic the behavior of a human trainer. A good human trainer generally considers the training history and skill level of the trainee when providing feedback, which requires more than body-position analysis. In this work, we built a martial arts trainer application called AiShifu that helps users practice martial arts poses using Human Pose Estimation (HPE). We chose an open-source neural network called HRNET, trained on the MS-COCO dataset, as the core of the HPE. The joint coordinates and angles were used to identify the pose being practiced by the trainee, determine whether the active side is left or right, and measure how close the key joint angle is to that from a “golden” image. We collected data from a black-belt martial artist and a novice trainee on three Karate poses. Based on the data, it is clear that the black belt performed the poses more consistently. A much larger sample size would be required to test how well an AI trainer can discern the difference between trainees of different proficiency levels. This understanding forms the foundation for customizing AI trainer software for different users.
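The joint-angle comparison described in the abstract can be sketched in a few lines of Python. The snippet below is an illustrative example only, not the AiShifu implementation: it assumes keypoints in the standard 17-point MS-COCO ordering (the format produced by HRNet models trained on MS-COCO), and the function names, the wrist-to-hip heuristic for the active side, and the 15-degree tolerance are assumptions chosen for demonstration.

# Illustrative sketch (not the paper's implementation): compute a joint angle
# from COCO-format keypoints returned by an HRNet-style pose estimator and
# compare it against the same angle measured on a "golden" reference image.
import numpy as np

# Indices of a few joints in the standard 17-keypoint MS-COCO ordering.
COCO_KEYPOINTS = {
    "left_shoulder": 5, "right_shoulder": 6,
    "left_elbow": 7, "right_elbow": 8,
    "left_wrist": 9, "right_wrist": 10,
    "left_hip": 11, "right_hip": 12,
    "left_knee": 13, "right_knee": 14,
    "left_ankle": 15, "right_ankle": 16,
}

def joint_angle(keypoints: np.ndarray, a: str, b: str, c: str) -> float:
    """Angle at joint b (degrees) formed by segments b->a and b->c.

    `keypoints` is a (17, 2) array of (x, y) pixel coordinates, e.g. one
    person's output from a top-down HRNet/MMPose model.
    """
    pa, pb, pc = (keypoints[COCO_KEYPOINTS[k]] for k in (a, b, c))
    v1, v2 = pa - pb, pc - pb
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def active_side(keypoints: np.ndarray) -> str:
    """Guess the active side as the arm whose wrist is farther from its hip
    (a simple heuristic, for illustration only)."""
    left = np.linalg.norm(keypoints[9] - keypoints[11])
    right = np.linalg.norm(keypoints[10] - keypoints[12])
    return "left" if left > right else "right"

def score_against_golden(trainee_kp, golden_kp, side="right", tol_deg=15.0):
    """Compare the trainee's elbow angle to the golden image's elbow angle."""
    joints = (f"{side}_shoulder", f"{side}_elbow", f"{side}_wrist")
    diff = abs(joint_angle(trainee_kp, *joints) - joint_angle(golden_kp, *joints))
    return {"angle_diff_deg": diff, "within_tolerance": diff <= tol_deg}

# Example usage with two (17, 2) keypoint arrays (trainee and golden reference):
#   result = score_against_golden(trainee_kp, golden_kp, side=active_side(trainee_kp))
#   print(result)  # e.g. {'angle_diff_deg': 9.3, 'within_tolerance': True}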
References or Bibliography
https://zenia.app. Yoga trainer. Retrieved on March 8, 2023.
https://www.hit.coach. Retrieved on March 8, 2023.
Sun, K., Xiao, B., Liu, D., & Wang, J. (2019). Deep high-resolution representation learning for human pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5693-5703).
https://github.com/HRNet. Retrieved on March 8, 2023.
Yu, C., Xiao, B., Gao, C., Yuan, L., Zhang, L., Sang, N., & Wang, J. (2021). Lite-HRNet: A lightweight high-resolution network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10440-10450). https://doi.org/10.1109/cvpr46437.2021.01030
https://github.com/HRNet/Lite-HRNet. Retrieved on May 30, 2023.
Sun, K., Zhao, Y., Jiang, B., Cheng, T., Xiao, B., Liu, D., Mu, Y., Wang, X., Liu, W., & Wang, J. (2019). High-Resolution Representations for Labeling Pixels and Regions. arXiv:1904.04514. https://arxiv.org/pdf/1904.04514.pdf
Dittakavi, B., Bavikadi, D., Desai, S. V., Chakraborty, S., Reddy, N., Balasubramanian, V. N., Callepalli, B., & Sharma, A. (2022). Pose Tutor: An explainable system for pose correction in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (pp. 3540-3549).
Aju, A., Mathew, C., & Prakasi, O. S. G. (2022). PoseNet based Model for Estimation of Karate Poses. Journal of Innovative Image Processing, 4(1), 16–25. https://doi.org/10.36548/jiip.2022.1.002
Ait-Bennacer, F.-E., Aaroud, A., Akodadi, K., & Cherradi, B. (2022). Applying Deep Learning and Computer Vision Techniques for an e-Sport and Smart Coaching System Using a Multiview Dataset: Case of Shotokan Karate. International Journal of Online and Biomedical Engineering (iJOE), 18(12), 1-14. https://doi.org/10.3991/ijoe.v18i12.30893
Zhang, X., Wu, X., & Song, L. (2022). Arm Movement Analysis Technology of Wushu Competition Image Based on Deep Learning. Computational Intelligence and Neuroscience, 2022, 9866754. https://doi.org/10.1155/2022/9866754
Kamel, A., Liu, B., Li, P., & Sheng, B. (2019). An investigation of 3D human pose estimation for learning Tai Chi: A human factor perspective. International Journal of Human-Computer Interaction, 35(4-5), 427-439.
Thanh, N. T., Hung, L. V., & Cong, P. T. (2019). An evaluation of pose estimation in video of traditional martial arts presentation. Research and Development on Information and Communication Technology, 2019(2), 1-8. https://doi.org/10.32913/mic-ict-research.v2019.n2.864
MMPose Contributors. (2020). OpenMMLab pose estimation toolbox and benchmark. https://github.com/open-mmlab/mmpose
Newell, A., Huang, Z., & Deng, J. (2017). Associative embedding: End-to-end learning for joint detection and grouping. Advances in neural information processing systems, 30.
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., ... & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V (pp. 740-755). Springer International Publishing.
Copyright (c) 2023 Frederick Lu
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.