DeafTech Vision: A Visual Computer's Approach to Accessible Communication through Deep Learning-Driven ASL Analysis

  • Shafayat Bin Shabbir Mugdha, Department of Computer Science & Engineering, United International University, Dhaka-1212, Bangladesh
  • Hridoy Das, Department of Computer Science & Engineering, United International University, Dhaka-1212, Bangladesh
  • Mahtab Uddin, Institute of Natural Sciences, United International University, Dhaka-1212, Bangladesh, https://orcid.org/0000-0002-6526-9276
  • Md. Easin Arafat, Data Science and Engineering Department, Faculty of Informatics, Eötvös Loránd University, Pázmány Péter str. 1/A, 1117 Budapest, Hungary
  • Md. Mahfujul Islam, Department of Computer Science & Engineering, United International University, Dhaka-1212, Bangladesh
Keywords: Deaf Community, American Sign Language, Convolutional Neural Network, Recurrent Neural Network, Hearing-Impaired, Rectified Linear Unit, Cambridge Hand Gesture

Abstract

Sign language is the primary means of communication for many people with hearing and speech impairments, yet it is often difficult for those without such disabilities to understand. Sign language is not, however, confined to the deaf community: it has been officially recognized in numerous countries and is increasingly offered as a second language in educational institutions. It has also proven useful in professional sectors such as interpreting, education, and healthcare, where it facilitates communication between people with and without hearing impairments. Advanced technologies, including computer vision and machine learning algorithms, are used to interpret and translate sign language into spoken or written form. These technologies aim to promote inclusivity and provide equal opportunities for people with hearing impairments in domains such as education, employment, and social interaction. In this paper, we implement the DeafTech Vision (DTV-CNN) architecture, a convolutional neural network for recognizing American Sign Language (ASL) gestures using deep learning techniques. Our main objective is to develop a robust ASL sign classification model that enhances human-computer interaction and assists individuals with hearing impairments. In extensive evaluation, the model consistently outperformed baseline methods in precision, achieving an accuracy of 99.87% on the ASL alphabet test dataset and 99.94% on the ASL digit dataset, significantly exceeding previous research that reported 90.00% accuracy. We also illustrate the model's learning trends and convergence behavior using loss and error graphs. These results highlight the DTV-CNN's effectiveness in distinguishing complex ASL gestures.
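The abstract does not specify the DTV-CNN layer configuration, so the following is only a minimal, hypothetical sketch of a CNN-based ASL image classifier in Keras, meant to illustrate the general approach (convolution, ReLU activations, pooling, and a softmax output over sign classes). The input size, layer widths, and the 26-class alphabet assumption are illustrative choices, not the authors' reported architecture.

```python
# Minimal, hypothetical CNN sketch for ASL sign classification (NOT the DTV-CNN spec).
# Assumptions: 64x64 RGB gesture images and 26 output classes (ASL alphabet).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26           # assumption: one class per ASL alphabet letter
INPUT_SHAPE = (64, 64, 3)  # assumption: 64x64 RGB input images

def build_asl_cnn():
    """Builds a small convolutional classifier with ReLU activations."""
    model = models.Sequential([
        layers.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_asl_cnn()
    model.summary()  # prints the layer-by-layer architecture
```

In practice such a model would be trained with `model.fit(...)` on labeled ASL alphabet or digit images, with the loss and accuracy curves over epochs used to inspect learning trends and convergence, as described in the abstract.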
Published
2024-06-13
How to Cite
Mugdha, S. B. S., Das, H., Uddin, M., Arafat, M. E., & Islam, M. M. (2024). DeafTech Vision: A Visual Computer’s Approach to Accessible Communication through Deep Learning-Driven ASL Analysis. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-2020
Section
Research Articles