Sign Language Gesture to Speech Conversion Using Convolutional Neural Network


Shreya Tope
Sadnyani Gomkar
Pukhraj Rathkanthiwar
Aayushi Ganguli
Pradip Selokar


Speech impairment is a genuine disability that prevents a person from speaking. People with this condition communicate with others in numerous ways, and sign language is one of the most widely used: body and hand gestures convey meaning, with each word represented by a specific sequence of gestures.

The goal of this paper is to translate human sign language gestures into speech. We first construct the dataset by saving hand gesture images in a database, and then train and test a deep convolutional neural network on these gesture images. When a user launches the application, it detects gestures matching those saved in the database and outputs the corresponding results. This system can assist people who are hard of hearing while simultaneously making communication with them simpler for everyone else.
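The pipeline described above rests on the standard CNN building blocks: convolution to extract local features from a gesture image, a ReLU non-linearity, and max-pooling to reduce resolution before classification. The following is a minimal pure-Python sketch of those stages on a toy image; it is illustrative only and does not reproduce the paper's actual architecture, which would be built and trained in a deep-learning framework on the full gesture dataset.

```python
# Sketch of the CNN stages (convolution -> ReLU -> max-pooling) used in
# gesture recognition. All names and the toy kernel below are illustrative,
# not the authors' model.

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            )
    return out

def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """2x2 max-pooling with stride 2."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]

# Toy 6x6 "gesture image": dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# A hand-written vertical-edge kernel; a trained CNN learns such filters.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

# Feature map responds where the dark-to-bright edge lies.
features = max_pool2x2(relu(conv2d(image, edge_kernel)))
print(features)  # → [[3, 3], [3, 3]]
```

In a real system a deep stack of such layers, with learned kernels, feeds a fully connected classifier whose predicted gesture label is then passed to a text-to-speech engine.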


How to Cite
Tope, S., Gomkar, S., Rathkanthiwar, P., Ganguli, A., & Selokar, P. (2023). Sign Language Gesture to Speech Conversion Using Convolutional Neural Network. International Journal of Next-Generation Computing, 14(1).

