Automated Sign to Speech Conversion Model using Deep Learning


Dr. Premanand Ghadekar
Divsehaj Singh Anand
Aryan Kumar Gupta
Dheeraj Sharma
Preeti Oswal
Shreyas Khare

Abstract

The inability to communicate verbally is a disability. Several methods exist for exchanging thoughts and
interacting, the most predominant of which is the use of hand gestures. The prime motive of the proposed research
is to bridge the research gap in Sign Language Recognition with maximum efficiency. The goal is to replace
the human mediator with a machine, minimizing human interference. This paper focuses on the recognition of
American Sign Language (ASL) in real time. The challenging part of designing an automatic sign language
translator lies in selecting a classifier that labels static input gestures with high accuracy. In the proposed
system, a CNN architecture is used to design the classifier for sign language recognition. The model and the
pipeline architecture are built on a Keras-based convolutional neural network that classifies 27 characters: the
26 English alphabets and one unique character, space. The system trained the classifier with different parameter
configurations and tabulated the results. The proposed study achieved an accuracy of 99.88% on the test set.
The results show that model accuracy improves as more data is collected from various subjects for training.
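The abstract describes a Keras-based CNN that classifies 27 static gesture classes (26 letters plus space). The following is a minimal sketch of such a classifier; the layer sizes and the 64×64 grayscale input are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of a Keras CNN for 27-class static-gesture classification
# (26 English letters + the space character). Layer widths and the 64x64
# single-channel input are assumed for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 27  # 26 letters + space


def build_classifier(input_shape=(64, 64, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),   # low-level edge features
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),   # higher-level shape features
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_classifier()
```

Training would then proceed with `model.fit` on labeled gesture images, varying hyperparameters (as the abstract describes) and comparing test-set accuracy across configurations.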


Author Biographies

Dr. Premanand Ghadekar, Professor & Head, Department of Information Technology, Vishwakarma Institute of Technology, Pune

Dr. Premanand P. Ghadekar / http://orcid.org/0000-0003-3134-137X. He received his
Ph.D. from SGBA University, Amravati, in 2016. He completed an M.Tech in Electronics (Computer) at College of Engineering, Pune, in 2008, and received a BE in Electronics and Telecommunication Engineering from Government
College of Engineering, Amravati, in 2001. In 2003, he joined the Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra, India. He
is working as Professor and Head of the Information Technology Department, VIT Pune.
His areas of research are IoT, Machine Learning, Deep Learning, Image Processing, Computer Vision, Video Processing, Dynamic Texture Synthesis and Embedded Systems. He
holds two patents. He has contributed 12 papers to international conferences, 16 papers to international journals and 8 papers to Springer book series. He is
a life member of ISTE, a member of CSI and a member of IEEE. He has completed a funded research
project worth two lakhs.

Divsehaj Singh Anand, Student, Department of Information Technology, Vishwakarma Institute of Technology, Pune, India

Mr. Divsehaj Singh Anand. He is an undergraduate student at Vishwakarma Institute
of Technology, Pune, pursuing a B.Tech in Information Technology. His research domains
include Deep Learning, Machine Learning, Natural Language Processing and Business
Analytics. He has published two research papers in reputed Scopus- and Web of
Science-indexed journals and conferences, and has internship experience at Fortune 500
companies. He is also an active member of the Computer Society of India, VIT Pune Chapter,
and has served as a presidential body member of the team.

Aryan Kumar Gupta, Student, Department of Information Technology, Vishwakarma Institute of Technology, Pune, India

Mr. Aryan Kumar Gupta. He is an undergraduate at Vishwakarma Institute of
Technology, Pune, pursuing a Bachelor's in Information Technology (Batch of 2022). He has
experience working with a research lab at the Indian Institute of Technology, Kharagpur,
and has interned at NVIDIA. His research interests include Deep Learning, Natural Language
Processing, Computer Vision and Recommendation Systems.
He has published two research papers, one of them in Springer, and another is
under review at an Elsevier journal.

Dheeraj Sharma, Student, Department of Information Technology, Vishwakarma Institute of Technology, Pune, India

Mr. Dheeraj R. Sharma. He is an undergraduate at Vishwakarma Institute of Technology,
Pune, pursuing a Bachelor's in Information Technology (Batch of 2022). His research
domains include Machine Learning, Deep Learning, Computer Vision and Applied Science.
He has published two research papers, one of them in Springer publications. He
has been an active IEEE member and has served as Head of the IEEE chapter of VIT.

Preeti Oswal, Student, Department of Information Technology, Vishwakarma Institute of Technology, Pune, India

Ms. Preeti Oswal. She is a B.Tech student at Vishwakarma Institute of Technology. She
worked at Tata Communications as a Project Trainee for six months in 2020–2021 and
as an intern at VMware in 2021. She has experience working with Java,
Spring Boot, Angular, C++, Machine Learning and Deep Learning.

Shreyas Khare, Student, Department of Information Technology, Vishwakarma Institute of Technology, Pune, India

Mr. Shreyas V. Khare. He is an undergraduate at Vishwakarma Institute of Technology,
Pune, pursuing a Bachelor's in Information Technology (Batch of 2022). His research
domains include Artificial Intelligence, Machine Learning, Deep Learning and Computer
Vision. He has published two research papers, one of them in Springer publications.

How to Cite
Ghadekar, P., Anand, D., Gupta, A., Sharma, D., Oswal, P., & Khare, S. (2021). Automated Sign to Speech Conversion Model using Deep Learning. International Journal of Next-Generation Computing, 12(5). https://doi.org/10.47164/ijngc.v12i5.427
