Priyanjali Gupta, an engineering student, has created a unique artificial intelligence model capable of translating sign language into English in real time.
By: Ma. Andrea M. Pacardo
We are still in the grip of the pandemic, and electronic communication has become an unavoidable part of life. Against this backdrop, a visionary engineer from India realized that people who use sign language could not communicate well during video conversations. In response, she set out to create a one-of-a-kind AI model capable of translating gestures in order to make deaf people’s lives easier.
Priyanjali Gupta, a third-year engineering student at India’s Vellore Institute of Technology, built the Real-Time Sign Language Detection model using the TensorFlow Object Detection API. It translates hand gestures using transfer learning from a pre-trained model called ssd_mobilenet. She posted her innovation on LinkedIn, where it received over 60,000 likes and about 1,200 comments from people who were amazed by her idea.
Gupta’s software converts signs into English text by using image recognition technology to analyze the movements of different body parts, such as arms and fingers. Because it works in real time, the AI program offers a dynamic way of communicating with deaf or hard-of-hearing people. It can currently convert six gestures: Yes, No, Please, Thank You, I Love You, and Hello. A larger amount of sign language data would be required to build a more reliable model, but as a proof of concept, it appears to work.
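As a rough illustration of the last step in a pipeline like this, the sketch below turns an object detector's output for one frame into one of the six phrases. It is not Gupta's code: the ID-to-label ordering, the 0.5 confidence threshold, and the best_gesture helper are assumptions made for illustration, and the detections are hand-written stand-ins for what a detector might return.

```python
# Hypothetical label map covering the six gestures named in the article.
# The ID-to-label ordering is an assumption, not taken from the project.
LABELS = {1: "Hello", 2: "Yes", 3: "No", 4: "Please", 5: "Thank You", 6: "I Love You"}


def best_gesture(class_ids, scores, min_score=0.5):
    """Return the label of the highest-scoring detection, or None if
    no detection reaches min_score (an assumed threshold)."""
    best_label, best_score = None, min_score
    for class_id, score in zip(class_ids, scores):
        if score >= best_score and class_id in LABELS:
            best_label, best_score = LABELS[class_id], score
    return best_label


# Stand-in detector output for single webcam frames.
print(best_gesture([5, 2, 3], [0.91, 0.40, 0.12]))  # confident "Thank You"
print(best_gesture([1], [0.30]))                    # nothing confident enough
```

Returning None for low-confidence frames lets a caller show no caption at all rather than a guess.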
THANK YOU, MOM!
Gupta credited her mother with inspiring the software, saying she had encouraged her to try something new as an engineering student. Priyanjali said that the Alexa device, which responds to voice commands, is useless to people who cannot speak or hear, and that is when the idea of developing an AI model occurred to her.
HARD WORK DOES PAY OFF!
Priyanjali began developing the concept in February 2021. She started building the model in December of last year, but it did not go well. She tried again last month with the help of the internet, doing all of the research on her own, and recalled that she went without sleep for three nights in a row to complete the model.
MORE TO COME!
The dataset was created and annotated by hand using a computer webcam, and the model is currently trained on single frames. To recognize video, it will need to be trained on a large number of frames, most likely using long short-term memory (LSTM) networks. While admitting that developing software for sign recognition is tough, Gupta expressed hope that the open-source community will soon help her find a solution that allows her to expand her work.
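A sequence model such as an LSTM works on short runs of consecutive frames rather than on single images. The sketch below shows only that data-preparation step, under stated assumptions: the sliding_windows helper, the window length of 3, and the stride of 1 are illustrative choices, and the frame names are placeholders for real image data. It does not reflect how Gupta plans to implement the change.

```python
def sliding_windows(frames, window=3, stride=1):
    """Group consecutive frames into fixed-length clips that a sequence
    model could consume. Returns an empty list if there are fewer
    frames than the window length."""
    last_start = len(frames) - window
    return [frames[i:i + window] for i in range(0, last_start + 1, stride)]


# Placeholder frame names standing in for webcam images.
frames = ["f0", "f1", "f2", "f3", "f4"]
for clip in sliding_windows(frames):
    print(clip)
```

Each clip would then carry one label for the whole gesture, instead of one label per frame.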
Nonetheless, the project is still in its early stages, and full implementation is likely to take some time. She is currently looking for a better platform and guidance to improve the model.
Image credit: Github, The Courier, ZME Science