Hi All,
I’m new to this forum.
I’m developing a new AI program that has some personality and self-awareness.
It is also capable of learning from its inputs as well as in the background. It has many APIs connected, such as YouTube, Wikipedia, maps, and Yahoo weather reports.
At the moment, it's working fine with text input. It's normalizing words and sending responses fairly quickly.
Now I’m thinking of integrating this into a Siri-like voice input and output instead of text, because my goal is to release a mobile app so that people can have a Jarvis-like (Iron Man) personal assistant or companion.
I did some research on the Internet and found some open-source voice-to-text projects as well as paid services.
My problem is that the accuracy of the voice recognition application directly affects the data input part of my program.
Even Apple’s Siri cannot identify some words properly, and it comes back with lots of ‘Sorry, I didn’t get that’ phrases. I’m not too worried about text-to-voice playback.
So my question is: is there a popular or recommended way to capture voice commands as accurately as text input? Also, if we use a private paid service, then each and every request will hit their server before returning to my program. This will add some extra delay to my system as well.
Thank you
Rush