

Impact on voice synthesiser for input
 
 

Hi All,

I’m new to this forum.

I’m developing a new AI program that has some personality and self-awareness.
It is also capable of learning from its inputs as well as in the background. It has many APIs connected, such as YouTube, Wikipedia, maps and Yahoo weather reports.

At the moment, it’s working fine over text input. It’s normalizing words and sending responses fairly quickly.

Now I’m thinking of integrating this into a Siri-like voice input and output instead of text, because my goal is to release a mobile app so that people can have a Jarvis-like (Iron Man) personal assistant or companion.

I did some research on the Internet and found some open source voice-to-text projects as well as paid services.
My problem is that errors from the voice recognition application will directly affect the data input side of my program.

Even Apple’s Siri cannot identify some words properly, and it comes back with lots of
‘Sorry, I didn’t get that’ phrases. I’m not too worried about text-to-voice playback.

So my question is: is there a popular or recommended way to capture voice commands as accurately as text input? Also, if we use a private paid service, then each and every request will hit their server before returning to my program. This will add some extra delay to my system as well.
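
To make the accuracy question concrete: when the set of voice commands is small and known in advance, one common approach is to snap a noisy transcript to the closest known command, and to ask the user to repeat when nothing is close enough. Below is a minimal sketch using Python’s standard difflib module. The command list, the function name snap_to_command and the 0.6 cutoff are all made up for illustration; they are not from any particular speech service, and the transcript is assumed to arrive as plain text from whichever recognizer is used.

```python
import difflib

# Hypothetical command vocabulary; a real assistant would define its own.
KNOWN_COMMANDS = [
    "play music",
    "show weather",
    "open maps",
    "search wikipedia",
]


def snap_to_command(transcript, commands=KNOWN_COMMANDS, cutoff=0.6):
    """Return the known command closest to a noisy transcript, or None.

    cutoff is an assumed similarity threshold between 0 and 1; it would
    need tuning against real recognizer output.
    """
    # Lowercase and collapse repeated whitespace before comparing.
    cleaned = " ".join(transcript.lower().split())
    matches = difflib.get_close_matches(cleaned, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # A misheard word can still resolve to the intended command.
    print(snap_to_command("show whether"))
    # Nothing similar enough: the caller can ask the user to repeat.
    print(snap_to_command("qqqq"))
```

This only helps with a fixed command set; free-form dictation still depends on the quality of the recognizer itself. Because the matching runs locally, it adds no extra server round trip on top of the recognition request.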


Thank you
Rush

 

 

 

 
  [ # 1 ]

> quora.com/profile/Marcus-L-Endicott/answers/Speech-Synthesis

FYI, speech synthesis is generally considered output.  Currently, I’ve answered 10 questions on Quora under the speech synthesis topic.

> quora.com/profile/Marcus-L-Endicott/answers/Speech-Recognition

Speech recognition, on the other hand, is input. I’ve answered 23 questions on Quora about speech recognition, which you might find helpful:

- How can I learn to build a speech recognition app

- What are good speech recognition solutions for commercial use

- What is the intersection of natural language processing and sound or music interpretation methods

- How can I develop a book that will listen to someone’s problem, and provide the solution

- What are the best open source options available for speech to text conversion

- What are the best tools for converting speech to text

- What algorithms/technologies were used to make Siri

- I am working on a voice to text conversion system in MATLAB. How do I proceed

- What are the other voice-based (artificial intelligence) apps like Google Now, multiverse extreme, Siri, Cortana

- How do I create a voice assistant app for Android

- What are the software tools that can convert symbolically represented speech into text or written script

- Why is Google Voice’s speech-to-text engine so lousy? (see details)

- What are some voice control apps on Android

- What is good Linux software for voice assistance

- Is it possible to integrate all customer service numbers into Google speech recognition, so we can directly ask questions

- How do I make a robot which starts by hearing the command “on” and stops with “off”

- What is the current relationship between prosody in linguistics, speech recognition, and affective computing

- How do I make a speech recognition system which executes my voice commands? Which language should I use for this?

- What open-sourced and accurate speech-to-text engines and APIs currently exist

- Is anyone working on an open source version of Siri

- What are the similar companies to Siri

- What is the outlook for voice search

- What does smartaction.com do? What’s their technology based on?

 

 