Multilingual speech recognition patent needs caring home

Summary: US Patent 7,689,404, covering a faster, more modular multilingual ASR technique with a smaller footprint, is offered for sale or license

A novel technique for multilingual speech recognition (including regional variants) that combines existing single-language recognizers promises:

Increased modularity and flexibility: additional languages can be added or removed without requiring major reconfiguration

Less complex multilingual implementation – faster time to market

Enable multi-language speech recognition with a smaller memory footprint – less cost

Faster execution times than alternative technologies – better user experience

Easier implementation

Lower overall system costs

Lower on-going support costs

Offered for sale or license: US Patent 7,689,404, a method of enabling multilingual speech recognition by reduction to single-language recognizer engine components.
The purpose of the invention is to significantly reduce the computational complexity of multilingual and/or large-vocabulary speech recognition, while simplifying productization.
In addition to the instance where the language of an utterance is not known in advance, applications exist where the language may also change during an utterance: “I shouldn’t have been schadenfreude but c’est la vie”.

Business needs thus increasingly call for multi-language speech recognition across a wide range of applications. This patented technology offers a more efficient means of enabling existing single-language recognizers to support multiple languages simultaneously. It overcomes the traditional challenges and complexities of multilingual speech recognition, such as having to create hugely complex monolithic multilingual recognizers that support all languages. So,

Your existing single-language recognizers become reusable assets, and therefore
Overall system maintenance costs are reduced

In contrast to a monolithic recognizer, the method described in the patent reuses existing components of single-language speech recognizer engines, combining and controlling them in a way that enables automatic multilingual speech recognition across a range of supported languages and dialects.

A new component, the ‘Multilingual Dispatcher’ (MLD), envelops the language-independent components and invokes language-specific components to perform language-dependent processing. The MLD dispatches certain requests to individual recognizers, aggregates their responses and keeps track of the recognized sequence. The dispatcher is agnostic to how the single-language recognizers work internally. Thus, the hypothesis space is decomposed into sub-spaces, each visible to an individual recognizer, which reduces the complexity. Moreover, language-specific components themselves are not affected when a language is added to or removed from the application. So,

Modular recognizers are added or removed without affecting the rest of the system
This simplifies implementation and improves time to market, and
Reduces the incremental maintenance, upgrade and deployment costs of any single language

Smaller footprint and faster execution can be achieved, enabling smaller platforms and/or larger vocabularies
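
The dispatcher idea described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `MultilingualDispatcher`, its methods, the `toy_recognizer` helper and the scores are all hypothetical names and values chosen for illustration, and the scores are simply assumed to be comparable across recognizers already.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Hypothesis:
    language: str
    text: str
    score: float  # higher is better; assumed comparable across recognizers


# Hypothetical interface: a recognizer maps a segment to (text, score) pairs.
Recognizer = Callable[[str], List[Tuple[str, float]]]


class MultilingualDispatcher:
    """Toy dispatcher: forwards each segment to every registered recognizer,
    aggregates the responses and tracks the recognized sequence."""

    def __init__(self) -> None:
        self._recognizers: Dict[str, Recognizer] = {}
        self.recognized: List[Hypothesis] = []

    def add_language(self, language: str, recognizer: Recognizer) -> None:
        self._recognizers[language] = recognizer

    def remove_language(self, language: str) -> None:
        self._recognizers.pop(language, None)

    def languages(self) -> List[str]:
        return sorted(self._recognizers)

    def dispatch(self, segment: str) -> Optional[Hypothesis]:
        # The dispatcher never looks inside a recognizer; it only calls it.
        candidates = [
            Hypothesis(lang, text, score)
            for lang, rec in self._recognizers.items()
            for text, score in rec(segment)
        ]
        if not candidates:
            return None
        best = max(candidates, key=lambda h: h.score)
        self.recognized.append(best)
        return best


def toy_recognizer(lexicon: Dict[str, float]) -> Recognizer:
    """Stand-in for a real single-language engine: a plain lookup table."""
    def recognize(segment: str) -> List[Tuple[str, float]]:
        return [(segment, lexicon[segment])] if segment in lexicon else []
    return recognize


mld = MultilingualDispatcher()
mld.add_language("en", toy_recognizer({"but": -1.0, "life": -1.5}))
mld.add_language("fr", toy_recognizer({"c'est": -1.2, "la": -0.8, "vie": -1.1}))

for seg in ["but", "c'est", "la", "vie"]:
    mld.dispatch(seg)

# Each recognized item carries its language tag.
tags = [(h.language, h.text) for h in mld.recognized]
```

In this sketch, removing a language is a single `remove_language` call that leaves the other recognizers untouched, which mirrors the plug-in behaviour claimed in the text.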

Here a ‘language’ means anything for which a recognizer exists, so the invention applies both to different spoken languages and to different recognizer models or engines for subsets of the same spoken language, such as regional or ethnic accents or gender differences. So,

Reduction of complexity of acoustic models is possible, e.g. in dimensionality and the number of “feature vectors”, thus

Further reducing footprint and improving speed

This naturally leads to language tagging. For example, a mischievous toy or bot may react differently to commands issued by a man, a woman or a young child.

Key elements of the invention include a heuristic for making the numeric scores of hypotheses (such as Viterbi scores) comparable even when they are produced by different language-specific recognizers, and a heuristic for propagating (seeding) a hypothesis in one language from a hypothesis in a different language. Support for a specific language works like a replaceable plug-in, creating a structure that enables scalable deployment of any subset of supported languages in short order. So,

The complexity is at worst linear in the number of languages
Pruning of active unlikely hypotheses is automatically aggressive in unlikely languages
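
The announcement does not spell out the scoring heuristic, so the following Python sketch swaps in a simple stand-in: raw scores are converted to per-frame values and passed through a per-recognizer affine calibration, after which a single shared beam prunes hypotheses from all languages. The functions `normalize` and `prune`, the calibration constants and the raw scores are all assumptions made for illustration, not values or methods taken from the patent.

```python
from typing import Dict, List, Tuple

# (language, text, normalized score)
Scored = Tuple[str, str, float]


def normalize(raw_score: float, num_frames: int, offset: float, scale: float) -> float:
    """Per-frame score, shifted and scaled by per-recognizer constants.

    A stand-in heuristic: it only illustrates that scores from different
    engines must be mapped to a common range before they are compared.
    """
    return (raw_score / num_frames - offset) / scale


def prune(hyps: List[Scored], beam: float) -> List[Scored]:
    """Keep hypotheses within `beam` of the best score, across all languages."""
    best = max(score for _, _, score in hyps)
    return [h for h in hyps if h[2] >= best - beam]


# Hypothetical calibration: (offset, scale) per recognizer.
calibration: Dict[str, Tuple[float, float]] = {
    "en": (-5.0, 1.0),
    "de": (-50.0, 10.0),
}

# Hypothetical raw scores for one 100-frame segment: (language, text, raw score).
raw = [
    ("en", "shadow", -650.0),
    ("en", "shard", -700.0),
    ("de", "schaden", -4700.0),
    ("de", "schatten", -5600.0),
]

scored = [
    (lang, text, normalize(score, 100, *calibration[lang]))
    for lang, text, score in raw
]

# One shared beam: hypotheses in the unlikely language fall outside it together.
survivors = prune(scored, beam=1.0)
surviving_languages = {lang for lang, _, _ in survivors}
```

With these made-up numbers, both English hypotheses fall outside the shared beam while both German ones survive, which illustrates how a common beam over comparable scores prunes an unlikely language more heavily without any language-specific rule.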

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=1&f=G&l=50&d=PALL&S1=07689404&OS=PN/07689404&RS=PN/07689404
