  [ # 16 ]

Hi CR, welcome to the Chatbots.org forums! grin

The performance issue comes up very often. What we should actually do is identify methods for building fast chatbot implementations. What about a new thread solely on Optimizing Chatbot Performance?

 

 
  [ # 17 ]

@CR: Yes, the amount of processing required by my algorithm is immense.  My first implementation was a ‘recursive descent’ approach which, once the input reached 20 words, would take about 10 minutes to generate all the trees - yes, unacceptable.

However, since then I have rewritten a huge portion of that part of the bot engine and speeds are now acceptable.  For example, a 20-word input takes about 3 or 4 seconds.  Now don’t forget, this is in a single thread.  Later, I will figure out how I can have a main process spawn other threads to work in parallel, and run it on a quad-core CPU smile

I’ll ask our ‘C, C++, C#’ expert, Chuck, how I can do threading.  Also, like you, I am using an interpreted, dynamic language (also known as a ‘scripting language’), Perl.  Later, I will convert to C - I have even thought of coding some of the very CPU-intensive functions in assembly.  Very tight and efficient code smile
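The “main process spawns workers” plan above can be sketched with a standard thread pool.  This is just an illustrative Python sketch, not the poster’s Perl engine; `generate_trees` is a made-up stand-in for the real parse work.

```python
# Illustrative sketch: a main process farming inputs out to a pool
# of worker threads.  generate_trees is a stand-in, not a real parser.
from concurrent.futures import ThreadPoolExecutor

def generate_trees(sentence):
    # Pretend this is the CPU-heavy parse over one input.
    return f"trees({sentence})"

inputs = ["first input", "second input", "third input"]
# e.g. one worker per core on a quad-core box
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(generate_trees, inputs))
```

One caveat: in CPython specifically, threads do not speed up pure-Python CPU-bound work because of the global interpreter lock, so real parallelism there means processes - or, as planned above, dropping the hot functions down to C.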

Basically I have my ‘parse tree generators’ and then a ‘driver’ to call them.  The old way, recursive descent, was too CPU hungry.  I basically turned the ‘ptgen driver’ algorithm upside down: now it works bottom up, instead of top down.  That seemed to make all the difference in the world.
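For what it’s worth, one classic bottom-up scheme is chart parsing.  Here is a minimal CYK-style sketch over a toy grammar in Chomsky normal form - illustrative only, not the poster’s actual ptgen code:

```python
# Minimal bottom-up (CYK-style) chart parse over a toy CNF grammar.
def cyk_parse(words, lexicon, rules):
    n = len(words)
    # chart[i][j] holds the set of symbols spanning words[i..j]
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):           # seed with word-level symbols
        chart[i][i] = set(lexicon.get(w, ()))
    for span in range(2, n + 1):            # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):           # try every split point
                for parent, left, right in rules:
                    if left in chart[i][k] and right in chart[k + 1][j]:
                        chart[i][j].add(parent)
    return chart[0][n - 1]                  # symbols covering the whole input

lexicon = {"dogs": {"NP"}, "chase": {"V"}, "cats": {"NP"}}
rules = [("VP", "V", "NP"), ("S", "NP", "VP")]
result = cyk_parse(["dogs", "chase", "cats"], lexicon, rules)  # {'S'}
```

The payoff is the same one described above: each sub-span is analysed once and reused by every larger span, instead of being re-derived on every recursive descent.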

Also, another very important thing I make extensive use of is caching.  Often, two parse tree generators will be trying to figure out the same thing, so it is very important to cache that.  So the first thing a ptgen (parse tree generator) does is check whether the answer to its question is already in the cache; if not, it figures it out and puts the result in the cache, so the next ptgen doesn’t have to re-figure it.

The cache is cleared on the next user input, OR if you tell the bot a new word or edit a word’s definition.
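That check-cache-first loop, plus the flush on new input or a word edit, might look something like this - names (`ptgen_cache`, `analyze_span`) are illustrative, not from the actual engine:

```python
# Hedged sketch of a shared ptgen cache with explicit invalidation.
ptgen_cache = {}
expensive_calls = 0  # counts how often we actually recompute

def analyze_span(words):
    """Stand-in for a parse-tree-generator 'question' over a word span."""
    global expensive_calls
    key = tuple(words)
    if key in ptgen_cache:            # another ptgen already answered this
        return ptgen_cache[key]
    expensive_calls += 1              # the costly path
    result = " ".join(words).upper()  # stand-in for the real analysis
    ptgen_cache[key] = result
    return result

def clear_cache():
    """Called on the next user input, a new word, or an edited definition."""
    ptgen_cache.clear()

analyze_span(["the", "red", "ball"])
analyze_span(["the", "red", "ball"])  # second call is served from cache
clear_cache()
analyze_span(["the", "red", "ball"])  # recomputed after invalidation
```

After this runs, only two expensive computations have happened for three calls - the middle call was a cache hit.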

Something else to remember - computers are still using the ‘Von Neumann architecture’, which sure wasn’t designed for something like NLP, but rather for ‘programmable calculators’.  There may be no way to do this without employing several CPUs, perhaps racks of CPUs…*or* some very clever, efficient ‘serial’ algorithm designs.  Perhaps an NLP algorithm could be implemented as an ASIC (application-specific integrated circuit).

@Erwin - that may not be a bad idea.

 

 
  [ # 18 ]

Neither Python nor Perl is that much different in speed from C.  They are both written in C.

The verb “is” is a funny one when trying to make facts in a database.  I can easily think of three different ways the same grammar and structure of the simplest sentences change meaning when different words are substituted.

My fruit is orange.
My fruit is the orange.
My fruit is an orange.

And actually, sometimes these same sentences do not use a determiner to differentiate the meanings.

My car is the old one.
My car is the oldsmobile.

Here we have an object linking to an object, and an object linking to a class of objects (or object type).  The fruit example also shows an object linking to a modifier.
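A naive way to tell these readings apart is to key off the determiner.  The sketch below does exactly that (the relation names are made up for illustration) - and the car examples above show where it breaks down, since “the old one” and “the oldsmobile” share a determiner but link to an object and to a class respectively:

```python
# Naive determiner heuristic for the three readings of "X is Y".
# Relation names (has_property, identical_to, instance_of) are invented.
facts = []

def assert_is(subject, predicate):
    if predicate.startswith(("a ", "an ")):
        # "an orange" -> membership in a class of objects
        facts.append((subject, "instance_of", predicate.split(" ", 1)[1]))
    elif predicate.startswith("the "):
        # "the orange" -> identity with a specific known object
        facts.append((subject, "identical_to", predicate.split(" ", 1)[1]))
    else:
        # bare "orange" -> a property (here, a color)
        facts.append((subject, "has_property", predicate))

assert_is("my fruit", "orange")      # object -> modifier
assert_is("my fruit", "the orange")  # object -> object
assert_is("my fruit", "an orange")   # object -> class
```

A heuristic this shallow clearly isn’t enough on its own - which is exactly why lexical resources that record word senses and usage constraints matter here.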

One of the best systems I’ve run across for storing this kind of data is FrameNet, out of Berkeley.  The words are not only defined and related to associated concepts, but the entries also include the usage and constraints on how the word or word sense is commonly applied.  In that way you can tell when “orange” is a color, a specific object, a type of object, etc.

 
