An NLP + AI chat framework: Description and Discussion
  [ # 16 ]
Gary Dubuque - Apr 26, 2011:


It is such a shame that Andres can share his accomplishments, which he has represented in world conferences, and you still want proof.

After reading the first post and especially:

“I must say this language-prototype really works, its compiled under .NET 2.0 (is not an interpreter) and runs very fast! Has been talking to people over 8 million interactions in a few months, in LA.”

I was expecting a working example, yet when asked for one, he can only provide an alpha version, saying the programming is naive and only uses 1% of the potential…

I suspect the conference is just to share his ideas. Yes, of course, it is up to him what he wants to do, but once again I see someone starting off yet another thread about how it does this and does that, yet has no actual example.


  [ # 17 ]
Gary Dubuque - Apr 26, 2011:


It is such a shame that Andres can share his accomplishments, which he has represented in world conferences, and you still want proof.

The problem here is the consumer, not the inventor.  As Andres puts it, why waste time creating the dog and pony show when there is real work to be done?  I agree with him. In the journey to create good conversation, he only has to prove his efforts to himself and his sponsors.

Thank you Andres for sharing.  Your work is quite fascinating!

I’m especially interested in the co-referencing, but more so in the NLG. I envision some day a bot being able to manufacture script from its internal models and a focus on the co-reference concepts in the whole conversation, even if the bot is only talking to itself.

Maybe one thing this group missed is that DAML is, in part, an effort to create an AI standard language, albeit framed as an agent markup instead of AIML which is a bot markup.  Hence your contribution to that thread which is now removed.

Thank you for considering my work a contribution to this community :) and in this way giving some value to my effort. But the other people are also right: you (I) have to prove things once they are created; even the concept must be proved before starting the development.
I found the academic way, where there is a community of very well-instructed people trying to destroy your work and ban you from congresses and conferences with very sharp and accurate arguments, much more so than those emerging from forums. Forum arguments are also useful, indeed, so I am posting here because I appreciate all your thoughts, whether they are good or not. It's like “preparing for war”: you must exercise on the best battlefield, and this (for me) is the academic one. I also don't deny that quite good ideas DO emerge from forums, hence my long writings, which may help someone turn his thoughts towards better results, with them or even while criticizing me.

One more thing as a contribution: you mentioned NLG. I jumped into this fascinating world by accident a few years ago, not knowing that it even existed. I reinvented the wheel (such a waste of time), but luckily my invention gave me enough critical capacity to assess the ‘state of the art’, so I started to include some NLG in the project, starting from those points. The result is included in my DDL language, which not only involves the decision to get an answer, but also includes everything to do with the captured ‘things’ that are combined to make a reasonable response (an elaborated one), not a pre-written one as in most AIML scripting. Also, when I realized that the answers should not be “pre-written”, I found a huge performance and practicability problem: the people who write the scripts are not always programmers, and even if they are, you cannot ask a normal programmer to develop each and every way to handle linguistics, or he must also be an expert in linguistics. So what was the solution? To make a scripting language to handle the outcome. But stop! What to say? When? The simple you-I toggling of Eliza and AIML doesn't work at all, except to fascinate dummies with the right question! The need for something else was arising!
Also, if it is a script, the performance penalties for the parsing and a dynamic run-time are huge and limiting (unless you want to reinvent a wheel, like making a new JavaScript, Ruby or Python runtime engine).
I did something to overcome this: a reinvention of the rules of computing with “words”, creating a language with only overloaded operators that operate on “ideas” extracted from accurately parsed user “sayouts” or even created from scratch.
This language had to be compiled, because I didn't want it to be inefficient, so I made the effort to create a real compiler for this section (and afterwards for the whole language).
I gained experience in parsing while I developed a GLR parser, completely rebuilt from the Java CUP LALR parser generator; indeed the runtime is completely new, and I incorporated a new concept of ‘conditional’ parsing to allow it to parse NP-hard problems in linear time, under context (this work is part of my thesis; in some time it will be released).
But there was a problem too: what language should the compiler target?
Again, trying to avoid reinventing the wheel, I used the .NET framework to create a high-level language, to which I could add a big and extensible NLP runtime (currently with over 120 functions, many of them linguistic), and let this platform's compiler do the magic of converting this into optimized MSIL and then, at runtime (by means of the JIT), into highly optimized machine language.

That's all! And this works terrifically. The only penalty comes from the functions, not from the scripting nor from the dynamic linking, which is needed because you cannot compile all the scripts together (over 40 modules for a normal silly bot), so you must do some tricks to avoid using reflection, the golden pill of programmers but a real pain in the ass when speaking of performance.
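
To give readers a feeling for what "overloaded operators that operate on ideas" could look like, here is a minimal sketch in Python. It is not DDL and not the .NET implementation described above; the `Idea` class, its fields and the meaning given to `+` are hypothetical names chosen only for illustration. The role labels (subj, verb, odir, cmode) are borrowed from the sample output later in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Idea:
    """Hypothetical unit of captured meaning: a role label plus its words."""
    role: str
    words: tuple

    def __add__(self, other):
        # '+' is overloaded so that two ideas combine into a larger phrase.
        return Idea("phrase", self.words + other.words)

    def __str__(self):
        return " ".join(self.words)


# Ideas as they might be captured from a parsed user sentence.
subj = Idea("subj", ("Carlos",))
verb = Idea("verb", ("come",))
odir = Idea("odir", ("queso",))
cmode = Idea("cmode", ("mañana",))

# A response is composed with operators instead of a pre-written template.
answer = subj + verb + odir + cmode
print(answer)
```

The point of the sketch is only that the script author combines captured pieces with operators, while the linguistic work (agreement, inflection and so on) would live inside the operator implementations.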

On the other hand, this has a downside, which pops up when you realize what you can do with “sentences” or “objects”, like inflecting them and adding articles with the right agreement (gender and number). Oh, I forgot that you English speakers don't deal with more than a simple article, but other languages certainly do! Even verb conjugation in Spanish is a real nightmare (more than 50 irregular ways to conjugate a verb): there are about 7 simple tenses and 12 or more compound tenses. Even the phrasal compositions are not like English: Spanish speakers often line up 3-4 verbs plus some other particle to state an action, and worse, this is quite frequent in common language, as are the irregularities, which are the most frequent things, along with many more ambiguities.

Let me introduce some simple things in Spanish:
“La” is a feminine definite article, but is also a musical “note” (noun, masculine)
“Mi” is a 1st person singular possessive, and also a musical “note” (noun, masculine)
English suffers from a lot of ambiguities too (I know), but to disturb the programmer even more, we also have diacritics (many of them, on top of the vowels), and most people ‘forget them’ when typing. So, for example, “como” has about 7 ambiguous meanings! And even more: we do have complicated inflections, even nested ones!
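
The effect of forgotten diacritics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `LEXICON` below is invented for the example and is nowhere near complete, and `strip_diacritics` and `readings` are hypothetical helper names, not functions of the runtime discussed here.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(text):
    """Lowercase the text and drop accents and tildes (mañana -> manana)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Toy lexicon, invented for illustration only: (written form, reading).
LEXICON = [
    ("la", "definite article, feminine singular"),
    ("la", "noun: musical note"),
    ("mi", "possessive, 1st person singular"),
    ("mi", "noun: musical note"),
    ("como", "verb 'comer', 1st person singular present"),
    ("como", "conjunction/adverb: as, like"),
    ("cómo", "interrogative: how"),
    ("mañana", "noun/adverb: morning, tomorrow"),
]

# Index every entry under its diacritic-free spelling.
INDEX = defaultdict(list)
for form, reading in LEXICON:
    INDEX[strip_diacritics(form)].append((form, reading))


def readings(typed_word):
    """All lexicon entries the typed word could stand for, ignoring diacritics."""
    return INDEX.get(strip_diacritics(typed_word), [])


for form, reading in readings("como"):
    print(form, "->", reading)
```

Even with this toy lexicon, a user typing "como" without the accent leaves three candidate readings, and choosing among them needs the syntactic and semantic context of the whole sentence.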

This is quite a complication: dealing with those “inflections” is where I spent most of my time! When I constructed the anaphoric “bender” to replace pronouns and possessives in Spanish, it took 2 weeks to develop, but the English part took only a few hours!

In my time looking for something worthwhile on these matters, I saw no “bot” out there that can even do inflections (or at least no documented one), nor one that can interpret syntactic-semantic patterns to capture things from the user's text (other than single tagged-word constructs). For sure I must have missed something. Does anyone know of one? It would be nice to compare features and learn from it.

One more comment: some while ago I also saw the video of the latest (2009) Loebner Prize contest, and please, “man, this cannot be serious!”, so I don't even want to participate there! Only when the contest poses a serious challenge, with statistically significant tests, not a Guinness-like one meant only to fool some judges in a limited chat time, then and only then maybe I'll participate. By saying this I don't want to be disrespectful to those people who do believe in these contests, but it's not for me at all. I'll explain my reasons here:

I think the real challenge for a chatbot is not to entertain a human or fool him, but to prove useful, to do a task, something worthy of its mere existence! I know that gaming and entertainment are also a goal in themselves, but this will not be my goal!

So, now I am encouraged to make a platform for others to be able to easily build bots, including the necessary tools, functions and backbone piping to make a useful thing in less time, something whose goal should be to answer or deduce the right thing, in the right sequence, not to return a “joke” or some “annoying stuff” to fool the user.
And this bot should always ask again if it doesn't understand the human's construct, maybe giving the human a hint about its (the bot's) limited knowledge and guiding him (the human) toward the use of its (the bot's) real power, like doing semantic-like searches, constructing logic constraints when looking for something in an incremental way, etc. Those are my concerns, and there is so much to do on this that I cannot think of anything else but building new blocks and models every other day. Here is an example of a typical and useful application:

I have just finished a Library Reference-Search agent module, using only the linguistic power inside the bot's runtime; I did not make a bot interface yet, and I am surprised by the power of these functions on duty. As an example: you are looking for an author's name, a subject or a word which you only heard on the telephone and which is unknown to you, maybe sounding German or Spanish. You write it down as you heard it and ‘voilà’, the system retrieves in a flash (a few seconds) all the “authors” that sound like what you wrote, ordered by phonetic similarity and weighted spelling/meaning differences. Even if you misspelled the name and it's not an author, it guesses the word, disassembles the inflection, goes for ontological replacements and then looks inside the indexes, doing the same there.
I tried this with 150k book records, about half of the book titles published in Argentina, and it works acceptably well. On the other hand, to make the same search with ‘brute force’ you need about 600-900 SQL queries to get most of the names out; with my library you get it done in only 5 or 6 SQL queries! (If you happen to try this with Soundex or Metaphone indexing, you will get none or a third of all records, and unordered!)
My library uses a lot of weird academic stuff, but it works fine, returning about 5 to 20 books for a simple search, sorted by some kind of relevance and pertinence, and the results are promising, believe me! (It's not another Google PageRank, but it's a beginning!)
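
For readers who want to picture "ordered by phonetic similarity", here is a deliberately crude Python sketch. It is not the library described above, whose algorithms are not published in this thread: the rewrite table is a hypothetical handful of sound-alike substitutions, it ignores context (for example "c" before "e" or "i"), and the ranking simply uses the standard `difflib.SequenceMatcher` ratio as a stand-in for a real weighted measure.

```python
import difflib
import unicodedata


def fold(text):
    """Lowercase the text and strip diacritics."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical, very rough sound-alike rewrites (assumption, not a real standard).
REWRITES = [("qu", "k"), ("c", "k"), ("z", "s"), ("v", "b"), ("h", "")]


def phonetic_key(word):
    """Reduce a word to a crude 'as heard' spelling."""
    key = fold(word)
    for old, new in REWRITES:
        key = key.replace(old, new)
    return key


def rank(query, candidates, limit=5):
    """Return candidates ordered by similarity of their phonetic keys to the query."""
    qkey = phonetic_key(query)
    scored = [
        (difflib.SequenceMatcher(None, qkey, phonetic_key(c)).ratio(), c)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:limit]]


print(rank("karloz", ["Carla", "Carlos", "Marcos"]))
print(rank("kezo", ["beso", "peso", "queso"]))
```

In a real catalogue the keys would be precomputed and indexed, so that a handful of indexed lookups can replace the hundreds of brute-force queries mentioned above; this sketch scans a list in memory only to keep the example short.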

Hoping to have contributed some meaningful thoughts about my goals to the community, and wishing you all success in the bot-building challenge, I go back to my work!

And to you all: please forgive me for not writing clear, straight, polished English. My mother tongue is German (thus my long sentences), I then learned British English (brutish) at school from age 5, and I live in a Spanish-speaking country; even writing an academic paper is a hell of a job for me, believe me!

best regards



  [ # 18 ]


Probably one of the most similar projects of members on this site would be Victor Shulist’s GRACE/CLUES.

He is also an advocate of parsing understanding through linguistics.

I would also be interested in any documentation (user manual?) you might have in English, to get a better idea of what approaches you have taken.


  [ # 19 ]
Merlin - Apr 27, 2011:


Probably one of the most similar projects of members on this site would be Victor Shulist’s GRACE/CLUES.

He is also an advocate of parsing understanding through linguistics.

I would also be interested in any documentation (user manual?) you might have in English, to get a better idea of what approaches you have taken.

I will take a look there, thanks.
The manual is written in Spanish because the users in my country don't speak English (sorry), and translating the manual accurately (>125 pages in A4, Arial 10pt) is a huge task; maybe when it is finished and there is a commercial need, I'll do this :) Also, the manual is still under construction!
BTW: the manual doesn't explain the algorithms or the internal structure, only the surface, and a very abstract one! At this time I am not considering putting all this into the public domain; only the academic breakthroughs (if any) will be disclosed (as algorithms), but not code at this time.
best regards


  [ # 20 ]

Hello, I am happy to announce that my new dialog module has been giving good test results, and I am sharing the most daring sample I've got to work so far (sorry, again in Spanish):

¿ (subj:s_nom) -> quién hacer ?
+ text: karloz kezo manana koner
dT: 586,9 mS, Add-Phrase -> score: 0
=> ( [ (0-0)(subj:[email protected]*{Carlos}):Carlos ] [ (1-2)(odir:[email protected]*{queso}):queso ] [ (2-4)(cmode:[email protected]*{mañana}):mañana ] [ (3-1)(verb:x_ver#@*{comer}):comer ] )~1.2

=> [ subj:Carlos ] [ odir:queso ] [ cmode:mañana ] [ verb:comer ] ~1.2

(oind:o_ind) -> ¿Carlos come a/para qué?

Carlos come queso mañana

dT: 22,2 mS, FindBest -> score: 1,189504

I have been translating a little!
Here is a document to understand the current dialog system (sorry, it's rather commercially oriented, but it's a start in English)

