

Applied Problems: an AI evaluation standard for chatbots
 
 
  [ # 31 ]
Victor Shulist - Sep 9, 2010:

To me, one of the most effective ways to measure intelligence is how much a bot/human can accept representations of things in non-standard ways and cope with them.

Agreed.

Victor Shulist - Sep 9, 2010:

Just because most text books follow a consistent way to represent math problems doesn’t mean it is not an effective way to test a chatbot’s abilities to handle those more complex and as you say diabolical examples.

Those complex, non-textbook cases are the ones we want to use to develop and judge our bots' abilities and milestones.

Certainly a method for testing the extent of a bot’s grammatical training. One of many tools. And one that can be easily interpreted in terms of figuring out where a bot’s grammatical strengths and weaknesses are. All you have to do is look at the equations it generates!

Victor Shulist - Sep 9, 2010:

Also, don't forget that it may not be simple sentences, but rather an entire interactive dialog between the user and the AI just to determine the problem itself, with clarifying questions asked by the AI.

I agree that the ability to recognize a problem and ask intelligent questions concerning it are definite hallmarks of intelligence.

Victor Shulist - Sep 9, 2010:

How far have you come with your bot so far? Do you have any sample conversations, or any output samples of how it parses? Not many members on this site are focusing much on grammar in their bots.

I’m at the stage where I have a working parser and I have a working method for turning parsed sentences into a structured (factual) knowledge base. But there is definitely room for fine-tuning and improvement of the algorithms, especially for the parser. And there is infinite room for improving the knowledge base itself!

I'll post some sample I/O once I'm done with some current parser fiddling. There is no "chat mode", besides a completely naive one I made to test out my knowledge base. (It randomly spits out sentences related to one of the nouns in the user's input sentence. Just for fun.) Interaction is through structured teaching modes. Maybe I'll start a topic on it eventually.

Victor Shulist - Sep 9, 2010:

Also, does your bot generate many parse trees in the cases where many words can have multiple parts of speech?  Also, is one grammar rule applied to the whole sentence, or does the bot figure out the combination of grammar rules to apply to the entire sentence on its own?

It used to spew out many many parse trees and then try to discriminate between them. The discrimination took several stages. Some are hard-programmed rules about grammar. The grammar is also compared against known grammars, both for strict matching (high confidence) and for “fuzzy” matching (lower confidence). I say “fuzzy” because the matching makes allowances for variations in the number and placement of adjectives, adverbs and articles. Lastly, the grammar is ranked according to chunks of grammar found in other sentences (of length 3 or more POS). Basically, each known grammar is chunked into pieces of various length after it has been learned. The parser checks if any of these chunks are to be found in the unknown sentence. If so, confidence in the grammar guess rises.

I say it “used to” spew out many parse trees because this method took forever and a day. I’m messing around now with culling unlikely parses as the stages of parsing progress. I need to do more with this. Very much a work in progress. The current incarnation works much faster, but is far less accurate. We’ll see how it goes.
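To illustrate the chunk-matching stage, here is a minimal Python sketch; the function names, POS tags, and weights are all invented for illustration, and the fuzzy-matching stage is omitted:

```python
def chunks(pos_seq, min_len=3):
    """Yield all contiguous POS chunks of length >= min_len."""
    for size in range(min_len, len(pos_seq) + 1):
        for start in range(len(pos_seq) - size + 1):
            yield tuple(pos_seq[start:start + size])

def score_grammar(candidate, known_grammars):
    """Rank a candidate POS sequence against known grammars."""
    score = 0.0
    known_chunks = set()
    for grammar in known_grammars:
        if candidate == grammar:
            score += 10.0            # strict match: high confidence
        known_chunks.update(chunks(grammar))
    # Confidence rises for every candidate chunk seen in a known grammar.
    score += sum(1.0 for c in chunks(candidate) if c in known_chunks)
    return score

# Usage: rank candidate parses, then cull the low scorers early.
known = [["DET", "NOUN", "VERB", "DET", "NOUN"]]
candidates = [["DET", "NOUN", "VERB", "DET", "NOUN"],
              ["DET", "VERB", "VERB", "DET", "NOUN"]]
best = max(candidates, key=lambda c: score_grammar(c, known))
```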

 

 
  [ # 32 ]

That is remarkably close to how my engine works. The way I see it, there is really no way around the "parse tree explosion" issue. Luckily I have been able to decrease the processing time considerably. There was a certain number of words (depending on how many parts of speech those words could have) at which the processing time would explode. I think it was around 20 or so words, where it took 10 minutes! I have it down to about 5 seconds for 20 words right now.

For me, the most complex part of grammar parse tree production was prepositional phrases, the main problem being to determine the antecedent. So what I do (and this may sound insane, and perhaps it is, but as I say, I do have the processing time down enormously) is generate all parse trees, where each parse tree is a different option for what the antecedent of that prepositional phrase is modifying. Then the concept specifications come in and evaluate each, giving a kind of 'merit point' to the ones which make sense from a word-association point of view.
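In a toy Python sketch (nothing like my actual engine; the merit function and association data are invented stand-ins for the concept specifications):

```python
from itertools import product

def attachment_parses(heads, phrases):
    """Yield every assignment of each prepositional phrase to a head."""
    for combo in product(heads, repeat=len(phrases)):
        yield list(zip(phrases, combo))

def merit(parse, associations):
    """Award merit points from known word associations."""
    return sum(associations.get((head, phrase), 0) for phrase, head in parse)

heads = ["shot", "elephant"]                   # possible antecedents
phrases = ["in my pajamas"]
associations = {("shot", "in my pajamas"): 2}  # people wear clothes;
                                               # elephants generally don't
best = max(attachment_parses(heads, phrases),
           key=lambda p: merit(p, associations))
print(best)  # [('in my pajamas', 'shot')]
```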

You have to be the 'administrator' (with password authentication) before my bot will 'believe' what it is told. Any input given when you are logged in as admin is accepted as FACT. Any other input, when not logged in, or (later) when eventually I have it hooked up to the Internet, will be taken as-is, as simply a comment.

So, when you give a FACT to your bot, you store the selected parse tree (the one the bot ‘believes’ is the one that the user really meant).

Then, when you ask a question, do you compare the parse tree of the English question against fact parse trees? Is that how you are doing it? I was going to do it that way, but I have since changed my mind. I have what I call these 'concept specifications' which, after selecting the parse tree that I think the user really meant by his/her input, boil down to a fixed set of 'internal standard representations', perhaps something like your database. I then search through those to find a response.

However, and this is later down the road, perhaps next year, instead of matching a question to a fact, an external script that 'deals with' that type of question will be executed. A fact required by that script may need another script to provide the info, and so on to any depth, to derive an answer.

Looking forward to seeing some samples; your approach is strikingly similar to mine, I think.

 

 
  [ # 33 ]
Victor Shulist - Sep 9, 2010:

Then, when you ask a question, do you compare the parse tree of the English question against fact parse trees? Is that how you are doing it? I was going to do it that way, but I have since changed my mind. I have what I call these 'concept specifications' which, after selecting the parse tree that I think the user really meant by his/her input, boil down to a fixed set of 'internal standard representations', perhaps something like your database. I then search through those to find a response.

However, and this is later down the road, perhaps next year, instead of matching a question to a fact, an external script that 'deals with' that type of question will be executed. A fact required by that script may need another script to provide the info, and so on to any depth, to derive an answer.

Looking forward to seeing some samples; your approach is strikingly similar to mine, I think.

Interesting. It seems that we three are on a similar path. And I believe it is the right one.

 

 
  [ # 34 ]

Agreed!

There are different levels of intelligence and different types, but I don't think you can go wrong with centering your focus on language to develop a very powerful and useful bot!

 

 
  [ # 35 ]
Victor Shulist - Sep 9, 2010:

That is remarkably close to how my engine works. The way I see it, there is really no way around the "parse tree explosion" issue. Luckily I have been able to decrease the processing time considerably. There was a certain number of words (depending on how many parts of speech those words could have) at which the processing time would explode. I think it was around 20 or so words, where it took 10 minutes! I have it down to about 5 seconds for 20 words right now.

Ah, you're living the dream. :)

Victor Shulist - Sep 9, 2010:

For me, the most complex part of grammar parse tree production was prepositional phrases, the main problem being to determine the antecedent. So what I do (and this may sound insane, and perhaps it is, but as I say, I do have the processing time down enormously) is generate all parse trees, where each parse tree is a different option for what the antecedent of that prepositional phrase is modifying. Then the concept specifications come in and evaluate each, giving a kind of 'merit point' to the ones which make sense from a word-association point of view.

I've divided this into two separate components (perhaps artificially): (1) the process of determining each word's POS, and thus the overall grammar of the sentence, and (2) determining more complex concepts like the antecedent of a prepositional phrase. The second is identified at the stage when the sentence is added to the knowledge base. The reason for this is chronological: I began the parser long before I had constructed a way to store/access knowledge!

I think my knowledge base might be more akin to your parse trees. (It’s similar to a set of parse trees, but with branches connected to other nodes. Turns the whole thing into a webby mess. I love it.)

Currently my rules for governing prepositional phrases are all based on hard-coded grammar rules, which works well so far. But there are many ambiguous cases (your elephant in pajamas is a good example) for which the knowledge base should be tapped for interpretation clues.

The most bothersome aspect of complex sentences for me right now is conjunctions. My knowledge base works as a set of nodes (nouns), all pointing at each other, where the rest of the sentence specifies the "arrow". But there is only one direct object (at most) for each arrow, so sentences must be broken down into simple sentences before they can be learned. I've got a method for templating complex sentences to a set of simple sentences, where the template includes special characters that define the rule for how each simple sentence relates to the others. I've also got a few simple rules hard-wired.
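To illustrate, the bones of such a node-and-arrow store might look like this (a Python sketch; class and field names are invented, not my actual code):

```python
class Arrow:
    """One 'arrow' between noun nodes: a verb plus its modifiers,
    with at most one direct object (the target)."""
    def __init__(self, verb, target=None, conditions=None):
        self.verb = verb                    # stored in infinitive form
        self.target = target                # at most one direct object node
        self.conditions = conditions or {}  # e.g. {"in": yard_node}

class NounNode:
    def __init__(self, name):
        self.name = name
        self.arrows = []    # outgoing relations to other noun nodes

    def learn(self, verb, target=None, conditions=None):
        self.arrows.append(Arrow(verb, target, conditions))

# "Dogs chase cats in the yard" -> one arrow from dog to cat
dog, cat, yard = NounNode("dog"), NounNode("cat"), NounNode("yard")
dog.learn("chase", cat, conditions={"in": yard})
```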

Victor Shulist - Sep 9, 2010:

You have to be the 'administrator' (with password authentication) before my bot will 'believe' what it is told. Any input given when you are logged in as admin is accepted as FACT. Any other input, when not logged in, or (later) when eventually I have it hooked up to the Internet, will be taken as-is, as simply a comment.

I have a confidence rating (0-1) for every entry in the knowledge base. So far, as I'm the only one doing the teaching, all confidences are set to 1. (I'm always right ;) )

Victor Shulist - Sep 9, 2010:

Then, when you ask a question, do you compare the parse tree of the English question against fact parse trees? Is that how you are doing it?

My bot doesn't handle questions yet. But I don't plan to do it this way. I'm thinking something more along the lines of comparing question words (how, when, why, ...) with what types of nouns, verbs, adverbs, etc. answer them. This can be learned from conversation as well as hard-coded. I've got some other ideas as well, but they are ill-defined. I think my approach will become clearer as I improve my knowledge base.
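As a toy sketch of that question-word idea (all mappings and data here are invented for illustration):

```python
# Which semantic categories can answer which question words?
QUESTION_TYPES = {
    "when":  {"time", "date", "event"},
    "where": {"place", "location"},
    "who":   {"person"},
    "why":   {"reason", "cause"},
}

def answers(question_word, noun, noun_categories):
    """Could a noun of these categories answer this question word?"""
    wanted = QUESTION_TYPES.get(question_word, set())
    return bool(wanted & noun_categories.get(noun, set()))

noun_categories = {"Tuesday": {"time", "date"}, "Paris": {"place"}}
print(answers("when", "Tuesday", noun_categories))   # True
print(answers("where", "Tuesday", noun_categories))  # False
```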

Victor Shulist - Sep 9, 2010:

I was going to do it that way, but I have since changed my mind. I have what I call these 'concept specifications' which, after selecting the parse tree that I think the user really meant by his/her input, boil down to a fixed set of 'internal standard representations', perhaps something like your database. I then search through those to find a response.

I think I’m using WordNet’s synonym tools as the equivalent to your “internal standard representations”.

Victor Shulist - Sep 9, 2010:

However, and this is later down the road, perhaps next year, instead of matching a question to a fact, an external script that 'deals with' that type of question will be executed. A fact required by that script may need another script to provide the info, and so on to any depth, to derive an answer.

That would indeed be cool. After I’m done with my parser upgrade, I plan to learn parallel python. I think it’ll speed up the parser code, and work well for problems like answering questions. Many different approaches can be utilized side-by-side for later comparison and answer selection.

Victor Shulist - Sep 9, 2010:

Looking forward to seeing some samples; your approach is strikingly similar to mine, I think.

Thanks! I'll definitely be posting a thread about my project after my current round of improvements is finished.

 

 
  [ # 36 ]
C R Hunt - Sep 9, 2010:
Victor Shulist - Sep 9, 2010:

…...  I have it down to about 5 seconds for 20 words right now.

Ah, you're living the dream. :)

Yes, well, an UNBELIEVABLE amount of long hours at it!

C R Hunt - Sep 9, 2010:

I’ve divided into two separate components (perhaps artificially) (1) the process of determining each word’s pos and thus the overall grammar of the sentence and (2) determining more complex concepts like the antecedent of a prepositional phrase. The second is identified at the stage when the sentence is added to the knowledge base. The reason for this is chronological: I began the parser long before I had constructed a way to store/access knowledge!

Interesting. Right now I have 3 stages: Stage 1) Morphology, like relationships of words such as believable, unbelievable, unbelievably, and also knowing participle forms and tenses of verbs. Stage 2) POS tagging and parse tree production. Stage 3) Concept instantiation from specifications. And I have started a bit on: Stage 4) Reactors (once you know the concept of user input, do something with it, compare it with existing knowledge), and Stage 5) Output sentence instantiation. I think for stage 5 I'll have a kind of 'internal result' which will be mapped to one of a selection of sentence templates that it can pick from to express itself, and pick from some synonyms.
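A skeletal view of that stage flow, as a Python sketch (every stage body here is a placeholder, nothing like the real engine):

```python
def morphology(text):        # Stage 1: word forms, participles, tenses
    return text.split()

def parse(tokens):           # Stage 2: POS tagging + parse tree production
    return [("parse_tree", tokens)]

def instantiate(trees):      # Stage 3: concepts from specifications
    return {"concept": trees[0]}

def react(concept, kb):      # Stage 4: compare with existing knowledge
    return kb.get(str(concept), "unknown")

def render(result):          # Stage 5: map result to a sentence template
    return f"I think the answer is: {result}"

def respond(text, kb):
    return render(react(instantiate(parse(morphology(text))), kb))
```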

C R Hunt - Sep 9, 2010:

I think my knowledge base might be more akin to your parse trees. (It’s similar to a set of parse trees, but with branches connected to other nodes. Turns the whole thing into a webby mess. I love it.)

LOL, you have to love complexity to work on this stuff!

C R Hunt - Sep 9, 2010:

But there are many ambiguous cases (your elephant in pajamas is a good example) for which the knowledge base should be tapped for interpretation clues.

I have this working; it wasn't easy. It finds a rule and uses it (that in general people wear clothes, and animals generally don't).

C R Hunt - Sep 9, 2010:

But there is only one direct object (at most) for each arrow, so sentences must be broken down into simple sentences before they can be learned. I’ve got a method for templating complex sentences to a set of simple sentences, where the template includes special characters that define the rule for how each simple sentence relates to the others. I’ve also got a few simple rules hard wired.

My engine right now can take one or more direct objects, one or more indirect objects, and even knows if the list is an OR or AND type (whether the noun list is AND-ed together, OR-ed together, or AND+OR-ed, which it calls 'complex', meaning it knows it has to ask for clarification). So: a variable number of nouns for the indirect/direct object. Also, each object can have zero or more adjective modifiers, and each of those adjectives can have zero or more adverbs modifying it. Also, sometimes the object of one prepositional phrase is the antecedent of another prepositional phrase; that is also currently handled :) As in, "I went to the store at the end of town".
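Roughly, as a data-structure sketch (Python; the names are invented for illustration, and the adjective/adverb modifiers are elided):

```python
class ObjectList:
    """A list of object nouns plus how they combine: AND-ed, OR-ed,
    or mixed ('complex', which triggers a clarifying question)."""
    def __init__(self, nouns, mode):
        assert mode in ("and", "or", "complex")
        self.nouns = nouns   # each noun could carry its own adjectives,
        self.mode = mode     # and each adjective its own adverbs

    def needs_clarification(self):
        return self.mode == "complex"

# "I want coffee and tea, or juice" -> a mixed (AND+OR) list
objs = ObjectList(["coffee", "tea", "juice"], mode="complex")
if objs.needs_clarification():
    print("Did you mean (coffee and tea) or (juice)?")
```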

C R Hunt - Sep 9, 2010:

I have a confidence rating (0-1) for every entry in the knowledge base. So far, as I’m the only one doing the teaching, all confidences are set to 1. (I’m always right wink )

Nice smile

C R Hunt - Sep 9, 2010:
Victor Shulist - Sep 9, 2010:

However, and this is later down the road, perhaps next year, instead of matching a question to a fact, an external script that ‘deals with’ that type of question will be executed, of which a fact requirement of that script may need to find another script to provide the info, and on and on to any depth, will be used to derive an answer.

That would indeed be cool. After I’m done with my parser upgrade, I plan to learn parallel python. I think it’ll speed up the parser code, and work well for problems like answering questions. Many different approaches can be utilized side-by-side for later comparison and answer selection.

Yes, and the very cool part of this is that, just as my engine figures out the tree structure of grammar rules to apply to a sentence on its own, it will also determine the tree structure of scripts to call in order to determine a response. An idea I have (and I have NO IDEA if this is asking too much) is that I want the engine to read the comments, that is, the purpose of a given script in natural language, and thus know what it is used for, and then assemble its own program which will allow it to figure out on its own how to determine an answer… if it works, it could be a real surprise factor in its responses!

DAMN I love this website and this research! So many cool people working so hard on this 50+ year problem… but I'm becoming addicted to it!!

Anyway, maybe someday soon our bots will talk to each other. It will be a while before I put mine up on the net though. Perhaps by the end of next year. I'm hoping for a YouTube video in the spring, perhaps.

 

 
  [ # 37 ]
Victor Shulist - Sep 10, 2010:

Interesting, right now I have 3 stages : Stage 1) Morphology, like relationships of words like believable, unbelievable, unbelievably, also knowing the participle part, and tenses of verbs.  Stage 2) POS tagging and parse tree production.  Stage 3) Concept Instantiation from specifications.

Interesting stage progression. I planned to save morphology for answering questions. That is, deciding if a learned fact answers a particular question by checking synonyms and morphology against what the question is asking for. Verb tenses are of course a necessary part of POS tagging.

The next big stage I want to work on (after my parser issues, and after I’ve cemented the format of my knowledge base…phew!) is combining facts into lists ordered in space, time, etc. I’ll call these orderings “stories”, though the story may be no more interesting than “pouring a glass of milk”. Each entry in the story will include a part of the process or event the story describes. Stories can even contain other stories as entries.
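A minimal sketch of what I have in mind, assuming a story nests simply by containing other stories as entries:

```python
class Story:
    """An ordered list of facts; an entry may itself be a Story."""
    def __init__(self, title, entries=None):
        self.title = title
        self.entries = entries or []   # facts or nested Story objects

    def flatten(self):
        """Expand nested stories into one ordered list of facts."""
        for entry in self.entries:
            if isinstance(entry, Story):
                yield from entry.flatten()
            else:
                yield entry

pour = Story("pouring a glass of milk",
             ["open fridge", "take milk", "fill glass"])
breakfast = Story("making breakfast", [pour, "eat cereal"])
print(list(breakfast.flatten()))
```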

Victor Shulist - Sep 10, 2010:

And I have started a bit on : Stage 4) Reactors (once you know the concept of user input, do something with it, compare with existing knowledge), Stage 5) output sentence instantiation.  I think for stage 5 I’ll have a kind of ‘internal result’ which will be mapped to one of a selection of sentence templates that it can pick from to express itself, and pick from some synonyms.

I plan to take all the grammar and text templates and snippets I'm generating for the parser and use them to format the bot's output. This process will probably include hard grammar rules as well.

Victor Shulist - Sep 10, 2010:

I have this working, it wasn’t easy.  It finds a rule, and uses it (that in general people wear clothes, and animals, generally don’t.)

I don’t think the fruits of this type of labor would be evident until I have a large knowledge base to access anyway. Sort of a Catch-22. Need the knowledge base to form the parses, need the parses to generate the base!

Victor Shulist - Sep 10, 2010:

Also, sometimes the object of one prepositional phrase is the antecedent of another prepositional phrase, that is also currently handled. smile  As in, “I went to the store at the end of town”.

This is a perfect example of a problem I've been thinking about: what is an acceptable degree of muddiness in interpretation? Sure, the sentence you provided is meant to imply that the store is at the end of town, but it is also true to say that you are going to the end of town. The sentence "I went at the end of town" may be somewhat poorly formed, but no worse than the type of mistake non-native speakers make all the time. (Why are prepositions always so difficult when learning a foreign language?)

Then again it is still wrong. And unless the AI does a lot of internal inferring, it may not realize that by simultaneously going “to the store” and going “at the end of town”, it is implied that the store must be located at the end of town.

Victor Shulist - Sep 10, 2010:

An idea I have (and I have NO IDEA if this is asking too much) is that I want the engine to read the comments, that is, the purpose of a given script in natural language, and thus know what it is used for, and then assemble its own program which will allow it to figure out on its own how to determine an answer… if it works, it could be a real surprise factor in its responses!

I think what you're asking for would be a revolution in programming as well as AI. It's good to have dreams, but I think that's a long way off. It would be awesome though.

Victor Shulist - Sep 10, 2010:

DAMN I love this website and this research! So many cool people working so hard on this 50+ year problem… but I'm becoming addicted to it!!

Yup, it’s been a huge amount of fun going through the topics here. Definitely gotten the gears turning.

Victor Shulist - Sep 10, 2010:

Anyway, maybe someday soon our bots will talk to each other. It will be a while before I put mine up on the net though. Perhaps by the end of next year. I'm hoping for a YouTube video in the spring, perhaps.

Could be a recipe for trouble, but it'd be fun to see. :) I won't have a proper chat mode for a while (too many interesting goals in the way) but maybe in a year or so I'll be able to cobble something together.

 

 
  [ # 38 ]

@Victor and CR

You folks are doing an absolutely wonderful job of pushing me well out of both my skillset, and my comfort zone, and I LOVE it! Right now, this discussion is so far over my head that I feel like I'm drowning, but in this case, it's a GOOD thing. :) Keep it up, and I'll try my utmost to follow along, though keeping up, at this point, is all but impossible for me. Nicely done!

Oh, and CR, do you have any objections to allowing us to attach a more appropriate name to you than CR? Initials seem so impersonal to me.

 

 
  [ # 39 ]

@CR - Another example from some of the tests with my CLUES engine is: 

“I talked to Sally in Toronto”

This could mean: a) While I was in Toronto, I talked to Sally.

OR

b) There could be more than one Sally I know: one lives in Ottawa, one in Toronto.

Then if I say "I talked to Sally in Toronto", putting emphasis on "in Toronto", it becomes clear: perhaps I was just on the phone with her, and I wasn't necessarily in Toronto when I talked to her. In that case the antecedent is "Sally" and not "talked".

I was interested to know if you are doing as I am: simply generating all parse trees where, when unsure, a given prepositional phrase's antecedent can be any of the grammatical possibilities, and then having a later stage (for me, stage 3, concept specs) figure it out.

Also, have you thought about multi-word nouns? Example: "John F Kennedy was a good man". Does your bot know the entire string "John F Kennedy" as the subject noun?

This weekend I will be teaching CLUES about noun clauses, such as "What I had for breakfast gave me heartburn.", where the subject is the clause "What I had for breakfast".

I have already done this, but when I re-coded the engine (to decrease processing time from 10+ minutes to 5-6 seconds) I changed so much that the old rules are in a different format, which requires re-writing the grammar rules.

I was wondering how you are tackling the noun clause thing.

Another thing is participles. Example: "The road was full of litter, thrown from the window." Is your bot seeing "thrown from the window" as a past participle phrase modifying 'litter'?

@Dave - yes, fun isn't it! This site is getting so many new members that I'm spending **WAY** too much time on it!!

 

 
  [ # 40 ]

Wow, thanks for all the questions. It’s really forcing me to think back through all my code and re-build the big picture in my mind. I love being able to discuss my project with a fellow bot developer, especially someone working on the very problems I’m encountering. This forum is great!

Victor Shulist - Sep 10, 2010:

I was interested to know if you are doing as I am: simply generating all parse trees where, when unsure, a given prepositional phrase's antecedent can be any of the grammatical possibilities, and then having a later stage (for me, stage 3, concept specs) figure it out.

Right now, my code would label both “to Sally” and “in Toronto” as “conditions” on the sentence “I talk”. Basically, one condition would be “in” pointing to the sentence “Toronto be” and another condition would be “to” pointing to the sentence “Sally be”. (Note that since this is a factual knowledge base, verbs are all stored in their infinitive form, with the implication being that everything in the knowledge base is a possible event. Time information arising from verb form or elsewhere would be codified at the level of the stories I described before.)

So you could say that both “to Sally” and “in Toronto” refer back to “I”, since it *would* be possible to have the knowledge stored in such a way that there is one condition on “I talk”, which is “to” pointing to “Sally be” which has its own condition “in” pointing to “Toronto be”.

Phew, I hope that makes sense.
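To make that concrete, the two storage options might look roughly like this as data (the structure is simplified and invented for illustration):

```python
# Flat: both phrases condition "I talk" directly.
flat = {"fact": ("I", "talk"),
        "conditions": {"to": ("Sally", "be"),
                       "in": ("Toronto", "be")}}

# Nested: "in Toronto" conditions "Sally be" instead,
# placing Sally herself (not just the talking) in Toronto.
nested = {"fact": ("I", "talk"),
          "conditions": {"to": {"fact": ("Sally", "be"),
                                "conditions": {"in": ("Toronto", "be")}}}}
```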

Victor Shulist - Sep 10, 2010:

Also, have you thought about multi-word nouns? Example: "John F Kennedy was a good man". Does your bot know the entire string "John F Kennedy" as the subject noun?

Most of my program can handle noun phrases, but right now the knowledge base frowns at them. This is mostly due to my own corner-cutting: I wanted to get a version up and running to test some of the other aspects of my storage system. However, the plan has been to take recognized noun phrases and effectively turn them into one word (with POS noun) using underscores between the words, and sort of hack the problem that way. I'm not sure how robust this will be, or how much trouble it might cause me later when I'm trying to query the knowledge base. We shall see.
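The hack itself would be simple enough; a minimal sketch:

```python
def fuse_noun_phrase(tokens, phrase):
    """Replace a recognized noun phrase with one underscore-joined token."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return tokens[:i] + ["_".join(phrase)] + tokens[i + n:]
    return tokens

tokens = "John F Kennedy was a good man".split()
print(fuse_noun_phrase(tokens, ["John", "F", "Kennedy"]))
# ['John_F_Kennedy', 'was', 'a', 'good', 'man']
```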

Victor Shulist - Sep 10, 2010:

Another thing is participles. Example: "The road was full of litter, thrown from the window." Is your bot seeing "thrown from the window" as a past participle phrase modifying 'litter'?

All complex sentences (not of the form subject verb object + any prepositional phrases) are handled through rules that have to be learned through examples.

The bot would ask for a simplified form of the sentence “The road was full of litter, thrown from the window.” I would tell it “The road was full of litter. Litter had been thrown from the window.”

Then, the bot would set to work trying to build a mapping rule between the complex sentence and the simple ones. It might look something like this,

"%%n0 was %%v0 of %%n1 , %%v1 from %%n2 ."  <- complex sentence form
"%%n0 was %%v0 of %%n1 . %%n1 had been %%v1 from %%n2 ."  <- simple form

In the future, if the bot encounters a sentence of that complex form, it will pattern it into the simple form.
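As a toy illustration of applying such a learned rule, here regex named groups stand in for the %%n0-style slots (the patterns are simplified and articles are handled crudely; this is not my actual matcher):

```python
import re

COMPLEX = (r"(?P<n0>\w+) was (?P<v0>\w+) of (?P<n1>\w+) , "
           r"(?P<v1>\w+) from the (?P<n2>\w+) \.")
SIMPLE = "{n0} was {v0} of {n1} . {n1} had been {v1} from the {n2} ."

def simplify(sentence):
    """Pattern a complex sentence into its learned simple form."""
    match = re.match(COMPLEX, sentence)
    return SIMPLE.format(**match.groupdict()) if match else sentence

print(simplify("road was full of litter , thrown from the window ."))
# road was full of litter . litter had been thrown from the window .
```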

 

 
  [ # 41 ]
Dave Morton - Sep 10, 2010:

Oh, and CR, do you have any objections to allowing us to attach a more appropriate name to you than CR? Initials seem so impersonal to me.

I don't mind you knowing my name, but I'd rather be referred to as CR, Hunt, or otherwise. The reason is a Google one. In science, your name is something like your brand. If someone googles my name, I'd rather my physics work pop up in search results than my chatbot work, lol. Perhaps I'm being paranoid, but there it is.

My name is the same as a resident of Troy who predicted a certain wooden horse would bring bad times. How's that? :)

 

 
  [ # 42 ]

lol, that sounds fine, if a bit complex, so how about Fred, instead? :D

Seriously, I understand the reasoning, although I can't help but think that your chatbot R&D can't be anything other than a resume enhancement. I don't actually mind using CR to address you. In fact, while in high school, I insisted on being addressed as DJ (it beat the heck out of being called "little Dave" at the time). It's only been in more recent decades that I've begun to feel that the use of initials seems, while not exactly formal, certainly less personal. :)

 

 
  [ # 43 ]

He he, Fred is fine by me :)

 

 
  [ # 44 ]

CR, have you bumped into the apostrophe problem yet?

that is,  does

    X’s Y

mean

      X is Y,  as in “Bob’s nice” = “Bob is nice”

or

    the owner of Y is X, as in "Bob's car": the owner of the car is Bob.

 

 
  [ # 45 ]

Right now I assume all apostrophes are possessives unless (1) there is no verb in the sentence or (2) interpretation as “is” or “has” is necessary for the verb tense to make sense. How do you handle it?
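In sketch form, case (1) might look like this (the verb lexicon is a stub for whatever the tagger actually knows, and case (2) about verb tense is omitted):

```python
def has_verb(tokens):
    verbs = {"is", "was", "has", "likes", "ran"}   # toy verb lexicon
    return any(t in verbs for t in tokens)

def interpret_apostrophe(tokens, i):
    """Classify tokens[i] of the form X's: possessive or contraction."""
    without = tokens[:i] + [tokens[i].split("'")[0]] + tokens[i + 1:]
    if not has_verb(without):
        return "contraction"    # "Bob's nice" -> "Bob is nice"
    return "possessive"         # "Bob's car is red" -> owner(car) = Bob

print(interpret_apostrophe("Bob's nice".split(), 0))        # contraction
print(interpret_apostrophe("Bob's car is red".split(), 0))  # possessive
```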

 
