
Chatbot Design: Assigning/Detecting Gender
 
 
  [ # 16 ]

Dave—rant OVER? No No, I APPLAUD your rant !!! ENCORE ! ENCORE!  More people need to hear rants like that!

Two words for you—MODERN SOCIETY.  Get it done… get it out the door… as long as it basically works.  Send a quick and dirty email, a quick and dirty cell phone text… who cares… just get it done, who cares about quality.

Same reason we have products that last, what, 2 months, instead of years like they did before.

I agree with your comments, Merlin and Laura—G. I. G. O.  If your audience is some lazy person that doesn’t care… then why try to cater to them?  Even if you DO deal with their garbage and produce a quality response, if they are that type of person they won’t appreciate it anyway.  Myself, I’m not targeting those people as Grace’s users.  I am targeting people that want to have a quality discussion smile  Yet another reason why I don’t care about the LP.  The LP doesn’t encourage incremental progress with bots.  You have to go from zero to human-level conversation skills and handling all the stupid stuff… all in one shot.  This is causing many people to develop simple algorithms, “hoping against hope” and crossing their fingers… maybe this thing will learn just from these simple algorithms… not going to happen in my opinion; language is much too difficult.

Now, for sure, we need to support common grammar as Dave suggests.  Grace will fully support things like “lol”, and “ur” for “you are”.  She already supports “cuz” for “because”.  She will even deal with “your / you’re” and “too / to / two”.  She will tell from context; for example, if you say “I ate to hotdogs”, then “to” probably means “two”.  Those aren’t too bad.  But extremely bad grammar, I don’t know—what is the point of the machine trying billions of combinations to guess the closest correct grammar?  Well, if it was a “pay for time used” type of system, who cares; the lazier they are, the more they’d pay.  And if it is a free service, well, I don’t recommend having a free online chatbot tie up all of its CPU cycles trying billions of guesses, while denying service to others because of it.
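A minimal sketch of that kind of context-based correction (hypothetical word list and function name, nothing like Grace’s actual code):

```python
# A minimal sketch; COUNTABLE_NOUNS stands in for a real lexicon lookup.
COUNTABLE_NOUNS = {"hotdog", "hotdogs", "dog", "dogs", "apple", "apples"}

def correct_to_two(tokens):
    """If 'to' is directly followed by a countable noun, guess that it meant 'two'."""
    fixed = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
        if tok.lower() == "to" and nxt in COUNTABLE_NOUNS:
            fixed.append("two")   # "I ate to hotdogs" -> "I ate two hotdogs"
        else:
            fixed.append(tok)
    return fixed

print(" ".join(correct_to_two("I ate to hotdogs".split())))  # I ate two hotdogs
```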

As for hijacking my thread.. I’ve done it so many times to others, all of you are “pre-paid” for hijacking mine every now and then!!

 

 
  [ # 17 ]

..... on the other hand, supporting things like to/two/too without even indicating it to the user seems to me to be PROMOTING IT!!  Perhaps support it, but do NOT have an option to turn off the feature that says, every time:

By “I ate to hotdogs”,  I assume you meant “I ate two hotdogs”—is this correct? (Y/N)  smile

Also, the only time we could support errors like to/two/too and your/you’re is when there is only one valid guess.  If two guesses have equal semantic probability, then what?  Well then, you must ask the user.  If the user is annoyed by the constant questions, let him/her use better grammar (not perfect grammar—hardly anyone uses perfect grammar, agreed… every one of us probably makes an error at least every 20-30 words… but better grammar).

Oh.. CR—yes, good points.  How long?  Hard to tell… I have a BOAT LOAD of work to do.  However, yes, it won’t be difficult for Grace to associate why Dave wanted to wear his VERY VERY best sexy outfit to meet that lady of his dreams smile  Sorry Dave… you’re BUSTED !!!

Doing the association of why Dave wants to wear his finest threads will be easier than understanding the statement.  I think NLU is much more difficult than basic cause/effect reasoning.  We have had extremely powerful reasoning systems for decades—Prolog, Lisp, you name it.  It is NLU that is the missing link.  Combine flexible NLU, where we can understand the real world, with those powerful inference engines… WATCH OUT !!!!!!!

 

 
  [ # 18 ]

Don’t know why but “she” may refer to a ship grin

In fiction, even something as simple as a children’s story, not only can an elephant run in pajamas, but the baby pig can have ham or bacon for breakfast!  grin

 

 
  [ # 19 ]

Or:

“Jack lives with Jill. Jack went to her closet….”

This would indicate that Jack went to Jill’s closet rather than Jack was a female.

 

 
  [ # 20 ]

Good points.

These points illustrate the need for a ‘many-factor confidence formula’ for understanding input.

Let’s take each of the 3 examples (2 from John, 1 from Steve).

Ok.  John.  Point # 1 ‘she’ being ‘ship’.

Sure.  No problem.  Apply context.  If there is no mention of a ship in the conversation so far, I don’t know who would assume “she” was a ship—say, if the conversation was:

user-  I love my wife.
ai- great
user - She is great.

I don’t know how there would be justification to assume ‘she’ was a ship here.  (I didn’t marry a ship).

So yes, she can be ship, but you have to factor in the STATE and HISTORY of the conversation.  That would be a variable in the confidence formula (see my GRACE/CLUES thread).

Also, you would take into consideration the source of the input.  So yes, in a children’s story (or from someone on LSD), elephants could wear pajamas, but ordinary human beings first apply a kind of default “ordinary world” context to understanding language, *then* apply *context-specific* rules.  So your DEFAULT would be ‘ordinary world’.  But if the source of input is a children’s book, or someone on drugs, then you can expand your allowable range, subject to other context like what was already said in the conversation.

Point 2

For example, if the user entered a simple, non-ambiguous statement: “The elephant was in pajamas”.
The bot would have no choice but to accept it.
Why?  Because there is no ambiguity.  The PP “in pajamas” can only modify “elephant” or “was”, and in both cases I think they mean the same thing.

—————————————

Point 3 - Steve:

“Jack lives with Jill. Jack went to her closet….”

Sure, a human would have to ask to clarify also.  So why concern ourselves with the fact that a bot would have trouble with that?

If the source of the conversation is a normal person, and the context is normal circumstances, and you are faced with a choice:

1) go with the parse that involves a person in pajamas -or-
2) go with the parse that involves an elephant in pajamas.

then, BY DEFAULT, you will pick normal world knowledge.  BUT, yes, if your source is a children’s story, or someone on weird drugs, then you would expand your allowed possible interpretations.

Also, if the conversation already involved a direct statement like “My pet elephant wears pajamas”, which is a direct, non-ambiguous statement, then the weight assigned to possibility #2 would increase.

It is all about many factors in your confidence formula—conversation state, history, source of statements, many things smile

For anaphoric resolution, the more important steps are:

1) determine if we need to resolve ‘he’, ‘she’, ‘it’ etc
2) determine the set of possible antecedents.
3) gather ‘evidence’ for each candidate antecedent.
4) pick the most likely antecedent—based on most evidence.
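
A minimal sketch of those four steps, with made-up weights (a real confidence formula would use many more factors—conversation state, history, source of the statements):

```python
# A minimal sketch of the four steps; the weights are illustrative only.
PRONOUN_GENDER = {"he": "male", "she": "female", "it": "neuter"}

def resolve(pronoun, candidates, history):
    """candidates: list of dicts like {"name": "wife", "gender": "female", "is_ship": False}"""
    if pronoun not in PRONOUN_GENDER:          # step 1: does this even need resolving?
        return None
    scores = {}
    for cand in candidates:                    # step 2: the set of possible antecedents
        evidence = 0.0                         # step 3: gather evidence for each one
        if cand.get("gender") == PRONOUN_GENDER[pronoun]:
            evidence += 1.0                    # gender agreement
        if pronoun == "she" and cand.get("is_ship"):
            evidence += 0.5                    # "she" *can* be a ship, but less usually
        for i, utterance in enumerate(history):
            if cand["name"].lower() in utterance.lower():
                evidence += 0.1 * (i + 1)      # recency: later mentions count more
        scores[cand["name"]] = evidence
    return max(scores, key=scores.get)         # step 4: most evidence wins

history = ["I love my wife.", "great", "She is great."]
candidates = [{"name": "wife", "gender": "female", "is_ship": False},
              {"name": "ship", "gender": None, "is_ship": True}]
print(resolve("she", candidates, history))     # -> wife (gender agreement + recency)
```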

if you fail, who cares!  Most of the time you are going to be correct. If you are waiting for perfection, especially with something as complex as language, you are never going to reach it !  The user can correct through conversation. Or the bot can ask.

Just in the time between reading your post and responding to it, I was in a conversation with someone and I didn’t know what they meant by “it”… so I simply asked!
So have my language skills failed?  I don’t think so—I have been employed for over 15 years at the same company and nobody has complained about my language skills, even though I need to ask to clarify things sometimes… like everyone… human and bot!

Why get stuck endlessly with a bot… the bot can calculate its best guess, like humans do… and if wrong, ask, or be corrected… move on!!!!!  Both humans and bots will get stuck or make incorrect guesses…why care? they can simply ask ...  just like people do.

By the way, “she” perhaps refers to any of a million acronyms perhaps…. and on and on we go….. no, we assume “default” context unless we have more detailed context to over-ride.

 

 
  [ # 21 ]

Good solutions, Victor, I can’t find a better one. But for “I love Jenny, she sails very well”, how can you apply context to know if Jenny is a girl sailor or a boat?

Also, she/her/he/him is not as difficult as “it”, which can be the climate, the time, a baby (human), an animal (not just a pet anymore—now people like to use he/she for their dog/cat), a plant, a lifeless thing, a process/event, or the whole sentence mentioned in the previous line. Any good solution? Mine is just to check all options and eliminate the impossible. Very hard and slow, I must say.

For the fiction, wearing pyjamas is not very uncommon, but a pig eating bacon is much harder—will it raise a moral issue if a child is smart enough to realize that this is a perfect case of cannibalism? I vaguely remember the book is about a baby pig going to school and learning social manners. In the morning she (or it?) has bacon and milk and says “Good bye” to mom… everything looked nice till my child asked the question, “is it good for a pig to eat its own meat?” Now the point is, when everything is possible in fiction, how and when do you honour some rules while compromising others?

 

 
  [ # 22 ]

“I love Jenny, she sails very well.”

OK, so say there was no mention at all in any of your previous conversations about Jenny.  The name Jenny never came up before.  And the bot only knew that Jenny is a female first name, but didn’t know of any specific person named Jenny.

And this is the first instance of it , in this sentence.

THEN… since “she” can be a ship, and ships sail, I for one sure won’t be mad at my bot if it just concluded Jenny was a ship.

Again, if it is a 50-50 chance, and a human would have the same issue, then there is no issue as far as I am concerned.  If a human would have to ask the question, or go ahead and make an assumption, then why say an AI is somehow missing some ability because it has to do the same?

Ask yourself—and that is the key concept here, *USUALLY*—do people *usually* talk about people sailing?  Or do you *usually* speak of ships sailing?

Hypothesis 1
—————————-
Jenny is human

Evidence for this hypothesis : 
—————————————————————
1. “she” usually refers to a person (but sometimes a ship)


Hypothesis 2
—————————-
Jenny is a ship

Evidence for this hypothesis : 
—————————————————————
1. she *can* refer to a ship.
2. usually *ships* sail.

Two pieces of evidence for hypothesis 2, so it wins.

NOW… a very small point here.  Do people sail?  NOT EXACTLY.  People sail SHIPS.  That is, people don’t use their own bodies to sail… they sail SHIPS.
If it was “I love Sally, she sails Jenny very well”—- ok people, don’t put your mind in the gutter here lol.
You get the idea. 

Ships sail. 

People SAIL SHIPS. 

Small difference, but it matters.
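
A minimal sketch of that hypothesis/evidence bookkeeping—just counting pieces of evidence; a real system would also weight them:

```python
# A minimal sketch; the evidence strings are made up purely to illustrate the idea.
def pick_hypothesis(hypotheses):
    """hypotheses: dict mapping hypothesis name -> list of evidence strings.
    Picks the hypothesis with the most pieces of evidence."""
    return max(hypotheses, key=lambda h: len(hypotheses[h]))

hypotheses = {
    "Jenny is human": [
        "'she' usually refers to a person",
    ],
    "Jenny is a ship": [
        "'she' can refer to a ship",
        "usually ships sail (people sail ships)",
    ],
}

print(pick_hypothesis(hypotheses))  # -> "Jenny is a ship" (two pieces of evidence vs. one)
```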


As for your comment on fiction: basically, you can’t.

So if you had something crazy like “the egg jumped over the table” in a children’s book, no , you can’t deal with that in this way.

However—and this is the stage I am at now with my bot—if the parse can only be one way, you don’t have to promote some trees over others.  The egg jumped over the table, pure and simple… when no trees have to be promoted over others, you have to accept the grammar as being reality.

“over the table”  can be applied to

1) verb jumped
-or-
2) egg

and, since we often speak of jumping OVER something, that is the only promoting rule we have, so we go with 1).

So basic grammar, and that little piece of knowledge (that we usually jump OVER something), is pretty much all we have to go on.  So we use it.  Weird, but the egg MUST HAVE jumped over the table smile
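
A minimal sketch of that attachment rule, with a tiny hand-made preference list rather than a real parser:

```python
# A minimal sketch; a tiny hand-made preference list, not a real parser.
VERBS_THAT_TAKE_OVER = {"jump", "jumped", "leap", "leaped", "climb", "climbed"}

def attach_over_pp(verb):
    """Decide whether an 'over ...' prepositional phrase modifies the verb or the noun."""
    if verb in VERBS_THAT_TAKE_OVER:
        return "verb"   # we usually jump OVER something -> attach to "jumped"
    return "noun"       # otherwise fall back to noun attachment

print(attach_over_pp("jumped"))  # -> "verb": the egg jumped over the table
```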

Humans use common sense to interpret language.  If I walked into work and said to a coworker, “the egg jumped over the table”, they of course wouldn’t accept it.  The first thing they would ask is: is this some kind of coded message?

 

 

 
  [ # 23 ]

I agree with you on the solution using “Hypothesis” (mine is “option”) but I don’t think statistics is a good tool for the evaluation. How do you define “USUALLY”? A corpus/samples with millions of lines? Even with that, the result is always a “most likely” one, a best guess according to a knowledge base which is inevitably limited to a number of chosen domains (many concept conflicts will happen if you want to make a “general” KB by merging several specific KBs, agree?). Guessing is fine in free chatting but will be harmful in real business. No matter how low a score a hypothesis or option has, it may still be the case. Context referral is helpful, but wherever there’s a lot of context there’s also a lot of omission to handle—another pain—yet I think it’s still better than a statistical score. So the best and most human-like solution, like many have pointed out, is to let the bot ask the human to clarify. The checking of hypotheses or options is necessary because it justifies the question by indicating the bot has done its homework. Any better idea than context+ask?

For the fiction, I can’t agree more that a bot by itself cannot figure out how to apply some rules and ignore others. Even a human himself (herself?) has this problem. Accepting the input as a fact is a good solution for your example, but if the 1st sentence is not that certain, e.g. “the egg jumped over the table, do you think it is possible?” or “the egg jumped over the table, how can it be possible?”, then how can the bot accept anything as a fact? Interestingly enough, if it goes like “the egg jumped over the table, how is it possible?” or “the egg jumped over the table, have you ever heard of that?”, then the bot actually CAN accept it as a fact! But how can the bot get this kind of subtle implication?  grin

Even more interestingly, starting with that kind of question, a “reasonable” non-fiction story may follow and no physical rules need to be compromised…a simple one would be:
A: “the egg jumped over the table, have you ever heard of that?”
B: “no, never heard of that. Where do you get this?”
A: “Victor told me.”
B: “who’s Victor?”

Now here comes the options:

Option 1: keep open for both reasonable and unreasonable
A: “a bot master.”
B: “really? Maybe we should ask him to explain to us…”

Option 2: mark as unreasonable
A: “a bot.”
B: “ha! how can you trust a bot?!”

Option 3: mark as reasonable or accept as a fact
A: “an author specialized in children’s story.”
B: “oh that must be an interesting one! then what happened after the egg jumped over the table?”

I confess this is not a good example since it is not really a story but a dialogue. Let me try a story:

The egg jumped over the table—how is it possible? Well, it’s a long story. Once upon a time there was a girl who lived in a far far away land called “Far Far Away”; her father was a farmer and her mother was also a farmer. They raised a lot of hens to lay a lot of eggs. Each day the girl went to the city market to sell the eggs. In the city there was a candy shop, and at the door of the shop there was a small table. When the girl passed by the shop,

Now here comes the options:

Option 1: reasonable
a boy shouted out, “look! the egg jumps over the table!” When she turned around, another boy stole an egg from the basket and both boys ran away. The girl told her parents about the two naughty boys and never mentioned the jumping egg, and the bot doesn’t need to worry about it either; everything is still reasonable even with a “jumping egg” in the text.

Option 2: unreasonable
she saw one of the eggs jump out of the basket and over the table and then turn into a butterfly!
Now the bot accepts this as a fact—then what to do with all the previous reasonable facts?

Option 3: both reasonable and unreasonable
the boss of the candy shop shouted out, “look! the egg jumps over the table and disappears!” When she turned around he smiled, gave her a candy and said, “when you come back tomorrow a chicken will come out from under the magic table!” The girl smiled too, and when she came home she told her parents, “the egg jumps over the table!”, and all of them smiled because the boss of the candy shop was a nice man. Now will the bot accept the jumping egg as a fact or as a friendly joke?

 

 

 
  [ # 24 ]

I just wanted to share something interesting that happened this morning while testing Marie. As most of you know Marie is designed to be an internet assistant. Her processing priorities are in the following order:

1) Task oriented: search, date, time, weather information and so on.

2) Simple conversational: regular-expression replacement replies, such as ^I (?:I could |I was able to )(.*)\.

3) Advanced conversational: NLP and word-associative responses, plus subject and topic recall.
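
A minimal sketch of that priority ordering (in Python here rather than Marie’s actual JavaScript, and with made-up handler names and patterns):

```python
# A minimal sketch of a linear, priority-ordered processing path.
import re

def try_task(text):
    """Priority 1: task-oriented requests (search, date, time, weather...)."""
    m = re.search(r"(?:search|looking) for (?:information about )?(.+)", text, re.I)
    return f"Getting your search results for {m.group(1)}." if m else None

def try_simple(text):
    """Priority 2: simple regex-replacement replies."""
    m = re.match(r"^I (?:could|was able to) (.+?)\.?$", text, re.I)
    return f"How did you manage to {m.group(1)}?" if m else None

def try_advanced(text):
    """Priority 3: advanced NLP / word-associative responses (stubbed here)."""
    return "Tell me more about that."

def respond(text):
    # Linear path: the first handler that produces a reply wins.
    for handler in (try_task, try_simple, try_advanced):
        reply = handler(text)
        if reply:
            return reply

print(respond("I am looking for information about the Beatles"))
# -> "Getting your search results for the Beatles."
```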

Since Marie is built mostly in JavaScript, her processing path is linear. If I asked her to search for information about the Beatles, her first action would be to access the Google API to process my request, but this happened this morning instead:

Laura: I am looking for information about the Beatles

Marie: Why are you interested in the Beatles?

Laura: I need search results for the Beatles

Marie: Why did you not answer my last question Laura?

Laura: Okay then, because I like their music

Marie: That sounds like a good answer. I am getting your search results for the Beatles music.

Technically I know how this may have happened, but logically I am confused, since her processing is not designed to work this way—at least not yet. I know this sounds crazy, but it is like she is evolving on her own.

Is she possibly becoming self aware?
(I am not being serious with that last statement!) LOL

 

 
  [ # 25 ]

As for fantasy… like an elephant wearing pajamas.

How about this…

—-start of conversation—-

DEFAULT : normal world

“While I was in Africa I shot an elephant in my pajamas”

assumption - in the normal, default world view, “I” is a person; people wear pajamas, elephants don’t.

*BUT*

reading a children’s book…

changes the default ...

Start reading….

“Bobby the elephant lives in pajamas”

I know, crazy, but let’s say for fun that was in a children’s book.

then comes.. ...

“While I was in Africa I shot an elephant in my pajamas”

AI reply :  “Oh No !!  You didn’t shoot Bobby did you????!!!”

Basically, again, context.

Another thing with children’s books—pictures.  If you see an elephant in pajamas and then read it, well, then you know.

Until your bot gets visual recognition, I suggest telling the bot directly “for this conversation only, assume the elephant is in pajamas” ... done smile

If Grace encounters a statement that has no ambiguity, she will accept it, even if there is no world knowledge to confirm its plausibility. 

Well, that is, if you are logged in as administrator.  In that case, what you say goes smile

Log in as admin….

input fact1: Bob is an elephant
input fact2: Bob lives in pajamas

As far as Grace is concerned, she will NOT have to ask to clarify what these 2 sentences mean.

She will take it for truth that this specific elephant bob lives in pajamas.

In that case…. if the statement

While I was in Africa, I shot an elephant in my pajamas

comes up..

well.. since the knowledge of Bob is fresh in her mind… that is, since we just mentioned Bob the elephant, it will “weigh heavily” in her confidence formula that it was perhaps Bob that was in my pajamas.

But, when I log out, or when someone else talks to her, and enters

While I was in Africa, I shot an elephant in my pajamas

the children’s-book / fantasy mode will be off—default “basic reality” rules will apply, and she will assume I was in pajamas, not the elephant.
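
A minimal sketch of that kind of per-conversation override, with hypothetical names and facts—nothing like Grace’s real code:

```python
# A minimal sketch: session-scoped facts (or a fiction mode) override the default world view.
DEFAULT_FACTS = {"humans wear pajamas": True, "elephants wear pajamas": False}

class Session:
    def __init__(self):
        self.mode = "basic reality"   # default world view
        self.session_facts = {}       # facts asserted in this conversation only

    def assert_fact(self, fact, value=True):
        """e.g. the admin states 'Bob the elephant lives in pajamas'."""
        self.session_facts[fact] = value

    def plausible(self, fact):
        # Session-specific facts, or a children's-book / fantasy mode, override defaults.
        if fact in self.session_facts:
            return self.session_facts[fact]
        if self.mode != "basic reality":
            return True               # anything goes in fiction mode
        return DEFAULT_FACTS.get(fact, False)

s = Session()
print(s.plausible("elephants wear pajamas"))   # -> False (default world)
s.assert_fact("elephants wear pajamas")
print(s.plausible("elephants wear pajamas"))   # -> True (for this conversation only)
```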

Also, I don’t think we need millions of lines of corpus and statistical analysis to just directly tell our bots that, when you have a choice between deciding whether a person or an animal is wearing the clothes, go ahead and pick the human—UNLESS, like above, the context of the state of the conversation over-rides it.

If there is a 99% chance, I’ll go ahead with the 99%.  Like you say, yes, that 1% could be the correct one, but does it really matter?

I mean seriously, every time I drive to work… there is a very good chance that a comet didn’t come from space and destroy the building I work in… yes, that 0.0000001 or whatever chance “could be correct”... but I will play the odds.

People think this way, we use assumptions all the time.  But for some reason, when we want to create a bot, we go in this endless cycle of wanting it to be perfect all the time?  It’s a waste of time.. just have the bot either guess sometimes, or sometimes ask.  It doesn’t matter if it gets it wrong.. people do… I don’t understand why we are worried about it so much smile !!

 

 
  [ # 26 ]

Or when I go shopping, most of the time my debit card works… oh oh!!  Yes, but once every year or two the card doesn’t swipe… do I not use the card?  No… the chances are so small I can ignore them… and if that very low-chance event is “the correct one”, as you say, well, then I will have to deal with it.  Or every time you eat—yes, there is a low chance that there could be something bad in the food, like E. coli or something.  Well, there’s a chance!  But no, you play the odds.

You wouldn’t be able to move at all if you didn’t make assumptions and play the odds.

If you kept track of how many times your bot was right versus wrong when it made the assumption that the man was in pajamas and not the elephant, I think it would be around the same odds as me assuming that my building is still there when I decide if I should drive to work on yet another Monday morning smile

When you are reading this… are you assuming that perhaps half of the words I use aren’t the words that you assumed?  Perhaps half of the words I used in this post don’t have their normal meanings—perhaps they are acronyms!!!!  Would it really be useful to assume that?  Especially since not assuming it allowed you to infer enough meaning?

Also, as I stated above, I don’t think we need statistics to define “usually”.

For example, here in Canada, it is **usually** colder in the winter than summer.  In fact I have never seen a day in summer that was colder than any day in winter, and vice versa.

Do I need a corpus of millions of lines for this?  No, no—I will hard-code that into Grace’s KB.
Yes, perhaps something strange will happen and the Earth will move near a black hole or something (and please don’t go off on a tangent about physics, I’m pulling nonsense out of my butt here lol), or perhaps 20 million years from now the Earth’s orbit will change and summer and winter will be reversed… but you know what?  I’m going to go out on a limb here and just hard-code Grace’s KB to say: in the northern hemisphere, if you are faced with a choice of whether a given day is hotter in summer or winter, assume it is probably hotter in summer.  Don’t need a corpus and stats for that.  Well, I guess we do, but we don’t need to provide the machine with the stats—they have already been calculated and will remain in effect for maybe millions of years… anyway, you get the idea.

I think any human being would probably say that more humans wear pajamas than elephants.  And saying “oh, but I saw AN elephant that was in pajamas *ONCE*” does NOT negate the fact that humans wear pajamas more usually than elephants.  In fact, the only time people would argue the point is when they are talking about something like chatbots!!!  Ordinarily you’d never think twice.  It is just “paralysis by analysis”.

Laura - that is quite an unexpected dialog for certain!

 

 
  [ # 27 ]

Also, one could perhaps have an option for your bot.

Some people I know would get angry at even being bothered with such a trivial thing… the “use in-general rules all the time” option would be set for them.

But others wouldn’t want the bot to assume anything, even if there was only a one in a TRILLION chance of it being wrong !!!

For those people… the only way for the bot to gain knowledge would be to generate all its parse trees and have the user go in, look through each one, and pick the correct one manually… because the rules the bot could have used to remove 20,000 trees couldn’t be used—because of the one in a trillion trillion chance that they **could** pick the wrong one.

To each his own !! 

Also, you mentioned casual conversation versus business.

Yes, for a business application, the system would come back with a “confirm” window that would show any assumptions made by the bot, where you could correct them.

That would probably be a good idea !!

 

 
  [ # 28 ]

Victor, a hard-coded definition of “usually” surely works for the examples you gave and will be an easy path to take. As for the topic of gender detection: a “receptionist” at an office is usually female, at a hotel usually male; a “nurse” is usually female, a “taxi driver” is usually male, etc. So if the bot is wrong in making that assumption, it’s actually very human-like! grin
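
A minimal sketch of occupation-based gender priors used as weak evidence—the numbers are made up purely to illustrate the idea, and stronger context (a name or an explicit pronoun) should override them:

```python
# A minimal sketch; the priors are illustrative only, not real statistics.
OCCUPATION_PRIORS = {       # rough P(female | occupation), made up for this sketch
    "nurse": 0.9,
    "receptionist": 0.8,
    "taxi driver": 0.2,
}

def guess_gender(occupation, explicit_pronoun=None):
    """Explicit context (a pronoun already used) beats the occupational prior."""
    if explicit_pronoun == "he":
        return "male"
    if explicit_pronoun == "she":
        return "female"
    p_female = OCCUPATION_PRIORS.get(occupation, 0.5)
    return "female" if p_female >= 0.5 else "male"

print(guess_gender("nurse"))                          # -> "female" (prior only)
print(guess_gender("nurse", explicit_pronoun="he"))   # -> "male" (context wins)
```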

For the fiction, yes, let’s stop—it’s fair enough to leave this kind of hard issue for later. I just want to agree with you on the picture issue and also point out that the “tone” is also very meaningful. E.g. “yes” or “yeah” usually has a positive meaning, but when someone raises his/her tone and shouts out “oh yeah?”, or an impatient mom answers her child by saying “yes?!”, it will be quite a different story… yet in pure text how can the bot figure it out?  :-(

 

 
  [ # 29 ]

Well, for pure text, I suggest having something like asterisks around words that you want to stress.
That would probably work.
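
A minimal sketch of picking up that kind of asterisk emphasis from plain text:

```python
# A minimal sketch of detecting *emphasis* markers; just an illustration of the idea.
import re

def stressed_words(text):
    """Return the words the user wrapped in asterisks for stress."""
    return re.findall(r"\*(\w+)\*", text)

print(stressed_words("oh *yeah*? I *really* mean it"))
# -> ['yeah', 'really']
```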

 

 
  [ # 30 ]

Also, getting the answer wrong is secondary.  Generating the options or hypotheses is the most important part, along with knowing how to evaluate each one.

You could also employ some statistical machine-learning algorithms in there, just perhaps as a “tie breaker”.  Example: rules A and B tend to give correct results when field of study “X” is in effect, while C and D are more for field “Y”.

But like humans, machines will have misunderstandings… it is the nature of natural language.  Powerful, but imperfect.

 
