OK, everybody likes Wikipedia, so we’ll start with the URL
http://en.wikipedia.org/wiki/Anaphora_(linguistics)
to save you one step of googling it. Oh, chatbots.org doesn’t know “googling” (or “chatbots.org”, lol. Erwin/Dave, update your word database).
OK… So, to avoid trouble, I have decided to open each topic on a VERY specific issue.
Today I am finally at the point of coding the logic to handle anaphoric references. I am first tackling the ‘back reference’ kind.
Example:
1. “We gave the monkeys the bananas because they were hungry”
2. “We gave the monkeys the bananas because they would have gone to waste anyway”
So in #1 ‘they’ is the monkeys (obvious to you and me) and in #2 ‘they’ is the bananas.
But that is a more advanced example.
The kind I am writing the semantic rules for today in my bot is:
3. Jack stopped by because he owes me money.
4. Jack called Tom because he owes me money.
In #3, when asked “Who owes me money?”, it is obvious: “he” maps back to “Jack”. But in #4, “he” is ambiguous.
Right now, I am tackling this as follows:
When asked “What did Jack do?”, the bot searches all facts to find one where the subject = Jack. So the connection is made with the subject of the *main clause*. Then, we know from the parse tree that we also have a subordinate clause, “he owes me money”.
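To make that lookup concrete, here is a rough Python sketch. It is not my bot's actual code: the `Clause` and `Fact` classes and the `facts_about` helper are made-up names for illustration, and the "parse tree" is flattened down to a main clause plus an optional subordinate clause.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    subject: str
    verb: str
    rest: str = ""  # everything after the verb, kept as plain text here


@dataclass
class Fact:
    main: Clause                          # e.g. "Jack stopped by"
    subordinate: Optional[Clause] = None  # e.g. "he owes me money"


def facts_about(facts: List[Fact], subject: str) -> List[Fact]:
    """Return every fact whose MAIN-clause subject matches (case-insensitive)."""
    wanted = subject.lower()
    return [f for f in facts if f.main.subject.lower() == wanted]


# Hypothetical fact store holding sentence #3 and sentence #1.
facts = [
    Fact(Clause("Jack", "stopped by"), Clause("he", "owes", "me money")),
    Fact(Clause("We", "gave", "the monkeys the bananas"),
         Clause("they", "were", "hungry")),
]

hits = facts_about(facts, "Jack")
```

Here `hits` holds the one fact about Jack, and its `subordinate` field is the “he owes me money” clause that still needs its pronoun resolved.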
Now, the way I’m doing this (and I am interested in how all of you are doing it, because of the vast differences in our approaches) is:
The bot sees a subordinate clause and checks its subject. The subordinate clause’s subject is “he”. It then gets the subject of the main clause, “Jack”, and checks its database to see if that is a male name. (The same applies to “she”: it then checks whether the main-clause subject is a female name.) If that checks out, it goes on to the next step.
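The gender check could look something like the following. Again, only a sketch under assumptions: the two name sets stand in for whatever name database the bot really uses, and `name_gender` and `pronoun_agrees` are names I made up for this example.

```python
from typing import Optional

# Stand-in for the bot's name database (assumed contents).
MALE_NAMES = {"jack", "tom"}
FEMALE_NAMES = {"jill", "mary"}

# Only the two pronouns discussed in this post are handled.
PRONOUN_GENDER = {"he": "male", "she": "female"}


def name_gender(name: str) -> Optional[str]:
    """Look a name up in the database; None means 'unknown name'."""
    key = name.lower()
    if key in MALE_NAMES:
        return "male"
    if key in FEMALE_NAMES:
        return "female"
    return None


def pronoun_agrees(pronoun: str, name: str) -> bool:
    """True if the pronoun's gender matches the gender of the given name."""
    wanted = PRONOUN_GENDER.get(pronoun.lower())
    return wanted is not None and name_gender(name) == wanted
```

With this, `pronoun_agrees("he", "Jack")` passes, while `pronoun_agrees("she", "Jack")` and any unknown pronoun or name fail, so the bot would stop before trying to map anything.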
The next step, and this is where you have to watch: we can’t just jump to conclusions here. We can’t just map ‘he’ to ‘Jack’.
The reason being: what if there was an object, a prepositional phrase, or perhaps a predicate noun containing a person’s name, such as in #4 above?
So the bot scans the main clause and asks itself: do I see any references to people’s names (other than the subject, of course)?
In #3 above, it sees none (nothing other than the subject is a person’s name).
Thus, it is obvious: we may map ‘he’ to ‘Jack’.
In #4, the ‘scan’ routine comes back with the node in the parse tree for ‘Tom’. Since a non-empty list is returned, it means “Yes, there are other people mentioned in the main clause”. Thus, the bot is confused. BUT, the good thing is, it knows it is confused. It knows it must ask: “Ok, who owes you money, Jack or Tom?”
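Putting the steps together, the whole decision might be sketched like this in Python. Everything here is an assumption made for illustration: `KNOWN_NAMES` stands in for the name database, `MainClause.other_words` stands in for walking the parse tree, and `resolve` and `clarifying_question` are invented names. Note that the scan collects every other person's name without filtering by gender, since that is all the rule above calls for.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Stand-in for the name database (assumed contents).
KNOWN_NAMES = {"jack": "male", "tom": "male", "mary": "female"}
PRONOUN_GENDER = {"he": "male", "she": "female"}


@dataclass
class MainClause:
    subject: str
    # Words after the verb; a real bot would walk parse-tree nodes instead.
    other_words: List[str] = field(default_factory=list)


def other_people(main: MainClause) -> List[str]:
    """Scan the main clause for person names other than the subject."""
    return [w for w in main.other_words
            if w.lower() in KNOWN_NAMES and w.lower() != main.subject.lower()]


def resolve(pronoun: str, main: MainClause) -> Tuple[str, List[str]]:
    """Return (status, candidates) for a pronoun in the subordinate clause."""
    gender = PRONOUN_GENDER.get(pronoun.lower())
    if gender is None or KNOWN_NAMES.get(main.subject.lower()) != gender:
        return ("unresolved", [])          # gender check failed
    others = other_people(main)
    if not others:
        return ("resolved", [main.subject])  # only the subject qualifies
    return ("ambiguous", [main.subject] + others)  # the bot must ask


def clarifying_question(candidates: List[str]) -> str:
    return "Ok, who owes you money, " + " or ".join(candidates) + "?"


sentence3 = MainClause("Jack", [])        # "Jack stopped by"
sentence4 = MainClause("Jack", ["Tom"])   # "Jack called Tom"

status3, who3 = resolve("he", sentence3)
status4, who4 = resolve("he", sentence4)
question = clarifying_question(who4)
```

For #3 the status comes back as resolved with Jack as the only candidate; for #4 it comes back as ambiguous with both Jack and Tom, and `question` is the line the bot would ask back.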
What is everyone else’s approach?