Pronoun resolution and Rosette
 
 

This began as a private correspondence, but since it seemed likely that various people would be
interested, I offer this forum posting.

Pronoun resolution is a shared task between the ChatScript engine and Rosette’s scripts. The
engine handles POS-tagging and decides which elements to store in the user variables that hold
the current resolution for each pronoun. The engine runs this code on all input and output
(subject to scripts).

At present, only one resolution is kept per variable:
$it_pronoun - any singular noun not covered below
$he_pronoun - male names or words (e.g., “Tom” or the actor)
$she_pronoun - female names or words (e.g., “Sarah Smith” or the actress)
$he-she_pronoun - names or words of unknown gender that are known to be beings (e.g., the doctor)
$they_pronoun - conjoined nouns, plural nouns, and “groups” of various kinds (Tom and Jerry, horses, “The Beatles”)
$there_pronoun - proper-noun names of places and locations

The bot itself defines $here_pronoun for where it considers itself to be.

The variables are set with the noun phrase, not the noun. So “I saw John’s father” would store
“John’s father” in $he_pronoun, so there is no ambiguity later about which father is meant. Once a variable has been set in a sentence, later words from that same sentence cannot change it.
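The “one value per variable, first match in a sentence wins” behaviour can be sketched in a few lines. This is plain Python, not ChatScript, and not the engine’s actual code; the function name `update_slots` and the pre-classified `(phrase, slot)` pairs are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple


def update_slots(slots: Dict[str, str], mentions: List[Tuple[str, str]]) -> Dict[str, str]:
    """Update pronoun slots from one sentence.

    `mentions` is a list of (noun phrase, slot name) pairs in sentence order.
    Hypothetical input format: the real engine derives this from POS-tagging.
    """
    locked = set()  # slots already set by an earlier phrase in this sentence
    for phrase, slot in mentions:
        if slot in locked:
            continue  # later words in the same sentence cannot change the slot
        slots[slot] = phrase  # store the whole noun phrase, not just the noun
        locked.add(slot)
    return slots


slots: Dict[str, str] = {}
# "I saw John's father before Tom arrived."
update_slots(slots, [("John's father", "he"), ("Tom", "he")])
print(slots["he"])  # John's father

# A later sentence may overwrite the slot, since only one resolution is kept.
update_slots(slots, [("Peter", "he")])
print(slots["he"])  # Peter
```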

Rosette then has a topic that preprocesses inputs to convert pronoun references where appropriate
and resubmits the converted form as the new input. Rosette treats a quoted expression as a single
token, so it never resolves pronouns inside one. Thus: John said “He ate meat”. does not resolve anything.
I consider it error-prone and insufficiently useful to attempt such resolution.

Rosette currently considers replacing: he, she, it, they, here, there.

A typical rule is:
a: HE ($he_pronoun _* he _* >)  ^refine()
  b: (his) ^input('_0 $he_pronoun 's '_1 $$punctuation) ^fail(SENTENCE)
  b: (himself) ^input('_0 $he_pronoun 's self '_1 $$punctuation) ^fail(SENTENCE)
  b: (['he 'him]) ^input('_0 $he_pronoun '_1 $$punctuation) ^fail(SENTENCE)
which says: if we have a he-pronoun value and some form of “he” was used in the sentence, rebuild
the sentence based on which form of “he” was used, submit the new sentence, and cancel the current one.
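For readers who do not know ChatScript, the effect of that rule can be imitated in plain Python. This is only a rough sketch of the behaviour described above, not a translation of the engine’s matching; `rewrite_he` and `HE_FORMS` are hypothetical names, and the sketch replaces only the first “he” form it finds.

```python
import re
from typing import Optional

# Replacement template per form of "he", mirroring the three b: rules.
HE_FORMS = {
    "he": "{a}",
    "him": "{a}",
    "his": "{a} 's",
    "himself": "{a} 's self",
}


def rewrite_he(sentence: str, antecedent: str) -> Optional[str]:
    """Return the sentence with the first "he" form replaced, or None if nothing applies."""
    if not antecedent:
        return None  # no stored antecedent, so there is nothing to substitute
    # Split off trailing punctuation so it can be re-attached afterwards.
    match = re.match(r"^(.*?)([.!?]*)$", sentence.strip())
    body, punctuation = match.group(1), match.group(2)
    words = body.split()
    for i, word in enumerate(words):
        template = HE_FORMS.get(word.lower())
        if template:
            words[i] = template.format(a=antecedent)
            return " ".join(words) + punctuation
    return None  # no form of "he" in the sentence


print(rewrite_he("He is playing guitar.", "Peter"))  # Peter is playing guitar.
print(rewrite_he("I like his guitar.", "Peter"))     # I like Peter 's guitar.
```

A caller would resubmit the returned string as the new input and drop the original sentence, which is the role ^input and ^fail(SENTENCE) play in the rule.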

A number of rules are dedicated to deciding NOT to revise a reference, particularly for the word “it”.
E.g., “It is raining” would not be resolved, since every weather-related “it is” construction should be left alone.
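A guard of this kind might look like the following in plain Python. It is a minimal sketch under stated assumptions: the word lists `WEATHER_WORDS` and `BE_FORMS` are invented for illustration and are far smaller than whatever Rosette’s rules actually cover.

```python
# Hypothetical word lists; a real bot would use much richer concept sets.
WEATHER_WORDS = {"raining", "snowing", "sunny", "cloudy", "windy", "foggy"}
BE_FORMS = {"is", "was", "isn't", "wasn't"}


def should_resolve_it(sentence: str) -> bool:
    """Return True only if the sentence contains an "it" worth resolving."""
    words = [w.strip(".,!?\"").lower() for w in sentence.split()]
    for i, word in enumerate(words):
        if word != "it":
            continue
        following = words[i + 1 : i + 3]
        # "it" + form of "be" + weather word: leave the pronoun alone.
        if len(following) == 2 and following[0] in BE_FORMS and following[1] in WEATHER_WORDS:
            return False
    return "it" in words


print(should_resolve_it("It is raining."))         # False
print(should_resolve_it("I bought it yesterday."))  # True
```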

Similarly, a rule for something like “This guy” or “The man” would use this pattern:
a: HE1 ($he_pronoun _* [this that the] ~males _* >)

 

 

 

 
  [ # 1 ]

Hi Bruce,
hi all,

If anyone is interested, the subject of our correspondence
is my concept of a multilingual “SPEECH CENTER BY DRAWERS”, which can be downloaded here:

http://sourceforge.net/projects/maldix/files/

Several of the contortions I go through to obtain a new search sentence,
for example turning “He is playing guitar.” into “Peter is playing guitar.” in a talk about “Peter”,
are already solved by ChatScript, but I wasn’t able to see it,
because Rosette, which I used to test ChatScript’s pronoun resolution, still lacks some data.

Bruce said above “At present, only one resolution is kept per variable”
because I had proposed a small pronoun stack to him, which I explained like this:

Suppose there is a conversation about the Aztecs like this:

CS: What have you done today?
USER: I watched a documentary about Montezuma today.
CS: He killed six prisoners of war for his god of war. Continuously.
USER: This guy must have been a real pig. 
CS: I think he was.
USER: But Cortez was twice as brutal as Montezuma.
CS: Cortez always dreamt about being rich.
USER: The TV-moderator also said so.

So we have 3 candidates for a “he”, which are stored in the “he”-drawer
(that is board A, drawer 3, for storing 3.PS_MALE_SINGULAR candidates WITHOUT quotation marks).

It looks like this:

“he”-candidate-drawer:
slot 1: (empty) # because no “he”-candidate has been mentioned yet.
slot 2: (empty)
slot 3: (empty)

After this sentence: USER: I watched a documentary about Montezuma today.
it looks like this:

“he”-candidate-drawer:
slot 1: (Montezuma) # because Montezuma is MALE_SINGULAR
              # and neither user nor chatscript
slot 2: (empty)
slot 3: (empty)

Because “This guy” is correctly replaced with “Montezuma”,
the drawer does not change again until after the sentence:
USER: But Cortez was twice as brutal as Montezuma.

“he”-candidate-drawer:
slot 1: (Cortez)
slot 2: (Montezuma)
slot 3: (empty)

and after: USER: The TV-moderator also said so.

“he”-candidate-drawer:
slot 1: (TV-moderator)
slot 2: (Cortez)
slot 3: (Montezuma)

This enables ChatScript to handle an ongoing conversation like:

USER: A poor man!
(ChatScript “looks” into the “he”-candidate-drawer
and always tries slot 1 first, so its first reaction is:)
ChatScript: TV-moderators are rather rich I think.
USER: I’m not talking about the moderator.
ChatScript: About Cortez?
(After the ellipsis reconfigurations given at the end of PART I,
ChatScript will “think” “Are you talking about Cortez?”)
USER: No.
(After the ellipsis reconfigurations given at the end of PART II(!),
ChatScript will “think” “I am not talking about Cortez.”)
ChatScript: Ah, you mean Montezuma?
USER: You got it!

When a new “he”-candidate appears, for example in:
USER: Oops, my father is calling.
the drawer looks like this:

“he”-candidate-drawer:
slot 1: (user-father)
slot 2: (TV-moderator)
slot 3: (Cortez)

because “Montezuma” is removed from the drawer.
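The drawer described above behaves like a small most-recent-first stack with a fixed capacity. A minimal Python sketch follows; it is not part of ChatScript or of my download, the class name `CandidateDrawer` is hypothetical, and it assumes that a candidate already in the drawer keeps its slot when it is mentioned again (as with “Montezuma” in the Cortez sentence).

```python
from collections import deque
from typing import List


class CandidateDrawer:
    """Fixed-size, most-recent-first store of pronoun candidates."""

    def __init__(self, size: int = 3) -> None:
        # With maxlen set, appendleft() silently drops the oldest entry on the right.
        self.slots: deque = deque(maxlen=size)

    def mention(self, candidate: str) -> None:
        if candidate in self.slots:
            return  # assumption: a re-mentioned candidate keeps its current slot
        self.slots.appendleft(candidate)

    def candidates(self) -> List[str]:
        """Candidates in the order they should be tried (slot 1 first)."""
        return list(self.slots)


drawer = CandidateDrawer()
for name in ["Montezuma", "Cortez", "Montezuma", "TV-moderator"]:
    drawer.mention(name)
print(drawer.candidates())  # ['TV-moderator', 'Cortez', 'Montezuma']

drawer.mention("user-father")
print(drawer.candidates())  # ['user-father', 'TV-moderator', 'Cortez']
```

The clarification dialogue (“About Cortez?”, “Ah, you mean Montezuma?”) then amounts to walking through `candidates()` in order until the user confirms one.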

Perhaps this idea,
my way of resolving a “we”,
my thoughts on resolving pronominal adverbs,
or my approach to ellipsis resolution will be a good basis for
a discussion here and will help me improve my ideas.

Bruce, you may be right that my resolution of pronouns in quotation marks is error-prone.
But my last point about it was not to go inside the quotation, but rather:

if (there are pronouns inside quotation marks)
ask: Who is this [he] [she] [we] [you/PLURAL] [they]?

I don’t agree with you
that keeping an eye on things said to ChatScript bots in quotation marks is “insufficiently useful”,
because ever since the first REAL chatterbot user (Weizenbaum’s secretary),
people have been telling bots very important personal things.

And all my psycholinguistic sources say
that personally relevant information is far more likely to be found
when people quote others verbatim
than anywhere else.

So I don’t understand
why ChatScript should stay blind to information like
USER: Paul said: “She is cheating!” even if the USER’s wife is meant.
Why doesn’t ChatScript at least inquire?

All the best

Andreas

 

 
  [ # 2 ]

I was asked:

So I don’t understand
why ChatScript should stay blind to information like
USER: Paul said: “She is cheating!” even if the USER’s wife is meant.
Why doesn’t ChatScript at least inquire?

It is not up to the ChatScript ENGINE to make such an inquiry. The engine stores possible antecedents. The specific chatbot script must decide whether and how to resolve pronouns. I don’t, in Rosette, because I have so many other things to do that I feel have higher priority. Since a quoted string has a canonical form of ~unknownword, it is technically feasible for a pattern to detect such words, “^burst” the original form, check whether it contains any pronouns, and then do pronoun resolution if one wanted to.
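The idea of splitting a quoted string into words and checking it for pronouns can be illustrated outside ChatScript. The following is a plain Python sketch, not a script using ^burst; the function names and the pronoun list (taken from the question proposed earlier in this thread) are assumptions for illustration.

```python
import re
from typing import List, Optional

# Pronouns to look for inside quotations (hypothetical, deliberately short list).
PRONOUNS = {"he", "she", "we", "you", "they"}


def pronouns_in_quotes(sentence: str) -> List[str]:
    """Return the distinct pronouns found inside quoted spans, in order of appearance."""
    found: List[str] = []
    # Accept straight or curly double quotes around the quoted span.
    for quoted in re.findall(r'["“]([^"”]*)["”]', sentence):
        for word in re.findall(r"[A-Za-z']+", quoted):
            lowered = word.lower()
            if lowered in PRONOUNS and lowered not in found:
                found.append(lowered)
    return found


def clarifying_question(sentence: str) -> Optional[str]:
    """Build a question about quoted pronouns, or return None if there are none."""
    found = pronouns_in_quotes(sentence)
    if not found:
        return None
    return "Who is " + " / ".join(f'"{p}"' for p in found) + "?"


print(clarifying_question('Paul said: "She is cheating!"'))  # Who is "she"?
```

Whether a bot should actually ask such a question remains a decision for the individual script.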

 

 
  [ # 3 ]

Hi Bruce,

I gladly accept your argument about priorities,
because I am already very fascinated by the possibilities of ChatScript
and I’m eager to see what comes next :)

All the best

Andreas

P.S.: I will try to get familiar enough with ChatScript to implement my stacks on my own.

 

 