This began as a private correspondence, but since it seemed likely that various people would be
interested, I offer this forum posting.
Pronoun resolution is a shared task between the ChatScript engine and Rosette’s scripts. The
engine handles POS-tagging and deciding on elements to store into user variables representing
various resolutions for pronouns. The engine runs its code on all input and output (subject to
scripts).
At present, only one resolution is kept per variable:
$it_pronoun - any singular noun not covered below
$he_pronoun - male names or words (e.g., “Tom” or the actor)
$she_pronoun - female names or words (e.g., “Sarah Smith” or the actress)
$he-she_pronoun - names or words of unknown gender that are known to be beings (e.g., the doctor)
$they_pronoun - nouns joined by “and”, plural nouns, and “groups” of various kinds (Tom and Jerry, horses, “The Beatles”)
$there_pronoun - proper names of places and locations
The bot itself defines $here_pronoun for where it considers itself to be.
The variables are set with the noun phrase, not the noun. Thus “I saw John’s father” would store
“John’s father” in $he_pronoun, leaving no ambiguity later about whose father is meant. Once a
variable has been set in a sentence, later words from that same sentence cannot change it.
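To make the bookkeeping concrete, here is a rough model of it in Python. This is not engine code: the class and method names are hypothetical, and only the variable names and the "first setting in a sentence wins" behavior come from the description above.

```python
from dataclasses import dataclass, field

# Variable names come from the list above; everything else is a hypothetical model.
SLOTS = ("$it_pronoun", "$he_pronoun", "$she_pronoun",
         "$he-she_pronoun", "$they_pronoun", "$there_pronoun")


@dataclass
class PronounSlots:
    values: dict = field(default_factory=dict)  # one resolution per variable
    locked: set = field(default_factory=set)    # variables already set in this sentence

    def start_sentence(self):
        """A new sentence may overwrite resolutions left by earlier sentences."""
        self.locked.clear()

    def offer(self, slot, noun_phrase):
        """Store the whole noun phrase unless this sentence already set the slot."""
        if slot not in SLOTS:
            raise ValueError(f"unknown slot: {slot}")
        if slot in self.locked:
            return False
        self.values[slot] = noun_phrase
        self.locked.add(slot)
        return True


slots = PronounSlots()
slots.start_sentence()                        # "I saw John's father talking to Bill."
slots.offer("$he_pronoun", "John's father")   # stored
slots.offer("$he_pronoun", "Bill")            # ignored: same sentence
slots.start_sentence()                        # "Sarah Smith waved."
slots.offer("$she_pronoun", "Sarah Smith")    # stored
```

After both sentences, $he_pronoun still holds the full phrase "John's father", and only a later sentence could replace it.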
Rosette then has a topic that preprocesses inputs, converting pronoun references where appropriate
and resubmitting the converted form as the new input. Rosette treats a quoted expression as a single
token, so it never resolves pronouns inside one. Thus the sentence: John said “He ate meat”. does not
resolve anything. I consider it error-prone and insufficiently useful to attempt such resolution.
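As an illustration of why quoted text is skipped, here is a small Python sketch in which a quoted span is kept as one token, so a pronoun inside it is never seen as a separate word. The tokenizer and function names are hypothetical and much simpler than whatever the engine really does.

```python
import re

# Hypothetical tokenizer: a double-quoted span (straight or curly) is one token.
TOKEN = re.compile(r'"[^"]*"|“[^”]*”|\S+')

PRONOUNS = ("he", "she", "it", "they", "here", "there")


def tokenize(sentence):
    """Split on whitespace, but keep each quoted expression whole."""
    return TOKEN.findall(sentence)


def has_resolvable_pronoun(sentence):
    """True only if a pronoun appears as a token of its own, outside quotes."""
    return any(tok.lower().strip(".,!?") in PRONOUNS for tok in tokenize(sentence))


quoted = 'John said "He ate meat".'
plain = "He ate meat."
```

With this model, tokenize(quoted) keeps "He ate meat" together as a single token, so has_resolvable_pronoun(quoted) is false while has_resolvable_pronoun(plain) is true.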
Rosette currently considers replacing: he, she, it, they, here, there.
A typical rule is:
a: HE ($he_pronoun _* he _* >) ^refine()
b: (his) ^input('_0 $he_pronoun 's '_1 $$punctuation) ^fail(SENTENCE)
b: (himself) ^input('_0 $he_pronoun 's self '_1 $$punctuation) ^fail(SENTENCE)
b: (['he 'him]) ^input('_0 $he_pronoun '_1 $$punctuation) ^fail(SENTENCE)
which says: if we have a value for the he pronoun and some form of he was used in the sentence,
rebuild the sentence according to which form of he was used, submit the new sentence, and cancel
the current one.
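For readers who do not read ChatScript, the effect of that rule can be approximated in Python as follows. This is only a sketch: resolve_he is a hypothetical name, it uses a regular expression rather than ChatScript pattern matching, and it rewrites just the first he form it finds, on the assumption that the resubmitted sentence passes through the same topic again for any that remain.

```python
import re

# Any form of "he" as a whole word; longer forms listed first for readability.
HE_FORMS = re.compile(r"\b(?:himself|his|him|he)\b", re.IGNORECASE)


def resolve_he(sentence, he_pronoun):
    """Return the rewritten sentence, or None when there is nothing to resubmit."""
    if not he_pronoun:              # no stored resolution, so leave the input alone
        return None

    def swap(match):
        word = match.group(0).lower()
        if word == "his":
            return he_pronoun + " 's"        # possessive, as in the (his) case
        if word == "himself":
            return he_pronoun + " 's self"   # as in the (himself) case
        return he_pronoun                    # plain he / him

    rewritten, count = HE_FORMS.subn(swap, sentence, count=1)
    return rewritten if count else None


first = resolve_he("I think he is tall.", "John's father")
second = resolve_he("I like his hat.", "Tom")
```

Here first comes out as "I think John's father is tall." and second as "I like Tom 's hat.", with the possessive kept as a separate 's token the way the rule writes it. A sentence with no he form, or an empty stored value, yields None and the original input stands.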
A bunch of rules are dedicated to deciding NOT to revise a reference, particularly for the word it.
E.g., “It is raining” would not be decoded, as all weather-related “it be” should be avoided.
Similarly, adding rules for something like “This guy” or “The man” would be this pattern:
a: HE1 ($he_pronoun _* [this that the] ~males _* >