This thread brings up the interesting task of answering questions such as “Who wrote ‘Moby Dick’?” and “Who was the first president of the United States?”
At first I tried to get my logic agent to handle these questions. Then, while reading about the architecture of Watson, I came across a citation to OpenEphyra. I guess IBM tried it in the early stages of the Watson effort, but it didn’t perform at championship level.
Although Open Ephyra is rather slow, it is able to answer questions such as “Who wrote ‘East of Eden’?” and “Who was the 42nd President?”.
In keeping with Watson’s approach, I think I’ll try to make an agent out of Open Ephyra, while continuing to try to get my logic agent to handle these types of questions. From the paper on Watson’s architecture:
For the Jeopardy Challenge, we use more than 100 different techniques for analyzing natural language, identifying sources, finding and generating hypotheses, finding and scoring evidence, and merging and ranking hypotheses. What is far more important than any particular technique we use is how we combine them in DeepQA such that overlapping approaches can bring their strengths to bear and contribute to improvements in accuracy, confidence, or speed.
I have 29 agents so far :) And the controller that selects among their responses is not very sophisticated. But at least I’m not using UIMA to wrap messages, which seems to me far too overengineered, with very verbose XML. I think simply passing natural language strings among agents is far easier. It also forces the programmer to handle natural language constructs (ambiguous delimiters, etc.) at the top level of each program’s interface, which makes the programs more independent: a user can interact with any agent directly, without having to wrap queries in UIMA or whatever.
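To make the idea concrete, here is a minimal sketch of a string-in, string-out controller. It is not my actual controller. Every name in it (`trivia_agent`, `fallback_agent`, `select_response`) and the confidence numbers are made up for illustration. It assumes each agent takes a plain natural language string and returns either a (response, confidence) pair or nothing.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical agent interface: plain string in, (response, confidence) out,
# or None when the agent has nothing to say about the query.
Response = Tuple[str, float]
Agent = Callable[[str], Optional[Response]]


def trivia_agent(query: str) -> Optional[Response]:
    """Toy stand-in for a factoid agent (e.g. a wrapper around a QA system)."""
    facts = {"who wrote east of eden?": "John Steinbeck"}
    answer = facts.get(query.strip().lower())
    return (answer, 0.9) if answer is not None else None


def fallback_agent(query: str) -> Optional[Response]:
    """Toy agent that always answers, but with low confidence."""
    return ("I don't know.", 0.1)


def select_response(query: str, agents: List[Agent]) -> Optional[str]:
    """Ask every agent and return the response with the highest confidence."""
    candidates = []
    for agent in agents:
        result = agent(query)
        if result is not None:
            candidates.append(result)
    if not candidates:
        return None
    best_text, _best_confidence = max(candidates, key=lambda r: r[1])
    return best_text


if __name__ == "__main__":
    agents = [trivia_agent, fallback_agent]
    print(select_response("Who wrote East of Eden?", agents))
    print(select_response("what would I use a knife for?", agents))
```

Since the interface is just a string, any of these agents can also be called directly by a user, with no wrapper format in between.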
Anyways, here’s the (heavily abbreviated) output of Open Ephyra:
Question: who was the 42nd president?
+++++ Analyzing question (2011-06-13 13:17:59) +++++
Normalization: who be the 42nd president
Answer types:
NEproperName->NEperson
Interpretations:
Property: IDENTITY
Target: 42nd president
Property: NAME
Target: 42nd president
Predicates:
-
+++++ Generating queries (2011-06-13 13:17:59) +++++
Query strings:
42nd president
(42nd OR “atomic number 60” OR neodymium) president
“42nd president” 42nd president
“42nd president” 42nd president
“was the 42nd president”
“the 42nd president was”
+++++ Searching (2011-06-13 13:17:59) +++++
+++++ Selecting Answers (2011-06-13 13:18:01) +++++
[...]
Answer:
[1] Bill Clinton
Score: 2.6550815
Document: http://www.whitehouse.gov/about/presidents/williamjclinton
Question: Who wrote “East of Eden”?
+++++ Analyzing question (2011-06-13 13:18:33) +++++
Normalization: who write east of eden
Answer types:
NEproperName->NEperson
Interpretations:
Property: AUTHOR
Target: East of Eden
Predicates:
-
+++++ Generating queries (2011-06-13 13:18:33) +++++
Query strings:
wrote East Eden
(wrote OR indited OR pened OR penned OR composed) “East of Eden”
“East of Eden” wrote East Eden
“wrote East of Eden”
+++++ Searching (2011-06-13 13:18:33) +++++
+++++ Selecting Answers (2011-06-13 13:18:35) +++++
Filter “AnswerTypeFilter” started, 523 Results (2011-06-13 13:18:35)
[...]
Answer:
[1] John Steinbeck
Score: 2.3400908
Document: http://www.chacha.com/question/who-wrote-east-of-edan
Question: Who starred in East of Eden?
+++++ Analyzing question (2011-06-13 13:18:47) +++++
Normalization: who star in east of eden
Answer types:
NEproperName->NEperson
Interpretations:
Property: ACTOR
Target: East of Eden
Predicates:
-
+++++ Generating queries (2011-06-13 13:18:47) +++++
Query strings:
[...]
Answer:
[1] James Dean
Score: 0.98565185
Document: http://www.killermovies.com/e/eastofeden/articles/4248.html
Question: what would I use a knife for?
[...]
Answer:
—-
Note that it provides no response to the last question, I guess because it isn’t a factoid question. So another agent would have to handle that…
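To turn Open Ephyra into an agent, something has to pull the answer out of that console output. Below is a rough sketch that assumes the output always has the shape shown above: a `[n] answer` line followed by `Score:` and `Document:` lines. The names `EphyraAnswer`, `parse_answers` and `top_answer` are my own inventions, not part of Open Ephyra, and this only scrapes text rather than calling any Open Ephyra API.

```python
import re
from typing import List, NamedTuple, Optional


class EphyraAnswer(NamedTuple):
    """One ranked answer scraped from the console output (hypothetical type)."""
    rank: int
    text: str
    score: float
    document: str


# Matches three consecutive lines of the form:
#   [1] Bill Clinton
#   Score: 2.6550815
#   Document: http://...
_ANSWER_RE = re.compile(
    r"^\[(\d+)\]\s+(.+?)\s*\n\s*Score:\s*([0-9.]+)\s*\n\s*Document:\s*(\S+)",
    re.MULTILINE,
)


def parse_answers(log: str) -> List[EphyraAnswer]:
    """Return every ranked answer found in the log text, in order."""
    return [
        EphyraAnswer(int(rank), text, float(score), document)
        for rank, text, score, document in _ANSWER_RE.findall(log)
    ]


def top_answer(log: str) -> Optional[EphyraAnswer]:
    """Return the first answer, or None when no answer was produced."""
    answers = parse_answers(log)
    return answers[0] if answers else None


if __name__ == "__main__":
    sample = (
        "Answer:\n"
        "[1] Bill Clinton\n"
        "Score: 2.6550815\n"
        "Document: http://www.whitehouse.gov/about/presidents/williamjclinton\n"
    )
    print(top_answer(sample))
```

When `top_answer` returns `None`, as it would for the knife question, the controller can simply pass the question on to another agent.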