Thanks Andrew and Dave for the warm welcome back.
Apologies for not being on here much... but I decided to spend a huge amount of time on my project. Since last time, the main focus has been the ability to compare trees and define the “confidence formula”, which *currently* consists of x weights. When Abel is trying to find a fact, f, that satisfies all the requirements of your question, q, it may have several facts, say f1…f(n), to choose from. But what if those facts contain a variable number of modifiers that the question didn’t have? A good example (to use Steve’s “red truck” example):
fact1: Joe bought a big red truck with big tires.
fact2: Joe bought a truck.
If those were your facts, but your question was less complex, say “Did Joe buy a green truck?”, the system calculates the number of “sub tree deltas” between the fact and question parse trees. In this case, Abel would calculate a higher confidence for fact2 (because it has fewer “modifier deficiencies”), so it would use fact2 (it is closer to the question than fact1), because we simply have to state that it wasn’t a “green” truck, just “a” truck.
It turned out to be quite a job for it to handle these *on its own*. It has to calculate:
a) “fact deficiencies” (if any): this is when your question has more modifiers than your *closest fact*. Example: “Did Joe buy a big red truck?” when the closest fact is “Joe bought a truck”, so the *fact* is deficient by the modifiers ‘big’ and ‘red’ (which were in the question).
b) “question deficiencies”: the reverse, where the fact has more modifiers than the question. Example: fact “Joe bought a big green expensive truck”, question “Did Joe buy a truck?”
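To make the two counts above concrete, here is a minimal Python sketch (Abel itself is written in Perl, so this is not its actual code). It assumes each parse has already been reduced to the set of term modifiers attached to the same head noun, and it leaves determiners out. The function names `modifier_deltas` and `closest_fact` are hypothetical, and the "fewest total deltas wins" rule is only a stand-in for the real weighted confidence formula.

```python
def modifier_deltas(fact_mods, question_mods):
    """Return (fact_deficiencies, question_deficiencies) as sorted lists.

    fact_deficiencies: modifiers the question has but the fact lacks.
    question_deficiencies: modifiers the fact has but the question lacks.
    """
    fact_mods, question_mods = set(fact_mods), set(question_mods)
    fact_deficiencies = sorted(question_mods - fact_mods)
    question_deficiencies = sorted(fact_mods - question_mods)
    return fact_deficiencies, question_deficiencies


def closest_fact(facts, question_mods):
    """Pick the fact with the fewest total deltas (assumed scoring rule)."""
    def total_deltas(name):
        fact_def, question_def = modifier_deltas(facts[name], question_mods)
        return len(fact_def) + len(question_def)
    return min(facts, key=total_deltas)


# Modifiers of "truck" in each fact; the phrase "with big tires" is ignored here.
facts = {
    "fact1": {"big", "red"},  # Joe bought a big red truck with big tires.
    "fact2": set(),           # Joe bought a truck.
}

# "Did Joe buy a green truck?"
best = closest_fact(facts, {"green"})
print(best)
```

With these inputs fact1 is off by three modifiers ('green' missing, 'big' and 'red' extra) and fact2 by one, so fact2 is chosen, matching the red truck example.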
That was relatively simple, because you are only calculating “term modifier differences”. But Abel also has a *start* on handling a variable number of what I call “sub tree modifiers” (STMs). Example:
“Joe bought the truck that his brother sold him”: here, an entire subordinate clause (“that his brother sold him”) is actually a modifier of ‘truck’ in the parse tree. The question can then either have or not have that STM:
F: Joe bought the truck that his brother sold him
Q: Did Joe buy a big truck? (delta is: ‘the’ in fact, ‘a’ and ‘big’ in question, also fact has SC STM “that his brother sold him”)
Or, the question has a more complex parse:
Q: Did Joe buy a truck that his big brother sold him?
The system can actually compare the direct objects *OF* the STMs. An entire Perl module was written just to handle that in the general case (that is, a variable number of STMs and term modifiers).
Things REALLY got cute when the fact has two STMs and the question has zero, one or two...
The tree comparison routines are not finished yet, but I -do- have a considerable percentage completed (can’t give an estimate off the top of my head). For example, I don’t yet have it down to comparing adverbs:
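One way to picture comparing trees that carry both term modifiers and STMs is a small recursive walk. The Python sketch below is only an illustration under stated assumptions, not the Perl module described above: the `Node` class, the `compare` function, and the idea of pairing sub-trees by their head word are all hypothetical simplifications (two sub-trees with the same head under one parent would collide).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    head: str
    mods: set = field(default_factory=set)        # term modifiers, e.g. {"the", "big"}
    children: list = field(default_factory=list)  # sub-trees under this head (STMs and their parts)


def compare(fact, question, path=""):
    """Return a list of delta strings; assumes both nodes share the same head."""
    here = f"{path}/{fact.head}"
    deltas = []
    for m in sorted(fact.mods - question.mods):
        deltas.append(f"{here}: fact has '{m}'")
    for m in sorted(question.mods - fact.mods):
        deltas.append(f"{here}: question has '{m}'")
    # Pair sub-trees by head word (an assumed matching rule).
    f_children = {c.head: c for c in fact.children}
    q_children = {c.head: c for c in question.children}
    for head, f_child in f_children.items():
        if head in q_children:
            deltas.extend(compare(f_child, q_children[head], here))
        else:
            deltas.append(f"{here}: fact has sub-tree '{head}'")
    for head in q_children:
        if head not in f_children:
            deltas.append(f"{here}: question has sub-tree '{head}'")
    return deltas


# F: Joe bought the truck that his brother sold him
fact = Node("truck", {"the"}, [Node("sold", children=[Node("brother", {"his"})])])
# Q: Did Joe buy a big truck?
q_simple = Node("truck", {"a", "big"})
# Q: Did Joe buy a truck that his big brother sold him?
q_complex = Node("truck", {"a"}, [Node("sold", children=[Node("brother", {"his", "big"})])])

for line in compare(fact, q_simple) + compare(fact, q_complex):
    print(line)
```

For the simple question this reports the 'the' / 'a' / 'big' term deltas plus the missing "sold" sub-tree; for the complex one it descends into the matched sub-tree and reports only that the question's 'brother' has the extra modifier 'big'.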
Fact: Bob bought a very expensive car.
Question: Did Bob buy a nice car?
Right now it would report that there is a diff (the fact has “expensive” but the question has “nice”) to inform you of that difference. But if you asked
Did Bob buy a very expensive car?
it would, right now, report “Yes, that exactly matches my knowledge”. Later, the STM/Term Tree compare routines will go right down to the adverb level.
Things really got cute when I started on functionality (which is where I am right now)....
(Note: this is only an example. It doesn’t ‘understand’ this statement yet, since more world knowledge is needed, but once I provide that, the tree-compare library will handle finding the ‘delta’.)
Fact: “Bob bought the movie that I liked but my wife hated”
When I provide the world knowledge inference rules so it can deduce that “that I liked” and “my wife hated” can possibly modify “movie”, then questions like the following can be asked:
Question: “Did Bob buy the movie that my dear wife hated?”
As you can imagine, it gets pretty tricky: the system has to first enumerate both the “term” modifiers (like ‘the’ and ‘dear’) and the STMs, and then iterate through the STMs of the question and the STMs of the fact to find a suitable match. Also achieved: you can have not only *term synonyms* but also what I call ‘sub-tree-synonyms’, which means it can consider things like “that my wife hated” and “that my dear wife despised” to be synonymous.
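A rough way to sketch the matching step and the sub-tree-synonym idea in Python follows. This is a simplified illustration, not how Abel does it: STMs are treated as flat strings rather than parse sub-trees, and the `TERM_SYNONYMS` table, the `OPTIONAL_MODIFIERS` set, and the `canonical` / `match_stms` functions are all hypothetical names made up for the example.

```python
# Hypothetical term-synonym table: each word maps to a canonical form.
TERM_SYNONYMS = {"despised": "hated", "loathed": "hated"}

# Hypothetical set of term modifiers that should not block an STM match.
OPTIONAL_MODIFIERS = {"dear"}


def canonical(stm):
    """Reduce an STM phrase to a tuple of canonical words."""
    words = stm.lower().split()
    return tuple(TERM_SYNONYMS.get(w, w) for w in words if w not in OPTIONAL_MODIFIERS)


def match_stms(fact_stms, question_stms):
    """Map each question STM to the first fact STM with the same canonical form, or None."""
    fact_index = {}
    for f in fact_stms:
        fact_index.setdefault(canonical(f), f)
    return {q: fact_index.get(canonical(q)) for q in question_stms}


# Fact: "Bob bought the movie that I liked but my wife hated"
fact_stms = ["that I liked", "that my wife hated"]
question_stms = ["that my dear wife despised", "that my son hated"]

matches = match_stms(fact_stms, question_stms)
for q, f in matches.items():
    print(q, "->", f)
```

Here “that my dear wife despised” pairs with “that my wife hated”, while “that my son hated” finds no partner. A real version would compare sub-trees and would still report ‘dear’ as a term modifier delta instead of silently dropping it.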
Oh... and the very last “piece of the puzzle” was that I built a web GUI to help with adding new ‘grammar production knowledge’. A functionality I will have to add really soon is hooking up the web GUI to the command-line “add new word” utility (which takes in, say, a verb form, conjugates it, asks you to confirm, and then adds it to its lexicon; same for adjectives and nouns). So it will be nice to add that to “Kned” (knowledge editor), the new GUI tool.
So... yeah... pretty darn busy on the project! What about you guys?!