A Harumi update will be released soon. Its features are described below:
A) When Harumi is given a sentence, she has to “understand” its meaning.
If I say something like “I went to the beach with my wife and children and while I was swimming, I saw a shark”,
any human being understands that the important part is “I saw a shark” and not “I went to the beach” or “wife”.
On the other hand, if I say “My wife loves me”, then “wife” becomes an important word in the sentence.
I wanted to give Harumi the ability to rank information by importance.
Here is how I did it:
Harumi splits every user request into single words and two-word combinations (my computer is too slow to handle combinations of more than two words).
Then she counts how often each of those words and combinations occurs in the database.
The best word or combination is the one that occurs least often in the database, as long as it occurs at least once.
With this simple idea, results were immediately far better when handling data in a large database.
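The selection rule above can be sketched in Python as follows. This is only a minimal sketch, not Harumi's actual code: the function names, the `occurrences` dictionary standing in for the database counts, and the choice of adjacent word pairs as the two-word combinations are all assumptions made for illustration.

```python
import re

def candidates(request):
    """Return the single words and adjacent two-word pairs of a request."""
    words = re.findall(r"[a-z0-9']+", request.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs

def best_candidate(request, occurrences):
    """Pick the candidate with the lowest non-zero count.

    `occurrences` is a hypothetical dict mapping a word or pair to the
    number of times it appears in the database.
    Returns None when no candidate is known at all.
    """
    known = [c for c in candidates(request) if occurrences.get(c, 0) > 0]
    if not known:
        return None
    # min() keeps the earliest candidate when several share the lowest count
    return min(known, key=lambda c: occurrences[c])

# Made-up counts, for illustration only
occurrences = {"i": 900, "saw": 80, "a": 1000, "shark": 2,
               "wife": 60, "my wife": 25}
```

With these made-up counts, best_candidate("I saw a shark", occurrences) returns "shark", and best_candidate("My wife loves me", occurrences) returns "my wife", because unknown words such as "loves" have a count of 0 and are ignored.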
B) Shortest stimulus is better
Here is another kind of problem I solved.
I had created the following stimuli:
my photos
my photos during 2012 holidays
my brother’s photos
When I asked Harumi “show me my photos”, she opened all three directories linked to the three stimuli.
I decided that when several stimuli match, the best one is the SHORTEST.
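A minimal sketch of this tie-break, under stated assumptions: a stimulus is taken to "match" when it contains every word of the chosen key, and "shortest" is measured in words first, then in characters. Neither detail is specified above, and the function name is hypothetical.

```python
def shortest_match(key, stimuli):
    """Among the stimuli containing every word of `key`, return the shortest.

    Length is compared in words first, then in characters (an assumption).
    Returns None when no stimulus matches.
    """
    key_words = key.lower().split()
    matches = [s for s in stimuli
               if all(w in s.lower().split() for w in key_words)]
    if not matches:
        return None
    return min(matches, key=lambda s: (len(s.split()), len(s)))

stimuli = [
    "my photos during 2012 holidays",
    "my brother's photos",
    "my photos",
]
```

Here shortest_match("photos", stimuli) returns "my photos", so only one directory would be opened instead of three.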
C) Answers fed back as stimulus
When several answers are linked to one stimulus, some of the data inside the answers cannot be reached directly from the stimulus. Example:
Stimulus column | answer 1 | answer 2 | answer 3
terminator | it’s a james cameron movie | Sarah Connor is John Connor’s mother | Arnold is the terminator
If you ask “do you know james cameron movie with arnold?”, you won’t get any answer, since the answers are out of reach.
So I decided to put the answers back into the stimulus, which changes the stimulus column to:
terminator *** james cameron movie sarah connor john arnold
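One way such a line could be generated from the answers is sketched below. It is an assumption, not the method actually used: the stop-word list and the function name are hypothetical, and the keywords in the line above may well have been chosen by hand.

```python
import re

# Hypothetical stop-word list; the real filtering rule is not given
STOP_WORDS = {"it", "s", "a", "is", "the"}

def build_stimulus(main, answers):
    """Append the distinct keywords of the answers after a '***' separator."""
    main = main.lower()
    seen = set(main.split())      # never repeat the main stimulus itself
    extra = []
    for answer in answers:
        for word in re.findall(r"[a-z0-9]+", answer.lower()):
            if word in STOP_WORDS or word in seen:
                continue
            seen.add(word)
            extra.append(word)
    return main + " *** " + " ".join(extra)

answers = [
    "it's a james cameron movie",
    "Sarah Connor is John Connor's mother",
    "Arnold is the terminator",
]
```

build_stimulus("terminator", answers) returns "terminator *** james cameron movie sarah connor john mother arnold". This differs from the line above only by keeping "mother", which this naive stop-word filter does not remove.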
Note that the “***” mark separates the main stimulus (before it) from the secondary words (after it). This ranking matters in cases like:
terminator *** james cameron movie sarah connor john arnold
hal 9000 *** 2001 space odyssey terminator
If you ask “who is terminator?”, Harumi will use the first stimulus, because terminator is its main subject, whereas it appears only as a secondary subject in the second line.
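The priority of the main subject over secondary subjects can be sketched like this. It is an illustrative sketch only: the function names are hypothetical, and matching on a single key word is an assumption.

```python
def split_stimulus(line):
    """Split a stimulus line into (main words, secondary words) around '***'."""
    main, _, secondary = line.lower().partition("***")
    return main.split(), secondary.split()

def pick_line(key, lines):
    """Prefer a line where `key` is a main subject over one where it is secondary."""
    key = key.lower()
    main_hits, secondary_hits = [], []
    for line in lines:
        main, secondary = split_stimulus(line)
        if key in main:
            main_hits.append(line)
        elif key in secondary:
            secondary_hits.append(line)
    ranked = main_hits + secondary_hits
    return ranked[0] if ranked else None

lines = [
    "hal 9000 *** 2001 space odyssey terminator",
    "terminator *** james cameron movie sarah connor john arnold",
]
```

Even though the hal 9000 line comes first in this list, pick_line("terminator", lines) returns the terminator line, because the word sits before the "***" mark there. pick_line("odyssey", lines) still returns the hal 9000 line through its secondary words.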
D) Context and “ghost stimuli”
I created a list of “ghost stimuli”, which are used only in word combinations, never on their own. Ghost stimuli are things like what Harumi just said, what she sees on the webcam, geolocation, and so on.
They help Harumi figure out the context of the discussion.
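As a rough illustration, ghost stimuli could be paired with each word of the request to form extra two-word combinations. This is a guess at the mechanism, not a description of Harumi's code: the function name, the order of the pair (ghost first) and the example context are assumptions.

```python
def ghost_combinations(request, ghosts):
    """Pair every ghost stimulus with every word of the request.

    Ghosts are never returned alone: they only appear inside a pair,
    so they add context without acting as a stimulus by themselves.
    """
    words = request.lower().split()
    return [f"{ghost} {word}" for ghost in ghosts for word in words]

# Hypothetical context: Harumi has just talked about "terminator"
ghosts = ["terminator"]
```

For the request "who is the director", this yields "terminator who", "terminator is", "terminator the" and "terminator director". These pairs could then be counted in the database like any other two-word combination.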