Posted: Jul 15, 2010 [ # 106 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
*Update*
I spent some time creating the following draft document tonight. It outlines some thoughts on creating sentences describing ‘where’ Walter is located.
http://www.chuckbolin.com/walter/Walter_Location.pdf
I’m open to ideas…please send them my way.
Communicating location is a ‘two-way’ process. Walter needs to describe his location in clear and varied terms. In addition, Walter must respond to commands to reposition himself to another location. This technique should suffice….but I need to test.
Regards,
Chuck
Posted: Jul 15, 2010 [ # 107 ]
Administrator | Total posts: 3111 | Joined: Jun 14, 2010
I can’t see anything I would change, add to, or remove from that, Chuck. It looks very good to me. I’d be interested in seeing the structure of the code that determines and outputs the responses, however.
Posted: Jul 15, 2010 [ # 108 ]
Senior member | Total posts: 974 | Joined: Oct 21, 2009
Nice, I see you are starting to focus on grammar and how it applies to ‘real world physics’. I notice you have “I am ....”. May I also suggest contractions? Walter should also sometimes say “I’m at the .....”.
Posted: Jul 15, 2010 [ # 109 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
Thanks!
At present I’m considering that every question (normal conversation) has a specific meaning or intent (ignoring sarcasm and other exceptions). However, there are multiple ways of expressing this meaning or intent. So, for the moment, my questions are translated into a specific function call unless the meaning is not understood.
So “Where are you?” is translated to a function call such as bRet = GetBotInformation(sResponse, CONST_WHERE_ARE_YOU). Of course there are multiple ways to answer this question. I’m experimenting at the moment with how to determine the most human like response. If Walter is ‘geeky’ and ‘silly’ (personality module) then he might choose something a bit off from normal.
You can see that the phrase, regardless of language, results in the same function call.
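That dispatch idea can be sketched roughly like this. This is a hypothetical Python illustration, not Walter’s actual code: only the constant name CONST_WHERE_ARE_YOU comes from the post, and the phrase table, handler, and responses are invented.

```python
# Hypothetical sketch of the intent-dispatch idea: many surface forms,
# in any language, map to one intent constant, which drives one handler
# (mirroring bRet = GetBotInformation(sResponse, CONST_WHERE_ARE_YOU)).
CONST_WHERE_ARE_YOU = "WHERE_ARE_YOU"

PHRASE_TO_INTENT = {
    "where are you?": CONST_WHERE_ARE_YOU,
    "where are you at?": CONST_WHERE_ARE_YOU,
    "donde estas?": CONST_WHERE_ARE_YOU,   # Spanish
    "wo bist du?": CONST_WHERE_ARE_YOU,    # German
}

def get_bot_information(intent):
    """Return (ok, response) for a recognized intent."""
    if intent == CONST_WHERE_ARE_YOU:
        return True, "I'm at the kitchen table."
    return False, ""

def answer(user_input):
    # Normalize the phrase, look up its intent, and dispatch.
    intent = PHRASE_TO_INTENT.get(user_input.strip().lower())
    if intent is None:
        # The "not understood" path: ask a follow-up instead of guessing.
        return "I don't understand. Can you rephrase that?"
    ok, response = get_bot_information(intent)
    return response if ok else "I don't understand. Can you rephrase that?"
```

The point of the table is exactly what the post describes: “Where are you?” and “Wo bist du?” both collapse to the same function call, and a personality module could later pick among several responses instead of the single canned one here.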
When a question is not understood, Walter will ask follow-up questions designed to identify the likely meaning. Of course I haven’t begun to work on this.
After I get this simple English “Where are you?” finished I’ll work on supporting Spanish and German…until it becomes too painful.
Thanks for the encouragement.
Regards,
Chuck
Posted: Jul 17, 2010 [ # 110 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
*Update*
I’ve started discussions with my son (comp sci guy) about uploading real time data about Walter to Walter’s very own web site. He has a few ideas on how to develop the database and to get the data from a PC app to the web site. I thought it would be interesting to check on Walter’s wanderings to and fro. More to come…
Smell! I’ve worked out a simple algorithm involving smells. The objective is to assign smell types to objects that are added in the virtual world. I’ll describe the smell ‘intensity’ by a radius. Within the radius, one cannot discern the direction of the smell very accurately. It’s omni-odiferous…I made this word up =) The algo takes in wind direction and wind speed to move the smell downwind.
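The smell idea sketches out to something like the following. All the function names, units, and numbers here are illustrative guesses, not Chuck’s actual algorithm: a smell source has an intensity radius, and the wind displaces the smell’s center downwind over time.

```python
import math

def smell_center(source_xy, wind_dir_deg, wind_speed, elapsed_s):
    """Drift the smell's center downwind from its source.

    wind_dir_deg is the direction the wind blows TOWARD, in degrees;
    wind_speed is in world-units per second (both assumptions).
    """
    rad = math.radians(wind_dir_deg)
    dx = math.cos(rad) * wind_speed * elapsed_s
    dy = math.sin(rad) * wind_speed * elapsed_s
    return (source_xy[0] + dx, source_xy[1] + dy)

def can_smell(bot_xy, source_xy, radius, wind_dir_deg, wind_speed, elapsed_s):
    """True if the bot is inside the (drifted) intensity radius.

    Inside the radius the smell is detectable but 'omni-odiferous':
    no reliable direction information, per the post.
    """
    cx, cy = smell_center(source_xy, wind_dir_deg, wind_speed, elapsed_s)
    return math.hypot(bot_xy[0] - cx, bot_xy[1] - cy) <= radius
```

A bot standing 10 units east of a source with a 5-unit radius smells nothing in still air, but does once an eastward wind has pushed the smell center 10 units toward it.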
TOO MANY TANGENTS!!! I spend a lot of time ‘not coding’ Walter. Consequently I find myself thinking about various aspects of this project. What I haven’t done is work out a design spec…a living doc…to capture my ideas. Maybe I’ll do that today…start it, that is.
I found a book on my son’s bookshelf called “Artificial Intelligence Illuminated” by Ben Coppin. I thought about reading that over the weekend to see if it stimulates any ideas.
I just spent 6 hours working in the high heat and humidity of South Carolina…and I don’t really care to do too much…so reading and writing a spec may be asking a bit too much. =)
Regards,
Chuck
Posted: Jul 18, 2010 [ # 111 ]
Administrator | Total posts: 3111 | Joined: Jun 14, 2010
Here’s hoping the heat/humidity don’t get the best of you, Chuck. Having lived in Tennessee for a while, I know what it feels like to experience 90°+ temp with 80%+ humidity, and I don’t envy you that in the least. Today, the temp was 92°, but the humidity is only 13%, and it was STILL too hot for me. We even had a nice breeze, but it didn’t help much. :(
I feel your pain, regarding the number of “tangents”, as I’ve also got to contend with them. I’ve taken to having a graph-paper tablet next to my desk, to write down things that I think of that need to be taken care of. Too bad the silly thing lies there, mostly unused. It takes more discipline than I currently have to make use of the notepad. Perhaps I need an on-screen “sticky-pad”, to type out my notes, rather than taking the time to stop what I’m doing, get the pad, and write things down. But, then again, that’s another of those dratted “tangents”, isn’t it? Still in all, that may save more time than it will take to write it. Hmmm…
And the circular thoughts begin…
Posted: Jul 18, 2010 [ # 112 ]
Senior member | Total posts: 974 | Joined: Oct 21, 2009
Humidity? Well, here in Ontario, Canada it has been 30 C, and 43 C (or 109 F) with the humidity... OK, to a Canadian that is way too much. Yes, it’s true, it’s usually cold here: 10 months of winter and 2 months of ‘bad hockey weather’ lol.
Chuck, I *more* than understand: documentation ALWAYS takes a “back seat” to development… I see some of the documentation I wrote on my bot and just laugh… how irrelevant that is now. The approach I am attempting, to avoid the “too many tangents” issue, is to have a bot that learns. I have decided I am not going to bother with the Loebner Prize; it will be too apparent that my bot is a bot, based on the fact that it won’t understand many complex words initially.
What I am focusing on, and I don’t know if this helps you any, is to not code a huge set of information, but rather have the bot learn, via chat, to understand new concepts.
In other words, and I have wanted to create a topic on this, I want to have NLP understanding of very basic concepts, a kind of “NLP language”. All ‘knowledge’ will be parse trees; all “if-then” rules will be parse trees. And all data in its “database”, or “knowledge base”, whatever you want to call it, will be parsed English sentences (yes, later any language).
Now, by “parse tree”, here is an example of my bot’s eval of the input “I went to town”:
pos = simple-sentence
num-predicate = 1
predicate1.num-verb = 1
predicate1.verb1.num-prep-phrase = 1
predicate1.verb1.prep-phrase1.num-object = 1
predicate1.verb1.prep-phrase1.num-prep = 1
predicate1.verb1.prep-phrase1.object1.val = town
predicate1.verb1.prep-phrase1.prep1.val = to
predicate1.verb1.val = went
subject.noun1.val = i
subject.num-noun = 1
There you see how it parses the input: it identifies the subject (“I”), the predicate verb “went”, and the prepositional phrase “to town” being applied to ‘verb1’ (‘went’).
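The dump format above looks like a nested tree flattened into sorted “key = value” lines. Here is one way that flattening could work, as a Python sketch; the tree below hand-encodes “I went to town”, and the scheme is only inferred from the output shown, not taken from Victor’s actual (Perl) code.

```python
# Hand-built nested parse tree for "I went to town", shaped to match
# the dotted paths in the dump above (an inference, not Victor's code).
tree = {
    "pos": "simple-sentence",
    "num-predicate": 1,
    "subject": {"num-noun": 1, "noun1": {"val": "i"}},
    "predicate1": {
        "num-verb": 1,
        "verb1": {
            "val": "went",
            "num-prep-phrase": 1,
            "prep-phrase1": {
                "num-prep": 1, "prep1": {"val": "to"},
                "num-object": 1, "object1": {"val": "town"},
            },
        },
    },
}

def flatten(node, prefix=""):
    """Yield (dotted-path, value) pairs for every leaf in the tree."""
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value

# Sorting reproduces the alphabetical ordering of the dump.
dump = sorted(flatten(tree))
```

Printing `f"{path} = {val}"` for each pair reproduces lines like `predicate1.verb1.prep-phrase1.object1.val = town` from the post.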
Posted: Jul 18, 2010 [ # 113 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
Hi,
I guess I can’t complain too much about the weather. “..bad hockey weather…” =) I traveled through NV in June back in ‘86. No AC. It was hot…but I didn’t break into a sweat….I guess it evaporated due to the low humidity.
I actually built my first parse tree in Excel VBA this week for work. I needed the flexibility for accounting data to ‘roll up’. So I’m ready to apply that knowledge to parsing in Walter.
Vic, I see your point about a ‘learning’ bot. The downside of course is the time required to ‘teach’ the bot. I understand several bots utilize the internet. Of course, training a bot this way is like allowing a bunch of misfits to raise your newborn baby. =) This could keep an administrator or team busy for years.
Dave’s bot Morti has a lot of topics with question patterns and responses. I believe all Dave can do with Morti is to add more topics and responses. Is that correct? I’ve not dug into Morti’s architecture or others similar to him.
Would it be possible to use the knowledge base of questions and answers with AIML (such as Morti) and then apply your concepts to your bot…so it actually digs deeper? It learns, stores and categorizes knowledge, all in conjunction with these AIML questions?
In fact, it occurs to me that I could build a simple C++ app that would chat with people using the AIML library. Perhaps ‘tweaking’ the code would drive me to add more functionality such as ‘learning’.
Just a bunch of rambling. My wife has some ‘special’ creamer in a big bottle for my coffee. =) I’m going to get some more.
Dave,
I’ve got several notebooks. My problem is I capture ideas…but then I use the paper to work out a lot of code. My ideas get lost. I’ve just reviewed some of my game specs…10 to 20 pages at the most. I really need to create a document, build the spec…and force myself to add ideas as they come about.
Regards,
Chuck
Posted: Jul 18, 2010 [ # 114 ]
Senior member | Total posts: 974 | Joined: Oct 21, 2009
Chuck Bolin - Jul 18, 2010: I understand several bots utilize the internet. Of course, training a bot this way is like allowing a bunch of misfits to raise your newborn baby. =)
This could keep an administrator or team busy for years.
LOL! I agree 100%. I am not going to have my bot learn from the net, at least not “unattended”. I may review a page before allowing my bot to consume it - a kind of screening process. An admin team busy for years - yes, absolutely. I have in front of me a lot of work to do in terms of creating GUI tools to manage what I’m calling the KEPT KB. KEPT = knowledge entry parse trees.
For example the fact that “Bob lives in Toronto” would be stored as
datetime_dayofweek_gmt = sunday
datetime_dayofweek_local = sunday
datetime_gmt = 2010.07.18,11:45:55
datetime_isdst_local = 1
datetime_local = 2010.07.18,07:45:55
num-predicate = 1
parse-tree-id = 1
pos = simple-sentence
predicate1.num-verb = 1
predicate1.verb1.num-prep-phrase = 1
predicate1.verb1.prep-phrase1.num-object = 1
predicate1.verb1.prep-phrase1.num-prep = 1
predicate1.verb1.prep-phrase1.object1.val = toronto
predicate1.verb1.prep-phrase1.prep1.val = in
predicate1.verb1.val = lives
subject.noun1.val = bob
subject.num-noun = 1
That parse tree will be matched up with the parse tree of the question “Where does Bob live?”
Yes, it may appear very much like ‘overkill’ to store things this way, instead of having a DB table called “Address” with a column for name and a column for city lived in, but I intend to keep the knowledge very ‘jagged’. Another KEPT could be the parse tree for something like:
“The IP address to access the antivirus dat file server for client XYZ, from network B is 192.168.2.15”
And then the question, “What’s the IP address to access the av server for XYZ?”
The system would come back with something like “From which network? From B, it would be 192.168.2.15”
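A minimal sketch of how the “Bob lives in Toronto” fact tree could answer “Where does Bob live?” might look like this. This is an illustration of the matching idea only, not Victor’s implementation: both trees are reduced to the flattened key/value form from his dumps, the question’s wh-word becomes an unknown slot, and every other key must agree.

```python
# Flattened fact tree for "Bob lives in Toronto" (keys copied from the
# dump in the post; timestamps omitted for brevity).
FACT = {
    "subject.noun1.val": "bob",
    "predicate1.verb1.val": "lives",
    "predicate1.verb1.prep-phrase1.prep1.val": "in",
    "predicate1.verb1.prep-phrase1.object1.val": "toronto",
}

# Hypothetical flattened question "Where does Bob live?": the verb is
# assumed lemmatized to "lives", and the prep-phrase object is the
# unknown slot the question asks about (None marks it).
QUESTION = {
    "subject.noun1.val": "bob",
    "predicate1.verb1.val": "lives",
    "predicate1.verb1.prep-phrase1.object1.val": None,
}

def answer_from_fact(question, fact):
    """Return the fact's value for the question's unknown slot,
    or None if the fact doesn't match the question."""
    unknown = None
    for key, qval in question.items():
        if qval is None:
            unknown = key          # the slot being asked about
        elif fact.get(key) != qval:
            return None            # mismatch: this fact doesn't apply
    return fact.get(unknown) if unknown else None
```

The same matcher would scan every stored KEPT; a fact about Alice simply fails the subject check and is skipped.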
This is going to be my bot’s first job - searching for information and digging through complex information to deduce a response.
At work, as you probably gathered from that sample sentence, I work in computer security: AV, firewalls, intrusion detection systems. We have a server dedicated to documents. Let me tell you, you have to guess and guess what the correct “search terms” are to find what you need; it is very annoying. My dream is to have a bot that can DISCUSS, clarifying what you need through conversation, to find a document, deduce a value, deduce a statement, whatever.
The reason a simple database table won’t work is that the statements of knowledge are too complex, and cannot be predetermined - English does not lend itself well to the ‘over-simplifications’ of a SQL table!
Here is a much more complex example, where we are using a “past participle phrase” to modify the subject noun “doctor” with phrase “called to the scene”...
the full sentence input is “A doctor, called to the scene, examined the injured man”...
here is the PT (parse tree):
parse-tree-id = 4
pos = simple-sentence
num-predicate = 1
predicate1.comma-prefix = true
predicate1.dcomp.noun1.adjective-list-type = space
predicate1.dcomp.noun1.adjective1.val = the
predicate1.dcomp.noun1.num-adjective = 1
predicate1.dcomp.noun1.num-verbal-past-participle = 1
predicate1.dcomp.noun1.val = man
predicate1.dcomp.noun1.verbal-past-participle1.val = injured
predicate1.dcomp.num-noun = 1
predicate1.num-verb = 1
predicate1.verb1.val = examined
subject.noun1.adjective-list-type = space
subject.noun1.adjective1.val = a
subject.noun1.num-adjective = 1
subject.noun1.num-past-participle-phrase = 1
subject.noun1.past-participle-phrase1.comma-prefix = true
subject.noun1.past-participle-phrase1.num-object = 1
subject.noun1.past-participle-phrase1.num-past-participle-verbal = 1
subject.noun1.past-participle-phrase1.num-prep = 1
subject.noun1.past-participle-phrase1.object1.adjective1.val = the
subject.noun1.past-participle-phrase1.object1.num-adjective = 1
subject.noun1.past-participle-phrase1.object1.val = scene
subject.noun1.past-participle-phrase1.past-participle-verbal1.val = called
subject.noun1.past-participle-phrase1.prep1.val = to
subject.noun1.val = doctor
subject.num-noun = 1
So the bot knows that the subject (doctor) did not actually DO the verb “called”, but instead is being modified by it… it was “the” doctor. What doctor? A doctor THAT WAS called to the scene.
dcomp means direct complement: the receiver of the action of the predicate verb.
Posted: Jul 18, 2010 [ # 115 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
Vic,
Is your code creating these parse trees shown above? This is very impressive. I’m guessing it is. So, have you put some real crazy sentences into the parser and evaluated the results? I mean, is there a point where the parser begins to choke and get it wrong?
One thing I realize is I need help with grammatical terms. =) I believe your wife is your grammatical guru? I need to reel my wife in to assist. She has homeschooled our kids for 22 years now…and is still doing it.
Storing knowledge in predefined and very specific tables isn’t really human-like…however, queries are much faster. I’m very interested in a memory system that is not a database-designed system. However, I have no idea how to go about that…or what the advantages are of doing so.
I understand the desire to implement some AI at work. I support a lot of financial analysts who spend a considerable amount of time preparing data before it is analyzed. I’ve started work on ‘pre-analysis’ code that will assist the analyst. In addition, the code scans a spreadsheet and finds data tables, classifies the type of data and issues a report on trends, spikes and other unusual things. This is in development but initial tests for some ‘alpha’ code seem quite promising.
Gotta go.
Chuck
Posted: Jul 18, 2010 [ # 116 ]
Senior member | Total posts: 974 | Joined: Oct 21, 2009
Chuck Bolin - Jul 18, 2010: Vic,
Is your code creating these parse trees shown above? This is very impressive.
I’m guessing it is.
Yes, it is, that is a copy-and-paste from the terminal window.
Chuck Bolin - Jul 18, 2010:
So, have you put some real crazy sentences into the parser and evaluated the results? I mean, is there a point where the parser begins to choke and get it wrong?
Not sure what makes a sentence ‘crazy’ but it does handle complex sentences, sentences with main clause/subordinate clauses, noun clauses, infinitives, gerunds, and as shown above, participles.
Does it ever get it wrong? Right now, it hardly ever gets it right! What happens is, as I mentioned in an earlier post, stage 2 generates all the ways that a sentence could theoretically be interpreted (according to grammar).
Then stage 3, or “meaning assignment”, is done. This stage involves many rules and associations of word properties. Most times, right now, since my bot does not know too much about the world, it does not know which parse tree is the right one. But I am putting more rules into it all the time; for example, it knows things like people in general wear clothes more than animals do, thus it deduces that “in my pajamas” probably applies to “I” rather than to the elephant in the sentence “I shot an elephant in my pajamas”.
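A toy version of that kind of “world knowledge rule” could look like the following. Everything here is invented for illustration (Victor’s actual parser is in Perl): each candidate attachment of the prepositional phrase gets a plausibility score, and the highest-scoring reading wins.

```python
# Words assumed to denote clothes-wearers (a tiny stand-in for real
# world knowledge about people vs. animals).
WEARERS = {"i", "you", "he", "she", "bob"}

def score_attachment(attach_to, phrase_noun):
    """Score the reading where phrase_noun ("pajamas") is worn by
    attach_to. Rule and weights are made up for illustration."""
    score = 0
    if phrase_noun == "pajamas":
        if attach_to in WEARERS:
            score += 2       # people wear clothes: plausible
        elif attach_to == "elephant":
            score -= 1       # animals in pajamas: implausible
    return score

# "I shot an elephant in my pajamas": two candidate attachments for
# the phrase "in my pajamas".
candidates = {
    "i": score_attachment("i", "pajamas"),
    "elephant": score_attachment("elephant", "pajamas"),
}
best = max(candidates, key=candidates.get)
```

With these weights the “I” attachment wins, matching the deduction described in the post; a real system would accumulate scores from many such rules across the whole set of candidate parse trees.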
So, right now, it hardly gets a very complex sentence correct. I am not sure how long it will take, perhaps I will never have all the world knowledge it will take, to make it understand every possibility.
The thing I would like to do is build a nice quick GUI, instead of what I do right now, where I have to go into the code and add a ‘world knowledge rule’, or add ‘world knowledge data’ by editing Perl scripts and the text files that give word definitions. That will speed up the process immensely. Right now all work on the chatbot is done with a text editor for the code, running it from the command line (no GUI). I didn’t want to work on anything else except the very heart of the parsing logic.
Chuck Bolin - Jul 18, 2010:
One thing I realize is I need help with grammatical terms. =) I believe your wife is your grammatical guru?
As part of this project, before I started writing the latest design of a parser, I spent 2 months re-learning English formally! My wife isn’t the grammar guru but, when I told her about my project, she went and dusted off her “Webster’s Encyclopedic Dictionary of the English Language”... and opened it to the 50-page chapter on grammar… “Here you go… everything you need to know!” She provided that to me, oh, summer of 2008, and I have been working from it ever since.
Chuck Bolin - Jul 18, 2010:
Storing knowledge in predefined and very specific tables isn’t really human like…however queries are much faster.
Yes, tables are designed to be quick; databases use indexes, which are great for lookup efficiency. But like everything in life, there is a trade-off: efficiency versus flexibility!
Posted: Jul 18, 2010 [ # 117 ]
Administrator | Total posts: 3111 | Joined: Jun 14, 2010
I could spend all day, listening to these types of conversations. Too bad my sleep cycle interferes.
@Victor
Fascinating stuff. I’m learning more all the time. Thank you.
I noticed in one of your parse trees that you have the date being stored several times, in different ways. I’m wondering if it would make sense to simply store the UNIX timestamp for GMT, and derive any date strings from that, programmatically. Manipulating strings is much faster than retrieving data, in most cases.
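Dave’s suggestion can be sketched like this in Python (an illustration only; the field names mimic Victor’s dump, such as `datetime_gmt = 2010.07.18,11:45:55`, and the derivation is a guess at how it might work):

```python
import datetime

def derived_datetime_fields(epoch_seconds):
    """Derive the human-readable date fields from a stored Unix
    timestamp (UTC), instead of storing each string separately."""
    dt = datetime.datetime.fromtimestamp(
        epoch_seconds, tz=datetime.timezone.utc
    )
    return {
        "datetime_gmt": dt.strftime("%Y.%m.%d,%H:%M:%S"),
        "datetime_dayofweek_gmt": dt.strftime("%A").lower(),
    }
```

The local-time variants (`datetime_local`, `datetime_isdst_local`, and so on) could be derived the same way by converting the one stored timestamp into the local timezone on demand.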
@Chuck
We seem to have the same “failings”, when it comes to keeping track of notes. My desk is so cluttered with miscellaneous bits of paper, none of which I dare to toss out, in case it contains something important. It makes it hard to work, sometimes. :(
Posted: Jul 18, 2010 [ # 118 ]
Senior member | Total posts: 974 | Joined: Oct 21, 2009
@Dave - thanks. This is no small programming project… easily the longest-term one I have ever embarked on… multi-year, perhaps multi-decade. I’m not sure if I will simply store Unix epoch time; hard drive space is SO very cheap nowadays, $100 for 1 TB, that I may just as well store it both as a Unix timestamp, for relative calculations, and also spelled out as month, day, year, just for debugging and human readability.
Posted: Jul 18, 2010 [ # 119 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
Hi,
Well, I put a couple of hours into creating a design spec. I’ve been working on it this afternoon. I’m taking all those notes from various sources and blending them together. In fact, I found some written notes in my Bible. I guess I should have been listening to the sermon. =)
I’ve decided that Walter is my long-term ‘hobby’ project. Like Vic I could see it lasting for some time. However, it does free me to enjoy other interests knowing that there is no specific deadline. Dave, how long have you been working with Morti?
50 page chapter on English grammar? Wow! I guess I need to delve into that topic.
Back to writing.
Chuck
Posted: Jul 19, 2010 [ # 120 ]
Senior member | Total posts: 257 | Joined: Jan 2, 2010
*Update*
Here’s the initial specification. It’s 26 pages long and consists of various notes I’ve put together. I’ll use it as a basis for future design and development. It is not a complete specification but rather my initial efforts to think this through.
http://www.chuckbolin.com/walter/spec.pdf
Regards,
Chuck