AI Zone: chatbots.org


Tables and Loebner questions

Posted: May 8, 2011

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


Chatbots fall into two broad classes, the hand-crafted and the brute-force. Hand-crafted bots, like ALICE and SUZETTE, have the advantage of being able to create a consistent persona at the cost of having to hand author everything. Brute-force bots either harvest the internet or people’s chat, like Cleverbot. They have the advantage of it being easy to add new data at a cost of having that data be completely schizophrenic.

Historically, brute-force wins in AI every time. It won in chess, it is winning in Go, it is winning in language translation and search, and it wins in manufacturing. It wins because it overlooks nothing and costs less to produce. All it requires is powerful enough hardware, and time is in its favor for that.

So we who tilt at windmills with our hand-crafted bots strive to improve the ease of adding rules. My focus in ChatScript for that is the table. Tables make it easy to add data. Take, for instance, the simple question of “what is your favorite xxx”. This kind of question showed up twice in last year’s qualifiers, first as food and then as color.

Writing patterns for “what is your favorite color” is easy. And if you put all the patterns that represent that meaning in a macro that takes the “what” as an argument, it becomes something like

?: ( ^WHATFAVORITE(color)) My favorite color is green.

But imagine doing some 400 favorites, in a variety of different topics. You have to scroll around to the right topic and type in that whole pattern each time. Or you can be lazy and use a table. I create a topic of “favorite” and use a table to arrange the data. I can accommodate single- and double-word phrases. E.g.

Table: favorites (^topic ^modifier ^what ^result)
^createfact((^modifier ^topic ^what) favorite ^result )
DATA:
~books _ book "My favorite book is Time Enough for Love."
~books Dr._Seuss book "I like The Cat in the Hat."

Assuming you have patterns that detect the structure of “what is your favorite xxx”, you can write them to detect one- or two-word requests, and store the base request in _1 and the modifier (if there is one) in _0. If there is no modifier, just store the _ character. Then you can do a query like this:

query(direct_vo ? favorite _1 )

which retrieves all favorite facts about _1. You can then loop through the facts and see which, if any, match the modifier _0. On finding a match, you print out the result AND you set the current topic to the topic associated with the fact. That way, conversation continues in a related subject area. So while the match was in the favorites topic, the continuation is in the topic appropriate to the answer.
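To make the retrieve-then-filter step concrete, here is a minimal Python sketch. It is a model of the idea, not ChatScript code: facts are plain tuples, the nested subject mirrors the (^modifier ^topic ^what) shape from the table, and the names create_facts and answer_favorite are invented for illustration. How the real query reaches the base item inside the nested subject is not modelled.

```python
# Illustrative model only: a fact is (subject, verb, object), and the subject
# of a "favorite" fact is itself a nested (modifier, topic, what) tuple.
TABLE_ROWS = [
    # (topic, modifier, what, result)
    ("~books", "_", "book", "My favorite book is Time Enough for Love."),
    ("~books", "Dr._Seuss", "book", "I like The Cat in the Hat."),
]


def create_facts(rows):
    """Mimic one createfact call per table row."""
    return [((modifier, topic, what), "favorite", result)
            for topic, modifier, what, result in rows]


def answer_favorite(facts, what, modifier="_"):
    """Return (reply, topic) for the first fact matching what and modifier."""
    for (fact_modifier, topic, fact_what), verb, result in facts:
        if verb == "favorite" and fact_what == what and fact_modifier == modifier:
            # The topic travels with the fact, so the caller can switch to it.
            return result, topic
    return None


facts = create_facts(TABLE_ROWS)
reply, next_topic = answer_favorite(facts, "book", "Dr._Seuss")
print(reply)
print(next_topic)
```

Returning the topic alongside the reply is what lets the conversation continue in the subject area of the answer rather than in the favorites topic.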

You are allowed a shortcut in a table for a bunch of synonyms, like this:

~music [rock musical music] group "I like Kiss."

Which generates separate facts for each modifier.
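The expansion can be pictured with a small Python sketch. The row format it parses is a simplified guess at the table line above (four whitespace-separated fields, the last one quoted), and expand_row is a made-up helper, not part of ChatScript.

```python
import re

# topic, then either a [bracketed list] or a single token, then what, then "result"
ROW = re.compile(r'^(\S+)\s+(\[[^\]]*\]|\S+)\s+(\S+)\s+"(.*)"$')


def expand_row(line):
    """Turn one table row into one fact per modifier in the bracketed list."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    topic, modifier, what, result = match.groups()
    if modifier.startswith("["):
        modifiers = modifier[1:-1].split()
    else:
        modifiers = [modifier]
    return [((m, topic, what), "favorite", result) for m in modifiers]


facts = expand_row('~music [rock musical music] group "I like Kiss."')
for fact in facts:
    print(fact)
```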

I improve on this notion by adding the why in the result, like “I like dogs because they are cute”, and then “^burst” the result so as to get the part before because and the part after. I randomly display the full sentence, or just the first part, hoping to tease the user into asking why. The rejoinder will then display the second part, if there is one.
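A rough Python sketch of that tease-then-rejoinder idea follows. It only models the splitting and the random choice; split_reason and respond are hypothetical names, and the 50/50 split is an assumption, since the post does not say how often the short form is used.

```python
import random


def split_reason(result):
    """Split 'claim because reason' into (claim, reason); reason is None if absent."""
    claim, separator, reason = result.partition(" because ")
    if not separator:
        return result, None
    return claim, reason


def respond(result, rng=random):
    """Return (text_to_say, rejoinder). The rejoinder is None when nothing is held back."""
    claim, reason = split_reason(result)
    if reason is not None and rng.random() < 0.5:
        # Say only the first part and keep the reason for a later "why?".
        return claim + ".", "Because " + reason
    return result, None


print(split_reason("I like dogs because they are cute."))
```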

I further improve on this sometimes by allowing concepts into the table.

~computers _ ~chatbotlist "My favorite robot is the computer on Star Trek. It was always so easy to fool."

But this requires changing the query, because the base item will not be a word of the sentence. The query must see if the base item is ultimately a member of the concept. The query becomes query(direct_v<o ? favorite _1 ). This query looks for facts where the verb is given but the object can trace a path from left to right ( < means left to right), using membership. This sophisticated search will see if the word given is a member of what is given as the object. So if the actual object given is “chatbot”, it will check each favorite fact to see if chatbot is ultimately a member of the object. In this case ~chatbotlist is a list of synonyms for chatbot.
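The membership test can be sketched in Python as a mark-then-check pass. This is a simplified model, not the engine's code: the member facts for ~chatbotlist are invented, the subject labels are placeholders, and the base item is placed in the object field purely so the object test has something to act on.

```python
# Hypothetical membership facts: (word, "member", concept)
MEMBER_FACTS = [
    ("chatbot", "member", "~chatbotlist"),
    ("robot", "member", "~chatbotlist"),
]

# Hypothetical favorite facts with the base item in the object field.
FAVORITE_FACTS = [
    ("computers-answer", "favorite", "~chatbotlist"),
    ("books-answer", "favorite", "book"),
]


def mark_upward(word, member_facts):
    """Mark the word and every concept reachable upward through member facts."""
    marked, queue = {word}, [word]
    while queue:
        current = queue.pop()
        for child, verb, parent in member_facts:
            if child == current and verb == "member" and parent not in marked:
                marked.add(parent)
                queue.append(parent)
    return marked


def query_verb_object_upward(facts, verb, word, member_facts):
    """Keep facts with the given verb whose object carries a mark."""
    marked = mark_upward(word, member_facts)
    return [fact for fact in facts if fact[1] == verb and fact[2] in marked]


print(query_verb_object_upward(FAVORITE_FACTS, "favorite", "chatbot", MEMBER_FACTS))
```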

And while I haven’t bothered to code it yet, one could take this even further…

~books ~authors book xxxx

I could write script xxxx (where xxxx is actually script and not a sentence) that would look up a specific author, find the facts about books that author wrote, and respond by randomly picking one of them.

That is, having stuff in multiple tables allows you to interact knowledgeably with it. Unlike writing rules where the answer is stuck in a specific rule and you can’t do anything else with it.

Posted: Jun 14, 2012

[ # 1 ]

Ed UF

Experienced member

Total posts: 41

Joined: May 18, 2012


I would like to print or save the results of a query like the above favorite-book table (with nested facts). How would I do that?
Given
query(direct_vo ? favorite _1 )

I did
_6 = ^first(@0all)
_6 _7 _8

and as expected _6 gives a number. So I tried
^fact(6 subject) ^fact(6 verb) ^fact(6 object)
but that returns words like Ariel homophone areal, which I don’t know where they came from.

BTW the manual, under Facts of Facts, states “You can decode that by using ^fact( 3 subject) or ^fact(_3 verb) or ^fact(_3 object) ...”. The two latter cases with _3, is that a typo? There’s also “_1 = ^last(@1) ” where maybe a + or - or all is missing?

Thank you!

Posted: Jun 14, 2012

[ # 2 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


The typo in the manual is the FIRST one.
^fact(_6 subject) would be the correct form, not ^fact(6 subject). You need to pass the fact id currently stored in the variable to the function.

Or you may be looking for ^unpackfactref

(from the manual):

^Unpackfactref examines facts in a set and generates all fact references from it. That is, it
lists all the fields that are themselves facts.
@1 = ^unpackfactref( @2)
All facts which are field values in @2 goto @1.
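The difference between passing the literal 6 and passing the id held in _6 can be illustrated with a toy fact store in Python. Everything here is invented for the illustration: the ids, the dictionary layout, and the helper names. It only shows why a literal id lands on some unrelated fact, and what collecting fact-valued fields looks like.

```python
# Toy fact store: id -> (subject, verb, object). Ids are made up.
FACTS = {
    6: ("Ariel", "homophone", "areal"),            # some unrelated built-in fact
    9001: ("Dr._Seuss", "~books", "book"),         # nested fact used as a subject
    9002: (9001, "favorite", "I like The Cat in the Hat."),
}
FIELDS = {"subject": 0, "verb": 1, "object": 2}


def fact_field(fact_id, field):
    """Look up one field of the fact with the given id."""
    return FACTS[fact_id][FIELDS[field]]


def unpack_fact_refs(fact_ids):
    """Collect the ids of facts that appear as field values of the given facts."""
    return [value
            for fact_id in fact_ids
            for value in FACTS[fact_id]
            if isinstance(value, int) and value in FACTS]


variable_6 = 9002  # what the match variable holds after taking a fact from the set

print(fact_field(6, "subject"))           # literal 6: an unrelated fact
print(fact_field(variable_6, "object"))   # the id held in the variable
print(unpack_fact_refs([variable_6]))     # nested fact ids
```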

Posted: Jun 14, 2012

[ # 3 ]

Francis Wang

Experienced member

Total posts: 52

Joined: Apr 9, 2012


Thanks for the examples on tables. Tables are a fantastic way to isolate knowledge from language patterns. Following your examples, I created a table and a query, but could not get any result.

==========================================================
Table: user-feeling (^emotion ^result)
^createfact(user-feeling ^emotion ^result)
DATA:
~feeling_happy I_am_glad_you_are_in_a_good_mood.
~feeling_angry Don’t_be_upset.
~feeling_sad Don’t_be_sad._Be_Happy!
==========================================================
^query(direct_s<v user_feeling happy)

I was hoping that ‘happy’ would be matched to the concept ~feeling_happy, but it just returned
an empty set. Is the syntax for ^query correct?

Thanks

Posted: Jun 14, 2012

[ # 4 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


You really want the answer to this? There are LOTS of issues.

First and foremost, you store facts using “user-feeling” and retrieve them using “user_feeling”. Not the same.

Not that I know why you want this as a table instead of a rule…. but that’s your business I suppose.

Esoteric queries have less documentation in the manual, and exist primarily in queries.txt.

According to the documentation there about direct_s<v, it will scan up from the subject value in the
query and mark all words found. It will then find all facts with the VERB argument of the query,
and locate which ones have marked subjects. THEREFORE your use is incorrect.

s< means fan out from subject. user-feeling is not a word and will NEVER have facts above it,
so using direct_s<v has no real meaning for scanning.

What you WANTED was for happy to be scanned up, marking ~feeling_happy. So you needed the facts to be

==========================================================
Table: user-feeling (^emotion ^result)
^createfact(^emotion user-feeling ^result)
==========================================================

and the query to be

==========================================================
^query(direct_s<v happy user-feeling)
@0object
==========================================================

That works.
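A small Python model shows why the first attempt returned nothing and the reordered facts succeed. It assumes, for illustration only, that happy is a member of ~feeling_happy; the function names are invented and the scan is a simplification of what the post describes for direct_s<v.

```python
# Hypothetical membership fact: happy belongs to the concept ~feeling_happy.
MEMBER_FACTS = [("happy", "member", "~feeling_happy")]


def mark_upward(word):
    """Mark the word and every concept above it via member facts."""
    marked, queue = {word}, [word]
    while queue:
        current = queue.pop()
        for child, _verb, parent in MEMBER_FACTS:
            if child == current and parent not in marked:
                marked.add(parent)
                queue.append(parent)
    return marked


def subject_upward_verb(facts, subject_word, verb):
    """Scan up from the subject word, then return objects of facts with that verb and a marked subject."""
    marked = mark_upward(subject_word)
    return [obj for subj, fact_verb, obj in facts if fact_verb == verb and subj in marked]


REPLY = "I_am_glad_you_are_in_a_good_mood."
original_facts = [("user-feeling", "~feeling_happy", REPLY)]
reordered_facts = [("~feeling_happy", "user-feeling", REPLY)]

# Original attempt: wrong spelling, and nothing sits above user_feeling to mark.
print(subject_upward_verb(original_facts, "user_feeling", "happy"))
# Reordered facts: happy is scanned up to ~feeling_happy, which is now the subject.
print(subject_upward_verb(reordered_facts, "happy", "user-feeling"))
```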

Posted: Jun 15, 2012

[ # 5 ]

Francis Wang

Experienced member

Total posts: 52

Joined: Apr 9, 2012


Thanks for pointing out my mistakes. I realize now that you can probably do it directly as rules. However, I still would love to understand ^queries better, as I think having concepts within tables is fantastic. I am not really getting the documentation in queries.txt, so I am just trying to guess how it works.

==========================================================
Table: user-feeling (^emotion ^result)
^createfact(^emotion user-feeling ^result)
————————————————————————————————————————————————————
^query(direct_s<v happy user-feeling)

This works great. However, if I had the table as:
==========================================================
Table: user-feeling (^emotion ^result)
^createfact( user-feeling ^emotion ^result)
————————————————————————————————————————————————————
Is it possible to formulate a query?

I think what I need in this case is to scan from the verb. I tried:
^query(direct_sv< user-feeling happy)

But that has illegal syntax.

*** One more question ***
In general, does using concepts within a table incur a greater performance hit than simply using rules?

Thanks

Posted: Jun 15, 2012

[ # 6 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


So, first: why do you resist using user-feeling as the verb of your fact, since it clearly works?

Second, queries are a mini-programming language (and quite terse) which is not well documented. I have never defined direct_sv<, though I could, because once I had one order that worked, I didn’t need another. Queries allow you to traverse an arbitrary graph formed by facts. For simple queries, they are like SQL. For complex traversals, they are much more efficient than SQL could be.

there are 3 mechanisms available to find answers:
1. rules in a topic.
2. queries
3. an idiom table (though not yet available for output, it could be made such). An idiom table is analogous to rules, in that it will execute patterns.

#1 & #3 would cost similar amounts of cpu. #2 depends on the query, which can traverse arbitrary numbers of facts in complex ways. In GENERAL, queries will be more expensive, though it may or may not be noticeable. That’s because they may have to pass thru an arbitrary number of facts to accomplish the result.

For example, in your case, if you have a rule
u: (~feeling_happy) message about being happy
Then the cost is: getting into a topic if the rule is not part of a topic you visit anyway (cheap), setting up the rule (cheap), and executing the pattern (cheap).

For the corresponding fact notation, the setup for THIS query is probably similar to the setup for topic+rule, but
it will have to scan thru all facts derived up from happy (some 30 facts) to mark them. This overhead is paid by the query, but is “free” to a rule since it is performed once on the input sentence for the entire rule system.

Then the query will look at all facts on your user-feeling to check for marked subjects. Since you didn’t limit the query to 1 answer, it will look at all 3 or 4 of your facts even after it has an answer, because it doesn’t know that is the ONLY answer. You DO know that, so you could limit it to 1 answer.

Ergo, queries are slower than rules. (BUT everything is really fast in general).

Posted: Jun 15, 2012

[ # 7 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


The basic operations of the query system are:

to mark words with a distinctive mark
to queue words whose facts you will analyze
to fan out from a word (marking or queueing or both), typically up via member and is facts, but it's arbitrary
to test a fact for the presence or absence of a distinctive mark on any field
to queue facts as answers or for further analysis

These are done in various combinations

We mark words we WANT to see in a fact. We can mark words we DON'T want to see in a fact.
So boring queries look for all facts whose subject and verb are directly given.
Mildly interesting queries look for all facts whose verb and object are given (e.g. ? examplar France) and for those facts see which facts of the subject have a given verb/object (e.g. member ~capital) and store those as results. That query finds the capital of France by looking at all things in France and seeing which are members of ~capital.
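The capital-of-France example can be walked through in a few lines of Python. This is only a sketch of the two-stage filtering: the facts about Paris, Lyon and Berlin are sample data, the verb spelling follows the post, and capital_of is an invented name.

```python
# Sample facts as (subject, verb, object) triples.
FACTS = [
    ("Paris", "examplar", "France"),
    ("Lyon", "examplar", "France"),
    ("Paris", "member", "~capital"),
    ("Berlin", "member", "~capital"),
]


def has_fact(facts, subject, verb, obj):
    """True if the exact triple is present."""
    return (subject, verb, obj) in facts


def capital_of(facts, country):
    """Stage 1: things in the country. Stage 2: keep those that are members of ~capital."""
    return [subject
            for subject, verb, obj in facts
            if verb == "examplar" and obj == country
            and has_fact(facts, subject, "member", "~capital")]


print(capital_of(FACTS, "France"))
```

Lyon passes the first stage but has no member fact for ~capital, and Berlin has the member fact but fails the first stage, so only Paris survives both.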

Posted: Jun 15, 2012

[ # 8 ]

Francis Wang

Experienced member

Total posts: 52

Joined: Apr 9, 2012


Thanks for the quick response. I think I got it now. I was using “user-feeling” as the subject because in some cases I use the subject-verb-object triplet as a way to represent a relationship where ‘subject’ is the relationship name, and I wanted to be consistent.

This thread is very informative and I look forward to more of this type of (best-practice) discussions.

Posted: Jun 15, 2012

[ # 9 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


Got it. Best practice discussions will depend on someone asking an appropriate question.

In general, where there is an asymmetry in query performance, the system presumes the VERB has few choices and little inheritance to scan, while the subject and object fields have many. That is, it “assumes” English SVO sentence order, since NOUNS have complex hierarchy relationships and there are more of them. So I generally put the “verb” or the “lesser complexity” item in the verb position of the fact. In fact, the system IMPLICITLY quotes the verb field, so it cannot wander. (When you supply a concept name to a query, it will OFTEN mean “mark every member of the concept”, and you have to quote it if you don't want that behavior. The verb is implicitly quoted.) So the behavior of the system is not the same for all fields.

Posted: Jun 15, 2012

[ # 10 ]

Bruce Wilcox

Moderator

Total posts: 2372

Joined: Jan 12, 2010


In the user manual, I have generally said that the “recommended” way to organize facts is
SUBJECT is the subcomponent, VERB is the relationship, and OBJECT is the superclass. That’s because that order is ALREADY forced upon you by the internal system facts for concepts (MEMBER) and dictionary (IS). So keeping that notion means you don’t get confused about which way things are stored.

Similarly, when you supply multiple field values, the system must decide which field to use to find facts from. It prefers SUBJECT to verb, because it presumes the facts of a particular noun will be fewer than the facts on a more generic verb. E.g., for (dog like cat) it assumes there will be fewer facts with dog as the subject than there will be with like as the verb.
