Topic systems.
 
 

Hey guys, I wanted to discuss the various topic systems, not necessarily topic design or how conversations are laid out.

ChatScript, AIML and SuperScript/RiveScript all support topic systems, and it has been said that the topic system ChatScript uses gave it the edge this year in the Hugh Loebner Prize.

RiveScript and SuperScript have an explicit topic system and the author has to manually direct the conversation into each topic, while ChatScript uses other magic to navigate from topic to topic.

There are advantages to both approaches, but having a more rigid system leaves a lot of gambits outside of the flow (or puts more work on the botmaster to author the gambits to work better for certain use-cases). I have extended RiveScript to include pre topics and post topics that can wrap the main topic if there are no matches.

I’m hoping Dave and Steve can jump in and explain some of the inner workings or ideas behind ChatScript / AIML.

Ideally, a system that can easily handle the Loebner screener questions, which jump from topic to topic, AND deeper conversation diving into a single topic seems to be worth striving for.

I was thinking perhaps of building a TF-IDF document for each topic and searching for the best topic based on the input nouns, keywords or some other metric.
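
A minimal sketch of that TF-IDF idea, in Python rather than any of the bot languages above. Everything here is an assumption for illustration: the topic names, the word lists in `TOPIC_DOCS`, the smoothed IDF formula, and the rule of simply summing the weights of the input words.

```python
import math
from collections import Counter

# Hypothetical topic "documents": all triggers and replies of a topic as one bag of words.
TOPIC_DOCS = {
    "sports": "glide parachute skydiver jump plane parachute sport",
    "food": "bread apple beef cook eat dinner recipe",
    "work": "job office boss salary meeting job",
}

def tokenize(text):
    return text.lower().split()

def build_tfidf(topic_docs):
    """Return {topic: {term: tf-idf weight}} for a small set of topic documents."""
    counts = {topic: Counter(tokenize(doc)) for topic, doc in topic_docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many topics does each term appear?
    doc_freq = Counter(term for c in counts.values() for term in c)
    weights = {}
    for topic, c in counts.items():
        total = sum(c.values())
        weights[topic] = {
            # Term frequency times a smoothed inverse document frequency.
            term: (n / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, n in c.items()
        }
    return weights

def best_topic(user_input, weights):
    """Sum the weights of the input words per topic; return None if nothing matches."""
    words = tokenize(user_input)
    scores = {t: sum(w.get(word, 0.0) for word in words) for t, w in weights.items()}
    topic, score = max(scores.items(), key=lambda kv: kv[1])
    return topic if score > 0 else None

WEIGHTS = build_tfidf(TOPIC_DOCS)
print(best_topic("i want to eat some bread", WEIGHTS))
```

Filtering the input down to nouns first, as suggested above, would only change `tokenize`.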

If this has been discussed before, I apologize in advance.

Rob

 

 
  [ # 1 ]

Rob,
Most of the AIML systems I have seen use an explicit topic selection system (the botmaster sets the topic trigger in the category), often in the ‘think’ section.

http://www.pandorabots.com/botmaster/en/tutorial?ch=4

For implicit topics (chosen on the fly by the input), you might be able to get away with only a keyword-based system.
TF-IDF might be overkill.
KEYWORDS->TOPIC

ChatScript has a Noun/Verb ontology, which is basically a set of keywords.
topic: ~AERIAL_SPORT (glide parachute skydiver static_line ~aerial_sports )
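
As a rough illustration of KEYWORDS->TOPIC, here is a small Python sketch. It is not how ChatScript stores its data. The members of `~aerial_sports` and the `~food` topic are made up for the example, and the helper names (`expand`, `build_index`, `topics_for`) are hypothetical.

```python
# Hypothetical tables; names starting with "~" refer to other concepts.
CONCEPTS = {
    "~aerial_sports": {"skydiving", "paragliding", "hang_gliding"},  # invented members
}
TOPIC_KEYWORDS = {
    "~aerial_sport": {"glide", "parachute", "skydiver", "static_line", "~aerial_sports"},
    "~food": {"bread", "apple", "beef"},
}

def expand(keywords, concepts, seen=None):
    """Flatten a keyword set, replacing ~concept references with their member words."""
    seen = set() if seen is None else seen
    words = set()
    for kw in keywords:
        if kw.startswith("~"):
            if kw not in seen:  # guard against circular concept references
                seen.add(kw)
                words |= expand(concepts.get(kw, set()), concepts, seen)
        else:
            words.add(kw)
    return words

def build_index(topic_keywords, concepts):
    """Invert the tables into word -> set of topics for direct lookup."""
    index = {}
    for topic, keywords in topic_keywords.items():
        for word in expand(keywords, concepts):
            index.setdefault(word, set()).add(topic)
    return index

def topics_for(sentence, index):
    """Collect every topic that has at least one keyword in the sentence."""
    found = set()
    for word in sentence.lower().split():
        found |= index.get(word, set())
    return found

INDEX = build_index(TOPIC_KEYWORDS, CONCEPTS)
print(topics_for("I love paragliding", INDEX))
```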

 

 
  [ # 2 ]

ChatScript's topic mechanism is that you associate words or collections of words (concepts) with a topic. The engine can match sentences to possible topics via most matches and longest matches (not counting rejoinders, which are rules firing off the most recent rule). You can write any control structure you want; the default one tries to find rejoinders first, responders of the current topic(s) second, responders of the most matching topics next, then quibbles, then any gambit from the most matching topic, and finally gambits from the current topic. The ontology is merely a way to predefine a collection of concepts.
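
The default order described above can be pictured as a priority chain: try each stage in turn and stop at the first one that produces output. The Python below is only a sketch of that shape, not ChatScript's control script. The stage functions are toy stand-ins with made-up replies, and all names are hypothetical.

```python
def first_response(stages, user_input):
    """Run stages in priority order; return (stage_name, reply) for the first that answers."""
    for name, stage in stages:
        reply = stage(user_input)
        if reply is not None:
            return name, reply
    return None, None

# Toy stand-ins; a real engine would run pattern matching against rules here.
def rejoinders(text):
    return "Why not?" if text == "no" else None

def current_topic_responders(text):
    return "I like parachutes too." if "parachute" in text else None

def matching_topic_responders(text):
    return "Bread is great." if "bread" in text else None

def quibbles(text):
    return "Why do you ask?" if text.endswith("?") else None

def matching_topic_gambit(text):
    return None  # no other topic has an unused gambit in this toy example

def current_topic_gambit(text):
    return "Have you ever gone skydiving?"

DEFAULT_ORDER = [
    ("rejoinder", rejoinders),
    ("current-topic responder", current_topic_responders),
    ("matching-topic responder", matching_topic_responders),
    ("quibble", quibbles),
    ("matching-topic gambit", matching_topic_gambit),
    ("current-topic gambit", current_topic_gambit),
]

print(first_response(DEFAULT_ORDER, "do you like bread?"))
```

Because the list is plain data, reordering or dropping stages is how one would express a different control structure in this sketch.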

 

 
  [ # 3 ]

Thanks Bruce, and Merlin.

So I spent some time tonight looking through the ChatScript code, and your description makes a lot of sense. Naturally rejoinders, then responders (AIML thats, RiveScript previous) first, but it breaks down after that. “responders of the most matching topics next”. Does this mean topics that have more than one matching trigger (rule) in a given topic?

Perhaps a sidebar here into topic design. I’m assuming you would want to have a wildcard match in each topic? But that might not yield the best answer to a question from another topic. Is that best practice?

Quibbles, the way I see them implemented in 4.8, seem like a good way to catch off-topic questions and people testing the strength of the bot, but those would need to be tested before any wildcard gambit in the current topic.

Let me phrase the question differently, if you could build the system from scratch again, what would you do differently or what would you add/change?

And have you given any thought to embedding state into the topics? I’m toying with the idea of being able to filter out topics (and rules / replies) by adding a custom truth function/test.
Some use-cases I see here would allow unlocking topics based on conversation length or other metadata picked up in other topics. Here is an example of how I’ve implemented it on the reply level.

+ * my name is <name>
- {^hasName(false)} ^save(name,<cap1>) Nice to meet you, <cap1>.
- {^hasName(true)} I know, you already told me your name.
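
To make the filtering idea concrete, here is a rough Python sketch of reply-level filtering with a truth test. It is not the actual RiveScript/SuperScript extension. The `user` dict, `has_name`, `REPLIES` and `reply_to_name` are hypothetical names, and the `save` flag stands in for `^save(...)`.

```python
import random

def has_name(expected):
    """Build a test that passes when 'the name is already known' equals `expected`."""
    return lambda user: ("name" in user) == expected

# Each reply: (truth test, template, whether to save the captured name afterwards).
REPLIES = [
    (has_name(False), "Nice to meet you, {name}.", True),
    (has_name(True), "I know, you already told me your name.", False),
]

def reply_to_name(user, captured_name, replies=REPLIES, rng=random):
    """Keep only replies whose test passes, then pick one of the survivors at random."""
    candidates = [r for r in replies if r[0](user)]
    if not candidates:
        return None
    _, template, save = rng.choice(candidates)
    if save:
        user["name"] = captured_name
    return template.format(name=captured_name)

user = {}
print(reply_to_name(user, "Rob"))  # first time: the name is not known yet
print(reply_to_name(user, "Rob"))  # second time: the name was saved by the first reply
```

The same predicate-then-choose step could be applied one level up to filter whole topics.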

I get the ontology: it is used heavily for topics, triples of facts, and concepts extending WordNet. Very cool.

 

 
  [ # 4 ]

“most matching topics” means if there are multiple keywords in a sentence from a particular topic, it will be preferred over a topic with fewer matches in a sentence (usually only 1 keyword will match).
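
A small Python sketch of that preference, under two assumptions that are mine and not from the engine: topics are plain keyword sets, and "longest match" is approximated by the character length of the longest matched keyword, used only as a tie-breaker.

```python
def rank_topics(sentence, topic_keywords):
    """Order topics by number of keyword hits, then by longest matched keyword."""
    words = sentence.lower().split()
    ranked = []
    for topic, keywords in topic_keywords.items():
        hits = [w for w in words if w in keywords]
        if hits:
            ranked.append((len(hits), max(len(w) for w in hits), topic))
    # Most hits first, longest hit next, topic name last to keep the order stable.
    ranked.sort(key=lambda entry: (-entry[0], -entry[1], entry[2]))
    return [topic for _, _, topic in ranked]

# Hypothetical keyword sets for two topics.
KEYWORDS = {
    "~food": {"bread", "apple", "beef"},
    "~sports": {"parachute", "glide"},
}

print(rank_topics("i ate bread and beef under a parachute", KEYWORDS))
```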

Having a wildcard match at the end of responders in a topic is extremely dangerous, since it will hijack most anything. I actually divide my topics into fragments, where the 1st fragment of responders is very specific, and when it fails to match, the system moves on to try other specific fragments. A different fragment covers more general responders. This is done by this structure:
?: ( specific pattern 1)
?: (specific pattern 2)
u: (*) ^fail(topic)
?: GENERAL (general pattern 1)

and then the control script for the 2nd pass gathers the set of topics that match keywords again, and executes a labelled respond, which starts responder execution at the label.
One can already have state for topics.  You can create facts and variables. So, for example, if the user says “this is boring” the system will create a variable whose name derives from the topic, and stores the volley count of the current volley + a distant future, like 500.  Then when the system is looking to gambit in a matching topic, it first checks the lock variable for it and if it has one and it has not expired, it will ignore the topic for now.  This doesn’t block responders, just gambitting.  And gambits, like responders, can have patterns, so you can write
t: ($hasname) I know your name
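
The lock described above boils down to storing an expiry volley per topic. A minimal Python sketch follows, assuming a simple dict in place of ChatScript variables. `TopicLocks`, its method names and `LOCK_VOLLEYS` are hypothetical, and 500 is just the example figure from the post.

```python
LOCK_VOLLEYS = 500  # "a distant future", taken from the example figure above

class TopicLocks:
    """Tracks which topics should not be gambitted from for a while."""

    def __init__(self):
        self.expires = {}  # topic name -> volley number at which the lock expires

    def lock(self, topic, current_volley, duration=LOCK_VOLLEYS):
        """Called when the user signals boredom with a topic."""
        self.expires[topic] = current_volley + duration

    def can_gambit(self, topic, current_volley):
        """Gambits are allowed if there is no lock or the lock has expired."""
        expiry = self.expires.get(topic)
        return expiry is None or current_volley >= expiry

locks = TopicLocks()
locks.lock("~sports", current_volley=10)    # user said "this is boring" at volley 10
print(locks.can_gambit("~sports", 11))      # lock still active
print(locks.can_gambit("~sports", 510))     # lock has expired
```

Responders would not consult `can_gambit` at all, which matches the point that the lock blocks only gambitting.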

 

 
  [ # 5 ]

So in ChatScript, are all possible topics that a chatbot could pick up on, predetermined by the creator? So in practice one would likely have included food, work, hobbies, etc, but if a user would want to discuss the finer points of quantum physics and the creator did not foresee this, it would not be recognised as a topic of conversation?
Quite interesting.

 

 
  [ # 6 ]

Sort of by definition, ChatScript is “script” that you write. So you define the topics. One can have CS go out and do a Google search to do a responder, but that’s a one-off, not a topic.  There is no real intelligence here. Nor self creation. It’s just programming.

 

 
  [ # 7 ]

Hmhm. I see. :) I asked because I don’t usually consider manual programming of contents, but this seems very reliable in its method. I’ve been doing things the automated way, more akin to document summarization techniques, in which one only counts and assumes that the most frequent noun in recent conversation is the main topic, e.g. if one mentions the word “armadillo(s)” a lot. (This has issues when suddenly switching topics, though.)
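
A toy Python version of that counting approach, for comparison. It assumes a fixed `KNOWN_NOUNS` set in place of a real part-of-speech tagger and a crude plural rule. The sliding window is one possible way to soften the topic-switch problem, not something claimed above.

```python
from collections import Counter, deque

KNOWN_NOUNS = {"armadillo", "parachute", "bread"}  # stand-in for a real POS tagger

def normalize(word):
    """Lowercase, strip punctuation, and undo a simple plural 's' for known nouns."""
    word = word.lower().strip(".,!?\"'()")
    return word[:-1] if word.endswith("s") and word[:-1] in KNOWN_NOUNS else word

class TopicTracker:
    """Guesses the main topic as the most frequent noun in the last few utterances."""

    def __init__(self, window=5):
        self.recent = deque(maxlen=window)  # older utterances drop out automatically

    def add(self, utterance):
        nouns = [n for n in map(normalize, utterance.split()) if n in KNOWN_NOUNS]
        self.recent.append(nouns)

    def main_topic(self):
        counts = Counter(n for nouns in self.recent for n in nouns)
        return counts.most_common(1)[0][0] if counts else None

tracker = TopicTracker(window=2)
tracker.add("Armadillos are odd.")
tracker.add("I saw an armadillo eat bread")
print(tracker.main_topic())
```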

Recently someone on an NLP subforum asked about the following problem, and I’ve been thinking of ways to do this through ontology knowledge that are, in principle, similar to ChatScript’s topic matching method. The amount of ontology searches required would however be excessive in comparison, or at least if one doesn’t have a hierarchical ontology.

For example, from a cluster “fish, bread, apple, beef” I want to infer “food” as a summarization of the cluster.
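
With a hierarchical ontology this reduces to finding the nearest shared ancestor. Below is a minimal Python sketch over a hand-made parent table. The `PARENT` entries are invented for the example and are not taken from WordNet or any real ontology.

```python
# Hypothetical single-parent hierarchy: child -> parent.
PARENT = {
    "fish": "seafood", "seafood": "food",
    "beef": "meat", "meat": "food",
    "bread": "baked_goods", "baked_goods": "food",
    "apple": "fruit", "fruit": "food",
    "food": "substance",
    "hammer": "tool", "tool": "artifact",
}

def ancestors(word):
    """Walk up the hierarchy and return the ancestors from nearest to farthest."""
    chain = []
    while word in PARENT:
        word = PARENT[word]
        chain.append(word)
    return chain

def summarize(cluster):
    """Return the nearest ancestor shared by every word in the cluster, or None."""
    chains = [ancestors(w) for w in cluster]
    if not chains:
        return None
    shared = set(chains[0]).intersection(*chains[1:])
    # The first shared entry in any one chain is the most specific common ancestor.
    for candidate in chains[0]:
        if candidate in shared:
            return candidate
    return None

print(summarize(["fish", "bread", "apple", "beef"]))
```

Each word costs one walk up the tree, so the number of lookups grows with hierarchy depth rather than ontology size. Without a hierarchy, every concept would have to be checked against every word.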

 

 
  [ # 8 ]

Certainly that would hit highest on
topic: ~food (nutrient~1 ...) 
where it would match all 3 words for a high score.

 

 