Thunder
I think the topic of the questions is utterly irrelevant. I don’t care if a bot only knows about ‘adult topics’, or only about electronics, Greek mythology, or just, hey, keeping me company with some casual chat.
But if I read you correctly, I agree: unless your project employs half a million people entering data from every conceivable human endeavour, it is a bit pointless to test the bot on too wide a range of topics.
I’d rather have a bot that has powerful language skills and can learn and acquire more knowledge, rich knowledge, via natural language interaction, but starts out knowing nothing, or very little, than a bot that has quick answers to millions of things but doesn’t understand any of them well enough to be questioned further on its responses.
In other words, it is much more valuable to initially not know much but know how to learn, than to ‘fake’ knowing many things, really understand none of them, and be unable to learn because the language skills aren’t there.
Thus I tend to agree - I’m not going to worry, initially, about my bot knowing the purpose of a hammer. I’m going to worry more about it being able to learn, using core language skills, that a hammer is used for pounding in nails.
If they want to judge a bot by such a variety of things, then forget about the contest; we already have a winner: WATSON. Watson is a huge oracle of knowledge. BUT, it’s not a chatbot. You ask it a question, and it gives you the closest match. There is no variable number of levels of indirection of language-based reasoning and skill involved. It can’t be given natural language statement S1, then deduce S2 and S3, then use S3 to respond to your next question. Instead, it is just an “input-output” pair.
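To make the S1/S2/S3 point concrete, here is a tiny sketch (my own toy example, nothing to do with how Watson actually works) of a bot being told one fact and chaining through stored if-then rules before answering a later question:

    # Toy illustration of multi-step deduction: the bot is told S1,
    # derives S2 and S3 from stored if-then rules, and answers from S3.
    facts = {"john owns a trout rod"}                            # S1, told in natural language
    rules = [
        ("john owns a trout rod", "john fishes for trout"),      # S1 -> S2
        ("john fishes for trout", "john spends time outdoors"),  # S2 -> S3
    ]

    changed = True
    while changed:                       # simple forward chaining
        changed = False
        for premise, conclusion in rules:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    # Later question: "Does John spend time outdoors?"
    print("john spends time outdoors" in facts)   # True, via S1 -> S2 -> S3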
If I was entering the contest, here is how I would judge the questions…
My name is Bill. What is your name?
Fairly good input. Your bot has to know how to segment this to realize it is a statement followed by a question. Needs to be able to store the user’s name and provide its own name.
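For what it’s worth, here is a minimal sketch of that segment-store-respond step; the patterns and the bot name are placeholders of mine, not anyone’s real system:

    import re

    BOT_NAME = "Grace"    # assumption: using my own bot's name as a stand-in
    memory = {}

    def respond(utterance):
        # Split into sentences so the statement and the question are handled separately.
        for sentence in re.split(r'(?<=[.?!])\s+', utterance.strip()):
            m = re.match(r'My name is (\w+)\.?$', sentence, re.IGNORECASE)
            if m:
                memory["user_name"] = m.group(1)      # store the user's name
            elif re.match(r'What is your name\?$', sentence, re.IGNORECASE):
                return "My name is %s." % BOT_NAME

    print(respond("My name is Bill. What is your name?"))   # My name is Grace.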
How many letters are there in the name Bill?
Cute - I guess a bot could know this, but it’s very low on the importance scale. Why on Earth would someone want to ask that? It shows some language skill, yes, but it’s rather stupid - there are much, much more interesting and difficult tests that could replace this.
How many letters are there in my name?
Not bad, though still a bit stupid. However, it does show the functionality of the bot realizing it has to first dereference “my name”, then go back, rewrite the question and resubmit it to itself. So not bad, but the “how many letters” part is stupid - who would ask that? A bot could know it, but I suggest it be a super-low priority on anyone’s bot project.
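The dereference-and-resubmit step is the interesting part; a rough sketch of what I mean (toy patterns, and I’m assuming the user’s name was stored earlier as above):

    import re

    def answer(question, memory):
        # Dereference "my name" to the stored value, then resubmit the rewritten question.
        if "my name" in question.lower() and "user_name" in memory:
            rewritten = question.lower().replace("my name", "the name " + memory["user_name"])
            return answer(rewritten, memory)
        m = re.search(r'how many letters are there in the name (\w+)', question.lower())
        if m:
            return "%d letters." % len(m.group(1))
        return "I don't know."

    print(answer("How many letters are there in my name?", {"user_name": "Bill"}))   # 4 letters.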
Which is larger, an apple or a watermelon?
Meh… pretty pointless, and it doesn’t test any advanced language skill. Perhaps just for fun, or if the bot is to be used by a child, who may think it’s fun.
How much is 3 + 2?
How much is three plus two?
Give me a break. Build a mathematical expression parser to do some pre-processing, plug in the values, then continue executing your more important NLP… yeah, sure, but it’s a waste of time. This should be a super-low priority on any bot project.
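If someone insisted on it, the pre-processing pass really is only a few lines; a quick sketch of the idea (my own word-to-symbol table, obviously incomplete):

    import re

    WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
             "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
             "plus": "+", "minus": "-", "times": "*"}

    def eval_arithmetic(text):
        # Replace number/operator words with symbols, then evaluate if what's left is pure arithmetic.
        text = text.lower().rstrip("?").replace("how much is", "")
        expr = " ".join(WORDS.get(tok, tok) for tok in text.split())
        if re.fullmatch(r'[\d+\-* ]+', expr):
            return str(eval(expr))     # fine for a toy; a real bot would use a proper parser
        return None

    print(eval_arithmetic("How much is 3 + 2?"))           # 5
    print(eval_arithmetic("How much is three plus two?"))  # 5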
What is my name?
Simple effective memory recall test. Good test.
If John is taller than Mary, who is the shorter?
Should read “who is the shortest?”… Reasonable language test and relationship determination. Good test.
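The relationship part could be as simple as knowing that ‘taller’ and ‘shorter’ are opposites; a toy sketch of that step (the structure and names are just for illustration):

    # "John is taller than Mary" + "who is the shorter?" -> Mary
    OPPOSITES = {"taller": "shorter", "shorter": "taller"}

    def answer_comparison(subject, relation, obj, asked_relation):
        # If the asked relation matches the stated one, the subject wins;
        # if it is the opposite relation, the object does.
        if asked_relation == relation:
            return subject
        if asked_relation == OPPOSITES.get(relation):
            return obj
        return None

    print(answer_comparison("John", "taller", "Mary", "shorter"))   # Mary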
If it were 3:15 AM now, what time would it be in 60 minutes?
Very good test. This is getting advanced: the bot has to realize the input is an if-then hypothetical and work out the answer on its own. Very good.
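A sketch of the hypothetical-time arithmetic, just to show that the arithmetic itself is the easy part once the language is understood (the time formats here are assumptions of mine):

    from datetime import datetime, timedelta

    def time_after(time_text, minutes):
        # Parse "3:15 AM", add the offset, and format the result the same way.
        t = datetime.strptime(time_text, "%I:%M %p")
        later = t + timedelta(minutes=minutes)
        return later.strftime("%I:%M %p").lstrip("0")

    # "If it were 3:15 AM now, what time would it be in 60 minutes?"
    print(time_after("3:15 AM", 60))    # 4:15 AM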
My friend John likes to fish for trout. What does John like to fish for?
Very good test. Basic language skill. Notice the bot must handle the verb conjugation ‘like’ versus ‘likes’.
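A minimal sketch of storing the fact with the verb in its base form and then answering the “does John like” phrasing (toy patterns of mine, not a real parser):

    import re

    facts = {}

    def tell(statement):
        # "My friend John likes to fish for trout." -> facts["John"]["like"] = "to fish for trout"
        m = re.match(r'My friend (\w+) (\w+?)s (to .+?)\.?$', statement)
        if m:
            name, verb, rest = m.groups()
            facts.setdefault(name, {})[verb] = rest    # verb stored in its base form ("like")

    def ask(question):
        # "What does John like to fish for?" -> recall using the base-form verb.
        m = re.match(r'What does (\w+) (\w+)', question)
        if m and facts.get(m.group(1), {}).get(m.group(2)):
            return "%s %ss %s." % (m.group(1), m.group(2), facts[m.group(1)][m.group(2)])
        return "I don't know."

    tell("My friend John likes to fish for trout.")
    print(ask("What does John like to fish for?"))   # John likes to fish for trout.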
What number comes after seventeen?
Stupid.
What is the name of my friend who fishes for trout?
Very good. The bot first has to realize it must resolve “who fishes for trout” to an actual person’s name, plug that in to rewrite the sentence, and resubmit it to itself. Steve’s bot got this, but as I said, it will take a bit more work for it to NOT be thrown off by stuff like “My friend Bob does NOT like to fish for trout”. It should know that it cannot say Bob. Also, even better would be:
My friend Sam loves to go to ACDC concerts.
...
...
Who likes to attend ACDC concerts?
(also try with “does” and “doesn’t” to fool it).
This would still involve the same language skills as the original, but the bot would also have to know when to consider two words as synonyms (‘loves’ vs ‘likes’, and ‘attend’ versus ‘go to’). I’ve done a lot of work and made some great progress with this type of thing in Grace.
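Here is a rough sketch of what I mean, putting the synonym normalization, the negation check, and the name resolution together; the synonym table and the patterns are made up for illustration and are nowhere near what Grace actually does:

    import re

    SYNONYMS = {"loves": "likes", "go to": "attend", "does like": "likes",
                "doesn't like": "dislikes", "does not like": "dislikes"}

    facts = []    # list of (name, verb, thing)

    def normalize(text):
        # Map synonym phrases and negated forms onto one canonical verb.
        text = text.lower()
        for phrase, canonical in SYNONYMS.items():
            text = text.replace(phrase, canonical)
        return text

    def tell(statement):
        m = re.match(r'my friend (\w+) (likes|dislikes) (.+?)\.?$', normalize(statement))
        if m:
            facts.append((m.group(1).capitalize(), m.group(2), m.group(3)))

    def who(question):
        m = re.match(r'who (likes|dislikes) (.+?)\?$', normalize(question))
        if m:
            names = [n for n, verb, thing in facts
                     if verb == m.group(1) and thing == m.group(2)]
            return ", ".join(names) or "Nobody that I know of."
        return "I don't know."

    tell("My friend Sam loves to go to ACDC concerts.")
    tell("My friend Bob does NOT like to go to ACDC concerts.")
    print(who("Who likes to attend ACDC concerts?"))   # Sam (and it knows not to say Bob)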
What would I use to put a nail into a wall?
Uh… not a bad idea, I guess. Like you say, Thunder, it is a weird thing to ask. On the other hand, I guess the bot should start off with some basic knowledge of the world, and a hammer is a pretty common basic tool. Not very important though.
What is the 3rd letter in the alphabet?
Dumb.
What time is it now?
Sure.
What should be done is this: figure out what types of things you are testing about the bot (is it world knowledge, is it complex language understanding skill, etc.), then, for each of those categories, come up with say 1-3 questions that target that functionality and weight them. Knowing what the 3rd letter of the alphabet is should be low priority; hammer usage, perhaps a bit higher, but still pretty low compared to being able to first resolve “who likes to fish for trout” and evaluate the input in a multi-stage fashion.
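To illustrate the weighting idea, a quick sketch; the categories, weights, and scores are made-up examples of mine, not anything official from the contest:

    # Rough sketch of weighted scoring across question categories.
    WEIGHTS = {
        "multi-stage language reasoning": 5,   # e.g. "what is the name of my friend who fishes for trout?"
        "memory recall":                  4,   # e.g. "what is my name?"
        "basic world knowledge":          2,   # e.g. the hammer question
        "trivia / letter counting":       1,   # e.g. "what is the 3rd letter of the alphabet?"
    }

    def weighted_score(results):
        # results maps each category to the fraction of its questions answered correctly (0.0 - 1.0).
        total = sum(WEIGHTS.values())
        return sum(WEIGHTS[cat] * results.get(cat, 0.0) for cat in WEIGHTS) / total

    print(weighted_score({
        "multi-stage language reasoning": 1.0,
        "memory recall": 1.0,
        "basic world knowledge": 0.0,
        "trivia / letter counting": 0.0,
    }))   # 0.75 -- a bot strong on reasoning still scores well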
By and large, I agree with Thunder: questions like “how many letters in the word ‘car’?” are pointless, because you could take that sort of thing to unbelievable levels.