Narrowing of Loebner Prize
 
 

I have read the results of the Loebner Prize examinations between the judges and the terminals, and I noticed that the questions and answers so far all revolved around the issue of health.
Am I to assume that the examination of a specific type of chatbot may indeed revolve around a single topic?
This narrows the Turing test in scope a bit, doesn’t it?
And Congratulations to Robby for having won one of the prizes.
Peetee le Trickfox.

 

 
  [ # 1 ]

(As your enquiry is about the Loebner Prize, I’ve split the thread.)

As far as I know, the Loebner Prize is still focused on general conversations, covering any topic.

 

 
  [ # 2 ]

There is no restriction on conversations. I don’t think health has been the only topic discussed.

 

 
  [ # 3 ]

Thank you, Raymon! With the more recent contests, where the bot and human conversations are viewed side by side, there is very little mystery left as to which is which. In a world where all we seem to have are chatterbot competitors, there is, to me, no fun in the contest. In fact, it almost seems pointless. Perhaps Hugh should screen entries until he finds one suitable for side-by-side comparison before holding the contest again. Otherwise, it really seems like a waste of time to me.

 

 
  [ # 4 ]

1.  I made a pledge to the A.I. community that the contest would be an annual event, and I don’t go back on my word.
2.  Waiting until I get a “good” entry is futile; no one would submit anything.
3.  The object of the Loebner Prize is not to have fun, it’s to provide a venue wherein anyone developing an intelligent agent can submit an entry.

 

 
  [ # 5 ]

I am a proponent of the contest, don’t get me wrong. I was encouraged by Huma Shah’s recent remarks about the Italian students who miscategorized bots as confederates and vice versa. It really should be fun, and to me, the mistakes made by judges are the most fun.

It is an annual event that I look forward to, whether involved with it or not. I like to hear what the latest news is.

Robby.

 

 
  [ # 6 ]

I think the Loebner Prize was and still is a great idea.  Since I was 14 years old I have dedicated my life to creating an intelligent agent that can converse in natural language.  The idea has always fascinated me, and I know it always will.  I work on my project out of pure interest in the subject matter.  But when I found out about the Loebner contest, that just ‘AMPED UP’ my efforts that much more.  The contest does a lot to keep efforts going in A.I. research.

 

 
  [ # 7 ]

Hello

I was just wondering why the contest doesn’t have different flavours: not only fooling a judge, but also having a nice conversation on any open theme.

Why not introduce specific topics or themes, like getting a product selection when the user has a specific need, or getting help or advice about some real thing, product or service, rather than just idle chatter?

I say this because I think a real human-like AI able to fool a human is still far away in time, but useful chatterbots will help in many industry sectors and markets. And the prize may help steer the industry in a useful direction.

I am sure that the goal of a conversation, and the quality of the bot in handling the whole conversation mechanics, are more important for any company than fooling the user into thinking he is talking to a human. The user should know it is a bot and even so feel comfortable with it. This may be a useful hint.

Also, I have seen and tested even the most daring and prize-winning bots, and they disappointed me a lot. Their conversation had no purpose other than to fool me and steer my attention towards whatever the bot has more answers for. The conversations were all very shallow and of a question-answer-fact type: no elaboration, no memory other than a few words or topics, repeated pre-built phrases, no ability to follow a conversation thread, no… nothing human at all. I tested many of them, even the ‘commercial-grade’ ones like the one made by VirtuOz for Star Alliance, and that one was very good, but it had no capability to understand anything beyond a few keywords and pre-built phrases; I tested it a lot, and it kept answering the same nonsense to the same queries.

I would like to see bots with language generation capability and the inventiveness to handle situations in the conversation, like a long silence, too much nonsense as input, repeated questions, insult handling, or detection of silly answers and of stupid or grotesque words; a bot that could read like a human and see through text like “hhhhheeeeellloooooo, how are youuuuuu!!”, or detect that “jnsdjkbcgsyughdks hfjd kkihd” are no words at all!
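For illustration, here is a minimal Python sketch of that kind of input cleanup: collapsing exaggerated letter repetition and flagging keyboard-mash gibberish. The tiny word list and the consonant-run heuristic are placeholders for the example only, not a real lexicon or a production detector.

import itertools
import re

KNOWN_WORDS = {"hello", "how", "are", "you", "hi"}   # stand-in for a real dictionary

def candidates(token):
    """Yield spellings where each stretched letter run is shortened to length 1 or 2."""
    runs = [(ch, len(list(grp))) for ch, grp in itertools.groupby(token.lower())]
    options = [[ch * n for n in ((1, 2) if count > 1 else (1,))] for ch, count in runs]
    for combo in itertools.product(*options):
        yield "".join(combo)

def recover_word(token):
    """Return a known word hidden behind exaggerated repetition, else None."""
    for cand in candidates(token):
        if cand in KNOWN_WORDS:
            return cand
    return None

def looks_like_gibberish(token):
    """Crude heuristic: no vowels at all, or five consonants in a row."""
    t = token.lower()
    return not re.search(r"[aeiou]", t) or bool(re.search(r"[^aeiou\W\d_]{5}", t))

print(recover_word("hhhhheeeeellloooooo"))        # -> hello
print(recover_word("youuuuuu"))                   # -> you
print(looks_like_gibberish("jnsdjkbcgsyughdks"))  # -> True
print(looks_like_gibberish("hello"))              # -> False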

Also, let me be blunt: English is a piece of cake for pattern detection. Why not try a highly inflected language like Spanish, Polish, French, Italian or Portuguese?

Also, in my own experience AIML is not (at least not simply) capable of handling Spanish in an efficient way; I could not make it work decently at all, many years ago.

I am personally building all the pieces of complex machinery needed to handle even a small part of inflected Spanish, and it is hellishly complex. There is a real need for brilliant programmers to make a chatterbot speak Spanish, not just sit behind a stemmer like Snowball to simplify things (like some of the ones I’ve tested).

Some of the challenges I’ve faced:

- In English you can get a deep and rich ontology like WordNet, for free smile

- In Spanish there is no such thing! :(

In Spanish there are more verb and verb-phrase combinations than there are phrasal verbs in all of England and the US.
There are scientific papers stating the number of inflections needed to understand Spanish, and the number is simply huge: over nine zeros and counting!
There are also a great many prefixes and suffixes used in Spanish on nouns and adjectives. I collected about 900 common prefixes and over 800 suffixes, many of which can also be infixed, so the raw combinations come to roughly a million per word. (To be exact, not every word admits every prefix and suffix, but at least a third of them carry a plausible semantic meaning for building parasynthetic words, so we may have more than 10k variations for each noun or adjective; a typical Spanish dictionary has about 40k nouns, so 100M noun variations may be a good estimate.) For verbs we have 70 inflections (tense, person, number, mood, gender), and for verbs the prefix list may be fully applicable, because prefixes normally modify the semantic meaning of the verb; so with 900×70 forms per verb and about 30k verbs, we end up with some 300M verbal forms. Adverbs are about the same as verbs, and you can derive adjectives from some verb inflections (derivations), so you have a huge and wild bunch of word forms out there to deal with.

This may also be a challenging thing to deal with and to do something useful with.
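To make the scale concrete, a quick back-of-the-envelope Python sketch using the figures from this post. Every number here (900 prefixes, 800 suffixes, ~10k usable variants per noun, 40k nouns, 30k verbs, 70 inflections) is the post's own rough estimate, not measured data, and the post discounts the raw products (to roughly 100M noun forms and 300M verb forms) because not every affix applies to every word.

prefixes, suffixes = 900, 800
affix_combos = prefixes * suffixes                 # ~720,000 raw prefix+suffix pairs per word

nouns = 40_000
usable_variants_per_noun = 10_000                  # post's estimate after dropping nonsense combos
raw_noun_forms = nouns * usable_variants_per_noun  # 4e8 upper bound; the post rounds down to ~1e8

verbs, inflections = 30_000, 70
plain_verb_forms = verbs * inflections             # 2.1 million conjugated forms, no prefixes
prefixed_verb_forms = plain_verb_forms * prefixes  # ~1.9e9 if every prefix applied to every verb

print(f"{affix_combos=:,}")
print(f"{raw_noun_forms=:,}")
print(f"{plain_verb_forms=:,} {prefixed_verb_forms=:,}")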

Even spelling corrections (mistypings) should be handled by a bot, because humans do this very well indeed!

I saw no bot capable of dealing with more than a one-letter mistake! (All of the bots missed the keyword when the mistake arose right there!)

Even more: why can’t a bot manage word-similarity parameters, like humans do, or even Google with its huge, statistically rich source of human input, producing an almost astonishing “Did you mean xxxxx?”
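For illustration, a minimal “did you mean …?” sketch in Python using only the standard library. The keyword list is a made-up placeholder; a real bot would match against its own pattern vocabulary.

import difflib

VOCABULARY = ["precio", "producto", "servicio", "ayuda", "horario"]  # assumed keywords

def did_you_mean(token, vocab=VOCABULARY, cutoff=0.7):
    """Return the closest known keyword, tolerating one or two typos, else None."""
    matches = difflib.get_close_matches(token.lower(), vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(did_you_mean("prodcuto"))   # -> producto  (transposed letters)
print(did_you_mean("servizio"))   # -> servicio  (one-letter substitution)
print(did_you_mean("xyz"))        # -> None      (nothing close enough)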

Dr Loebner:

Those are questions and things that could be added to enhance a challenging contest like the Loebner Prize, and the outcome should be several scores, like:
- best spell-checker and bad-writing interpreter
- best turn-taking bot (happiest human experience)
- best ‘intelligent’ bot at understanding and elegantly defeating silly inputs
- best and smartest goal-following bot
- best helpful hinting bot (for product hints or advice)
- best service technician/advisor bot (for technical support)
- best ‘seller’ bot (for a smart-selling, offer-making bot)
- best humorous bot, for a bot that could take any input and create real jokes (not pre-written)
etc.

And each of those categories should be available in any language.
I believe that at first there will be no participants in many languages, but over time there will be statistics on how much has been done in each language, and this will give an impulse to other bot engines in other languages.

This is just a proposal to enhance the challenge.

best regards

Andres H

 

 
  [ # 8 ]

In Spanish there is no such thing! :(

Spanish WordNet

I was just wondering why the contest doesn’t have different flavours: not only fooling a judge, but also having a nice conversation on any open theme.

Why not introduce specific topics or themes, like getting a product selection when the user has a specific need, or getting help or advice about some real thing, product or service, rather than just idle chatter?

I say this because I think a real human-like AI able to fool a human is still far away in time, but useful chatterbots will help in many industry sectors and markets. And the prize may help steer the industry in a useful direction.

Check out the CBC contest.


Why not enter your bot next year? Then you can show us how to do the stuff you find missing in other bots.

 

 
  [ # 9 ]
AndyHo - May 2, 2011:

Also, I have seen and tested even the most daring and prize-winning bots, and they disappointed me a lot. Their conversation had no purpose other than to fool me and steer my attention towards whatever the bot has more answers for. The conversations were all very shallow and of a question-answer-fact type: no elaboration, no memory other than a few words or topics, repeated pre-built phrases, no ability to follow a conversation thread, no… nothing human at all.

Perhaps you could put up an example of your own bot for us to test?

 

 
  [ # 10 ]

Jan & Steve

Thanks for the reply. I have to say that I may not have checked all of the bots out there (there are more than 800), but many of the ones I tested are based on the same AIML engine. I always test them with some typical phrases and get the same ‘typical’ results here and there, so until someone points out to me that a particular bot is different, I’m not going to test them all in my spare time.

Yesterday I tested many of them in Spanish; they answered almost the same thing every time: Ariel, Claudia, Mr Testis, Marina, Diego, etc. (all based on the same twisted AIML engine called botGenes200..).

Then I tested Mitzuky (English), and she was more challenging, though only capable of short emotional dialogue, like “ejem”, “ok”, “hmm”, “I see”, etc. She followed some threads for 3-4 turns with something I had said (and threw it back in my face aggressively thereafter), but when it comes to repeating yourself, the answers are just hard-coded on repeat detection; none of them answered a different thing while I kept typing a wrong keyword. Only one of the Spanish bots (Diego) asked me something (always the same thing) when I mistyped an accent mark, a very frequent thing in Spanish, where there are also controversies about whether or not to apply the written accents (diacritic marks) promoted by the ever-changing rules of the Spanish Royal Academy (DRAE).

Also, I didn’t say my bot is better; I never wrote a whole bot. Instead I designed a language to express such things, based on simple pattern matching for backward compatibility (sorry, AIML IS indeed a good idea, but it lacks advanced specs and flexibility), tied together with a runtime engine, basically in Spanish, though I am adding some English capabilities too.

By making these criticisms, I’ve pointed out the things (I think) we should care about the most, and what the common shortcomings are across all AIML bot technology (even the twisted variants). I stated that the current model is weak at dealing with those things, especially goals, rich themes and reasoning, and that we should seek more extensive conversational standards to overcome those limitations. I saw that AIML has recently gained a ‘category’ flag, something that is actually the central beam of my own reasoning. Also, you need access to a linguistic infrastructure to deal with anaphoric memory, factual memory, etc.

Those are the things I am working on.

For example, I recently built an automatic DAML shallow tagger (have you heard of any other? Please let me know), so I can tag a phrase to see if it’s emotional, sad, happy, a question, an answer or statement, nonsense, or a mixture of these; there are 42 flags to output. Initially it works for Spanish, where you see a lot of interjections, too many of them, and also lots of ambiguities, so you have to do things right to get something useful out of it. I am making this as good as I can. This tagger will process the user’s input/responses to decide whether to fetch an answer, or to analyze the input as a response, a hollow word, or an emotional word, like “follow-on”, “ok”, “yeah”, etc.; this will make the difference in conversational turns. Then I will train an AI algorithm to take the ‘best’ decision at every next step.
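For illustration, a toy rule-based Python sketch of this kind of shallow phrase tagging. The flag names and cue lists below are invented for the example; they are not the 42-flag scheme described in the post, and a trained classifier over annotated conversations would replace the keyword rules.

import re

# Invented cue lists for the example only; a real tagger would use a much richer lexicon.
EMOTION_CUES = {"jaja", "jeje", "uff", "ay", "genial", "triste"}
AGREEMENT_CUES = {"ok", "si", "sí", "claro", "dale", "ya", "bueno"}
QUESTION_STARTERS = {"qué", "cómo", "cuándo", "dónde", "quién", "cuál"}

def tag_phrase(text):
    """Return a set of coarse dialog-act flags for one user turn."""
    flags = set()
    t = text.strip().lower()
    tokens = re.findall(r"[a-záéíóúüñ]+", t)

    # No recognisable words, or no vowels at all -> probably keyboard mash.
    if not tokens or all(not re.search(r"[aeiouáéíóú]", w) for w in tokens):
        flags.add("nonsense")
    if "?" in t or "¿" in t or (tokens and tokens[0] in QUESTION_STARTERS):
        flags.add("question")
    if any(w in EMOTION_CUES for w in tokens):
        flags.add("emotional")
    if tokens and all(w in AGREEMENT_CUES for w in tokens):
        flags.add("agreement")          # short "ok"/"sí"-style backchannel turns
    if not flags:
        flags.add("statement")
    return flags

print(tag_phrase("¿Cómo estás?"))   # {'question'}
print(tag_phrase("ok, dale"))       # {'agreement'}
print(tag_phrase("jsdkfhskdjf"))    # {'nonsense'}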
I am now looking for a DAML-tagged conversation corpus (in Spanish) to train/test my work.
If anyone hears of something, please let me know.

best regards.

Andres

 

 
  [ # 11 ]

Hmm… this to me is like criticising the International Space Station because it can’t go to Mars. It is easy to find fault with something instead of creating something better.

I thought on another thread you said you had a bot which had over 8 million people talking to it?

 

 
  [ # 12 ]
Steve Worswick - May 3, 2011:

Hmm… this to me is like criticising the International Space Station because it can’t go to Mars. It is easy to find fault with something instead of creating something better.

I did not see a contest on how to put a space station into Earth or Mars orbit, but I did indeed see many contests (ACM, etc.) on how to disambiguate text, how to answer queries, how to.. too many NLP things that most of the bots should actually do.
So I still think that specificity and good measuring criteria are the key to a quality contest, and that making a funny, human-fooling chatterbot is not comparable to launching a shuttle to Mars!! wink And indeed I am creating something better; I’ll let you all know ASAP.

Steve Worswick - May 3, 2011:

I thought on another thread you said you had a bot which had over 8 million people talking to it?

Yes, it’s true, but I probably said it had 8 million answers/talks/turns over a few months, not 8 million unique people as guests. Each conversation is maybe 10-20 turns, so the number of ‘unique’ people is 10-20 times less; we have no exact measure because sessions were not counted individually per user. The bot ran for 8 months on my engine; after that they migrated it to another engine due to a commercial disagreement. These bots (there were 3 of them) were built on my preliminary bot engine about 2 years ago, and they did not use even 1% of its current power, which simply didn’t exist then.
The traffic on the first bot was so high (the peak rate was about 10 to 15 hits/turns per second!) that the server was overloaded many times, needing intervention due to memory limitations (sadly, no provision was made for more than 500 open sessions at one time).
The other bot, put online on 14 December 2009, was called Yorugua; he was also an MSN agent. He actually met 19k ‘unique’ people in about one month, and each chat session had a mean of 15.5 turns. The traffic was also rather high, but did not match the first ‘cook’ one.
There was also another chatbot, called “Somos Messenger Siempre”, built for Microsoft to teach people how to use MSN on their cellphones; it was put online in January 2010.
The actual unique user count was not recorded because it was used over text messaging (SMS) through an external interface, and the actual measurements were kept by the phone-carrier people, but there were far fewer ‘hits’ compared with the other two bots.

Actually, I am looking for a local team (with good Spanish knowledge) to help me write several ‘demonstration’ bot packages to extensively test and debug the engine’s different features; that’s the challenge.

Then I will probably put it online and you can test it. But to talk to it, people should know Spanish very well, because its main goal is to speak Spanish with all the slang variants and to overcome the many specific spelling problems. It will mainly be targeted at the Latin-American market. This challenge may be a lot more difficult than tackling English, though never reaching the skills and resources needed to put a space station into Mars orbit. wink

best regards.
Andrés

 

 