Contest questions and a bi-weekly showcase
 
 

This is a continuation of part of the Loebner thread that might begin to get off topic, so I started a new thread. Brian makes a good point: contests tend to ask all the bots the same questions in order to establish a baseline, but is that really the best method? Izar is an Alien, Blydgesmith is a dragon, Skynet AI is a (maniacal? LOL) AI bent on world domination. Most bots have an area of specialization (like people) that they were created to represent. One thing I noticed from going over the Chatbot Battles transcripts was that Steve tended to converse with each bot using knowledge that the bot SHOULD be able to converse on, given what the bot was created to represent. I mentioned somewhere else that we have been toying with the idea of hosting a bi-weekly “not a contest” contest. I have been distilling ideas and this was definitely one of them. Some of the others are:

[1] Each entrant supplies 10 questions for each “match”. An application randomly picks one question from each botmaster, and these randomly selected questions become the statement\interrog set. So if there are 12 entrants, that match will have 12 statement\interrogs. This ensures that there can be no question as to how the statement\interrogs were arrived at, and it also means that each bot will more than likely “shine” on at least one question.

[2] Each bot will be expected to “freestyle” for five minutes. The “judge” will base the conversation on the bot’s persona. For instance, if the bot is an alien, it should be able to talk about its home planet rather than “who is the President of the United States”.

[3] Each of these two types of interaction will be judged in different categories rather than on a simple “best answer”.

For instance

Judge: What time is it?
Bot: It’s 12:30 AM

is a good example of a direct\logical response

Judge: What time is it?
Bot: Time for you to buy a watch

doesn’t answer the question but is still (after all these years) funny and might better represent a particular bot’s expected response.

and if you’re an alien

Judge: What time is it?
Bot: It’s ~ÓˆlXl‹¢¢åä}Ï. What? You don’t read Venusian?

Might actually be particularly brilliant.

Some possible categories are:
1 Best logic
2 Funniest
3 Most original
4 Best representing the bot’s persona
5 People’s choice

[4] Judges’ scores will be factored in with public opinion polls. For instance, a 50\50 split where the judges’ score represents half of the total and the public opinion polls represent the other half.
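
A rough sketch of how the 50\50 split in [4] might be computed. It assumes the judges' score for a category is already on a 0-100 scale and that the public poll is a simple vote count per bot; the function names and the scale are made up for illustration, not part of the actual site.

```python
def poll_share(votes: dict[str, int], bot: str) -> float:
    """Return a bot's share of the public vote as a percentage (0-100)."""
    total = sum(votes.values())
    if total == 0:
        return 0.0  # no votes cast yet
    return 100.0 * votes.get(bot, 0) / total


def combined_score(judge_score: float, poll_score: float,
                   judge_weight: float = 0.5) -> float:
    """Blend the judges' score with the poll score (both assumed 0-100).

    judge_weight=0.5 gives the 50\\50 split; the rest goes to the poll.
    """
    if not 0.0 <= judge_weight <= 1.0:
        raise ValueError("judge_weight must be between 0 and 1")
    return judge_weight * judge_score + (1.0 - judge_weight) * poll_score


if __name__ == "__main__":
    votes = {"RICH": 30, "Izar": 10}  # hypothetical poll results
    print(combined_score(80.0, poll_share(votes, "RICH")))
```

Keeping the weight as a parameter means the split could be tuned later (say 60\40) without touching the rest of the scoring.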

These are just some of the ideas that we have been throwing around. Whether or not it actually comes off will depend on how much of it I can automate (like the creation of the public polls etc…) through Facebook etc. Bi-weekly might turn out to be a bit overly ambitious, but I do think that part of what fuels the frustration that is sometimes expressed with contests is how infrequently they occur. That’s why I’m calling this a “not a contest” contest; it’s really more of a “showcase”: a way for anyone, regardless of their level of expertise or their commitment to reaching the “singularity”, to show off their creation without the pressure of a yearly wait before getting another shot. The more serious contests will always be there for that.
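
One piece that looks easy to automate is the random draw from idea [1]. A minimal sketch, assuming the submissions are held as a mapping from botmaster to their list of questions (that layout and the function name are assumptions for illustration only):

```python
import random


def draw_question_set(submissions, rng=None):
    """Pick one question at random from each botmaster's submitted list.

    submissions: dict mapping botmaster name -> list of question strings.
    Returns a list of (botmaster, question) pairs in shuffled order.
    """
    rng = rng or random.Random()
    drawn = []
    for botmaster, questions in submissions.items():
        if not questions:
            raise ValueError(f"{botmaster} submitted no questions")
        drawn.append((botmaster, rng.choice(questions)))
    # Shuffle so the asking order does not follow the sign-up order.
    rng.shuffle(drawn)
    return drawn


if __name__ == "__main__":
    # Hypothetical entrants, each with 10 placeholder questions.
    entries = {
        name: [f"{name} question {i}" for i in range(1, 11)]
        for name in ("Vince", "Peter", "Thunder")
    }
    for botmaster, question in draw_question_set(entries):
        print(botmaster, "->", question)
```

With 12 entrants this yields 12 statement\interrogs, one per botmaster. Passing in a seeded `random.Random` would make a draw repeatable if anyone wanted to audit it.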

the tentative site is here http://ai.r-i-software.com/bragging_rights

And I think that I mentioned that the “Giger inspired” alien\cyborg\girl that is staring in horror at the “mad scientist” is my oldest daughter (grin). (I’ve been dying to use that photo for a long time now LOL)

Please jump in here with ideas.

Vince

 

 
  [ # 1 ]

We need to be careful that the questions supplied by each entrant are reasonable, in that the other bots have a fighting chance of answering them. A devious botmaster could hardcode something like, “Take the initial letters of Train, Apple and Card, reverse them and tell me what noise that animal would make?” into his own bot and nobody else would have a chance.

 

 
  [ # 2 ]

LOL I think that was actually a question at one point, wasn’t it?
That’s true, and I thought about how to handle that type of situation. Since it’s being held fairly often, there’s more of an opportunity to identify someone who consistently submits “out of bounds” questions. We could try having an impartial panel to review complaints or something of that nature, and since only one out of ten questions submitted per person will be used, if someone is submitting ten out of ten that are “out of bounds” it should be fairly obvious. In the end I think that the solution is probably built into the “category” format. Since everyone (pretty much) has some type of default handler (a trigger that fires when all else fails), someone could submit that type of question with a hard-coded “direct” answer, win on “logic”, and still lose overall.

For instance, in the example you gave

Judge: “Take the initial letters of Train, Apple and Card, reverse them and tell me what noise that animal would make?”
Bot: Meow

would score high on logic
whereas a default

Judge: “Take the initial letters of Train, Apple and Card, reverse them and tell me what noise that animal would make?”
Bot: Wow! What do you do when you aren’t watching Star Trek, playing World of Warcraft, and thinking of questions like that?

might score higher on humor, originality, persona and might be up there in the people’s choice category.
(That’s pretty close to an actual RICH default, and it should be noted that RICH’s “anti geek” caustic wit is pretty much me poking fun at myself through my alter ego (wink).)

And if there are twenty entrants, even if there are a couple of botmasters that are really abusing the system, it would only represent 2 out of twenty questions (that they could still lose) and it wouldn’t help in the freestyle. All in all I don’t think it would help in the overall “king of the hill” standings.

Vince

 

 

 
  [ # 3 ]

Something else that just occurred to me:

[1] Since the bots will be asked to converse on their topic of choice fairly regularly, similar statement\interrogs should be able to produce different responses

[2] Maybe when the transcripts for a given round are published, a random topic is also published that the bot will be asked to freestyle on in the next round. Just to keep it interesting from round to round

Vince

 

 
  [ # 4 ]

Another way to handle the submitted questions would be that no points are awarded to the bot of the submitter of that question.

Or I remember the game Quiz As Quiz Can, where every group submits questions to be answered by the other groups. The submitting group gets more points for a question the later in the round the correct answer is given, but no points if no other group answers it correctly. (The order in which the other groups are asked is determined at random.)
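
A small sketch of that scoring rule, to make it concrete. The exact point scale is an assumption (here the submitter simply earns the 1-based position at which the first correct answer arrives), and the function names are hypothetical.

```python
import random


def asking_order(entrants, submitter, rng=None):
    """Return the other entrants in a random order; the submitter is left out."""
    rng = rng or random.Random()
    others = [e for e in entrants if e != submitter]
    rng.shuffle(others)
    return others


def submitter_points(correct_in_order):
    """Points for the group that submitted the question.

    correct_in_order: list of booleans, one per group in the order asked,
    True where that group answered correctly.
    Assumed scale: the later the first correct answer, the more points;
    zero if nobody gets it right.
    """
    for position, correct in enumerate(correct_in_order, start=1):
        if correct:
            return position
    return 0


if __name__ == "__main__":
    order = asking_order(["RICH", "Izar", "Skynet AI", "Blydgesmith"], "RICH")
    print(order)
    print(submitter_points([False, False, True]))
```

The effect is that an impossible question earns its submitter nothing, while a hard but fair one pays best.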

 

 
  [ # 5 ]

What I think is missing from contests is the follow-up question.

If I had to pick a word, I’d say that I’m “surprised” at how little emphasis is given to the conversational ability of chatbots, as opposed to the question-and-answer method, and with lists of largely irrelevant and obscure questions at that.

It might be easier in a contest such as the Loebner… but could the quiz, or the conversation, consist of two parts? Or perhaps award a half point (or a range of points) to the response to the initial question, and another half point to the answer to the follow-up question? While the initial question could be direct and specific…

What time is it?

... the follow-up could be less specific but related…

What do you normally do this time of day? or What time did you wake up today?

I’m reminded of how politicians will hold a news conference, but limit the reporters to a single question, making it easier to brush off difficult questions with a generic response, and move on.

I most often see visitors express shock and approval when bots seem to be following the conversation, retain information to be called upon later (My name is *. What is my name?) and ask supplemental questions related to the last response.

Do you have any hobbies?
I enjoy horseback riding.
When did you first discover this?
Well, I was created on Friday, the 1st of January, 2013. So, it was probably sometime around then.

Answers that are blatant diversions, or related, but not exactly correct, could be awarded points accordingly.
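
To illustrate the half-point idea with partial credit, here is a minimal sketch. The rating labels and the fraction of credit each one earns are assumptions chosen for the example; a judging panel would pick its own.

```python
# Assumed fraction of the available credit earned by each kind of reply.
CREDIT = {
    "direct": 1.0,      # answers the question as asked
    "related": 0.5,     # on topic, but not exactly correct
    "diversion": 0.25,  # blatant change of subject
    "miss": 0.0,        # no usable response
}


def exchange_score(initial_rating, followup_rating):
    """Score one question plus its follow-up, worth half a point each."""
    try:
        initial = CREDIT[initial_rating]
        followup = CREDIT[followup_rating]
    except KeyError as exc:
        raise ValueError(f"unknown rating: {exc.args[0]}") from None
    return 0.5 * initial + 0.5 * followup


if __name__ == "__main__":
    # "What time is it?" answered directly, follow-up only loosely related.
    print(exchange_score("direct", "related"))
```

A bot that nails the first question but brushes off the follow-up tops out at half a point, which rewards following the conversation.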

 

 
  [ # 6 ]

Thanks for the input guys, and many thanks to Dave and the crew for getting chatbots.org back up!

@Peter That’s a good suggestion, and depending on how things go we might have to go to a format like that. The idea behind the format as it stands was that it is really more of a showcase than a serious “contest”. This was part of the idea behind botmaster-supplied questions. Every bot, no matter what level of development, is guaranteed that it can shine on at least one question. The idea in having a showcase as much as a serious “contest” was to encourage botmasters to participate, regardless of their skill level, where their bot is in its development, or whether it might have a shot at placing in a more serious “Loebner” like event. The fact is that one of these projects requires a tremendous amount of time and effort, and it’s nice to be able to show that work off, I think. The other problem that would be addressed by doing it this way is something that has been mentioned in other threads here, that being “what is a fair question” and “who picks the questions” etc…

@ Thunder
Also a good suggestion, and I don’t see any reason why the framework as it is up today wouldn’t support it if a particular entrant wanted to submit a multi-part question as one of their submissions, or even submit all multi-part statement\interrogs. One of the reasons that I used a single textfield for the entry form was to allow for a lot of latitude in entries and submissions. Also, the two freestyle sections are based on a bot’s ability to carry on a conversation: one on a topic that it should be able to talk about (based on its purported persona) and one on a random topic which is selected by the outgoing judging panel.

As I mentioned, the idea that has emerged isn’t meant to be a replacement for more serious contests like the CBC or Loebner events, or Donald’s league battle once he gets it running. It’s more of a loosely frameworked once- or twice-a-month event where everyone can showcase what they’ve done, which can run concurrently with more serious contests. Rather than it being “this person’s” or “this group’s” event, it would be more of a community-run project. Every event would have a different head judge and judges from within the community. If anyone is unhappy either as a contestant or a judge, then next week (or in 2 weeks) there is a different group sort of running the show, and it could be you. Someone might want to set up a better calendaring system during their time as head judge, etc… (It was also a clever way for me to avoid shouldering the enormous amount of work that hosting one of these things must take on a bi-weekly basis (wink).) Anyone who wants to be part of the judging pool (just a list of interested people who would like a shot at judging on a rotating basis), please post here or email me.

I think that this framework sort of smooths out the question of “what makes a good chatbot” as well, and may encourage a broader range of bots to participate. It doesn’t matter what you feel your creation is good at, or what you feel that it might not be good at; using categories means that you have a shot at each question, plus there’s always the public opinion polls.

Anyway the more complete description is up at http://ai.r-i-software.com/BRAGGING_RIGHTS/

Vince

 

 
  [ # 7 ]

Of course, a contest that encourages botmasters of not so sophisticated bots is something we’re missing so far. In that respect, your suggestion, Vince, seems to be the best way to start this.

I’m looking forward to a wide range of chatbots participating!

 

 
  [ # 8 ]

Oops!

LOL well it just goes to show that no matter how thoroughly you think things through, you can’t think of everything. I was checking the contest email to see if there was any interest in signing up, and in fact there was… and there I was staring at the contestant’s ten questions. Therefore I have removed RICH from the list of bots that will be participating in this first round (when and if we get enough people signed up) and I will be acting as a judge. I’ve made a note on the judges’ fact sheet that the first thing an incoming head judge does when setting up the next round will be to change the password on the contest email, to prevent this from happening in the future.

Still looking for entries and people who wish to be added to the judges pool.

Vince

 

 