

BIG MONEY CONTEST PROPOSAL
 
 

Some of you may remember that some time ago I tried organizing a contest with rules more in keeping with the suggestions that many of you have expressed (OK, “complained” would be a better word). Out of the entire community, only 2 contestants showed up. I’m not complaining (OK, yes I am, LOL), but it is a lot of work for that sort of turnout. Most of the complaints I received centered on the lack of a cash prize, you mercenary **********. ;)

Here’s what I am proposing: a “big money contest” featuring an expanded set of rules based on your suggestions. Because of the wave of interest following Eugene Goostman’s passing of the Turing test (or not passing, depending on your view), I believe that we can get massive media exposure. With that media exposure, we can crowdfund a significant purse. I do not think 25k or better is out of reach. Everyone, including Eugene Goostman, Apple and Siri, Microsoft and Cortana, would be encouraged (shamed) into competing. It should not be any one group’s endeavor but a community effort, and I am not going to make the mistake of putting a lot of work into it solo again. If anyone answers here, great. If not, consider the idea dropped. Please note, this is not a criticism of any past contests, either those that we have participated in or ones that have done such fine work for decades, such as the Loebner event.

Here are some suggestions:

Some of the features from the Bragging Rights contest, including questions being selected from a pool submitted by each contestant, would be used to ENSURE that there could be no claim of bias.

If there is a mechanical problem that causes someone to miss a round, the round should be scratched and that person given 48 hours to get back online. The idea here is to test AI technology, not to see who has the most reliable internet connection.

Each round would have the contestant asked variations of a question by three different judges. This would be an additional step to see whether the AI is actually thinking through an answer or whether the answer was groomed for that question. An average of those scores would become the score for that question group.

Follow-ups encouraged.

Two freestyle sections: one where the AI is expected to have an intelligent dialogue on whatever the bot’s persona is (as Steve put it in another contest, if you’re supposed to be from planet XYZ, then I would expect you to know what the weather is like on planet XYZ), and another freestyle on general knowledge and world events on this planet. We tried having the bot respond to a pre-selected topic, but it could be an open Turing test.

Defined fields of intelligence (math, personality, world knowledge, etc.) would carry through, and bots scoring highly in a field would continue to compete in that area at ever-increasing complexity. (This is harder to reconcile with the “pool of questions” idea, but maybe the pool could be pre-sorted by a committee as to complexity, then randomly selected.) So in essence you would have winners in each category, and every bot would be guaranteed a chance to make its best showing.

A final head to head for overall performance.

A people’s choice category where there is some sort of online voting.

Media coverage. It would be great to stream something like this live. Unfortunately, chatbot contests are right up there with live grass-growing contests (where we stream the grass growing…live) as far as being able to hold the public interest. What would be cool is providing a universal voice/facial interface for every bot (that does not have one) and streaming that live. Not hard to do, really. That would make it more like robot wars instead of robot snores. The fact that they actually posted a video on YouTube of individuals typing in front of an LCD screen (which was unreadable) as coverage of the Goostman win was probably the “geekiest” thing I have ever seen. And that’s coming from an “uber geek,” LOL.
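
For anyone who wants the question-group scoring above spelled out, here is a minimal Python sketch. It assumes each of the three judges gives one numeric score per variation. The 0-10 scale, the function names `score_question_group` and `total_score`, and the sample scores are all placeholders for illustration, not anything agreed in this thread.

```python
from statistics import mean

def score_question_group(judge_scores):
    """Average the scores the judges gave to the variations of one question."""
    if not judge_scores:
        raise ValueError("a question group needs at least one judge score")
    return mean(judge_scores)

def total_score(groups):
    """Sum the per-group averages to get a contestant's total for a round."""
    return sum(score_question_group(scores) for scores in groups.values())

# Hypothetical scores on an assumed 0-10 scale, three judges per question group.
round_scores = {
    "math": [8, 6, 7],
    "world knowledge": [4, 5, 9],
}
print(score_question_group(round_scores["math"]))
print(total_score(round_scores))
```

Averaging per group before summing means one harsh or generous judge moves a group score by at most a third of their deviation, which is one reason to prefer it over taking a single judge's mark.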

Problems:

Higher dollar amounts make it more tempting to cheat. I don’t think you can disqualify bots that are only available over the internet. (Personal case in point: I would love to have competed in the Loebner event this year, but the idea of taking our intellectual property, putting it on a disc, and shipping it to be pre-judged is not something I can even propose to the board of directors.) So maybe some sort of requirement that if someone steps forward out of nowhere and competes with something that looks too good to be true (a human confederate masquerading as a bot), there is some sort of onsite verification? (Depends on the money raised; it’s practical if you raise 50k+.) I don’t know. I’m sure others will see additional problems, but nothing that can’t be worked out.


Those are some suggestions, but it would take individuals from this community stepping forward to form a committee of some sort to pull it off. The timing is critical: it needs to happen now because of the wave of interest in the Eugene Goostman event. Three months from now no one will care. So, if you are interested in competing in such an event, or in actively helping to organize it, respond to this thread. I’ll start keeping an unofficial list of people, and when (or if) there are enough people who are interested in working on it, I’ll contact them and we’ll set up some lines of communication.


Vincent L Gilbert

 

 
  [ # 1 ]

Count me in for either taking part or being on the committee. I would be wary of people doing both, as if a bot wins that is anything to do with the contest organisers, people are bound to suspect foul play.

 

 
  [ # 2 ]

Good point, Steve. OK, the list is started. Who else?

Vince

 

 
  [ # 3 ]

Although the prospect is exciting, by the time the logistics are done, I think the time might be past.

Universal Voice/Face - Although a static animation is easy (a number of contests have taken that approach to present the results of judging), doing something live is problematic and would require development on each bot. I did a bot-to-bot interface a while ago, but some botmasters ran into TOS issues with their web hosts.
Intelligent agent test - http://www.skynet-ai.com/test/agent.htm

Problems:
Judge selection - Some of us remember bad judges that almost killed a competition and brought it to a halt mid-round.
Questions - Question variation isn’t as big a deal as ensuring each bot is asked identical questions. Generation of questions is often considered problematic.
Scoring - Scoring conversations is notoriously difficult. It takes high-quality judging.
Overhead - Contest management is time-consuming (as Steve can attest).

 

 
  [ # 4 ]

I like the idea of a multi-facet contest, but on my schedule it says Loebner Prize deadline in August, Alan Turing movie + Loebner Prize in November. I might participate if the event is well dressed, with animated avatars, text-to-speech, people talking to the avatars and hidden typists entering the questions. As opposed to watching some people sit in a chair behind a monitor. We discussed similar things about the Ted Talk Turing Test.

What is happening is that the public is kept in the dark about what transpired with Eugene Goostman, and have only the common notion that it was nothing to write home about. If you were able to tell the public that you would challenge the acclaim of Eugene Goostman in public alongside other AI, you’ve got a show. That’s what the people want.

I must say the thought has crossed my mind to organise a better AI contest if I had the money for it, but as Steve says, I would have to choose between organising and participating, and only the latter would help my own AI project get along. If you need art for promotion though, I hang out with a lot of illustrators for hire ;)

 

 
  [ # 5 ]

Being the Undisputed World Champion of the first and only “Bragging Rights” competition (am I doing that ‘bragging rights’ thing right?)... here are my 2 cents:

1.) IMHO- Overall interest is limited to chatbot enthusiasts- not a large group.

2.) Apple/Siri, MS/Cortana, and Google/Google all have so much vested interest in their respective products that they would surely never get their lawyers to agree to any single format of anything.

3.) That leaves number 1. above- a relatively small group indeed (I think ~50 is the number I heard a few years ago, not including clone owners- ones with chatbots using unmodified/minimally modified “mind files” with vanilla AIML or Chatscript interpreters, for instance).

4.) Crowd-sourcing (Kickstarter?) the prize is an interesting idea to get a reality check if nothing else.  Since there is no “product” being offered, only a spectacle, the spectacle part would have to be compelling.

5.) Thus, to work (via crowd-sourcing), it would have to be a compelling spectacle, as was mentioned, a “Robotwars” (as in the mechanical robot death matches) type of thing.

Additional thoughts:

- Live streaming sounds a bit much because, even with the current state of the art, I think it would be very difficult to hold a casual observer’s interest past the first couple of non sequitur bot responses, so a live stream is just not worth the extra effort(?).

- Part of the Robotwars attraction is in profiling the robot combatants’ makers/operators, or in the present case, the bot developers. But how to transfer that from mechanical control of battling robotic death machines to conversations… maybe something like an animated Avatar (bot and/or developer(s)) demise might work (the Avatar Protection Act be damned!). That could hold interest in at least seeing the weaker bots getting (their Avatars virtually) destroyed.

 

 
  [ # 6 ]
Vincent Gilbert - Jun 16, 2014:



I would be interested in taking part etc.

Dan

 

 
  [ # 7 ]

@Dan   consider yourself added to the list of interested parties!

@Merlin

Animation integration. Since we aren’t talking about needing to have the audience right there with the judge as the questions are asked, it would probably not be as difficult as actually integrating an animated avatar with the bot itself. Since the logistics of a contest itself prohibit you from “going live” in the first place (there needs to be latency between all the bots having been tested and the broadcast), what you would be talking about is animating an “instant replay.” Probably the simplest approach would be for the transcripts of a completed round to be entered into an application which would generate an animated version for broadcast. You could use something like VLC to capture the active window and convert it to media, which could then be webcast. There are a number of applications now that automate the process of converting your static avatar into a series of base viseme and other animations. It would probably take me about a day to create a Windows app along those lines, so not too bad.

Questions - How we worked it was to have every contestant submit a list of questions, then an application randomly selected from the list. The idea was that if you have 10 entries, and 1 question is selected from each entry, you get 10 questions that everyone knows weren’t biased. Each contestant does not know what the other 9 were, but as long as each contestant can say, “yep, one of my questions is on the list,” then there was no foul play. Permuting them to get similar but different questions to form a question group could also be automated, and could be as simple as adjective substitution, or number value substitution for math questions. The bots would be asked the same questions, just three variations of the same questions by three different judges. There have been some comments in various threads about whether or not a bot is actually parsing variables or regurgitating a string. This would solve that to a degree.
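
A rough Python sketch of the question draw and the number-substitution idea described above. This is not the application we actually used for Bragging Rights. The bot names, the sample questions, the `{a}`/`{b}` template format, and the function names are assumptions made up for illustration.

```python
import random

def select_questions(submissions, rng):
    """Pick one question at random from each contestant's submitted list."""
    return {name: rng.choice(questions) for name, questions in submissions.items()}

def number_variations(template, rng, count=3, low=2, high=20):
    """Build `count` variations of a math question by substituting number values.

    Returns (question text, expected answer) pairs, one per judge.
    """
    variations = []
    for _ in range(count):
        a, b = rng.randint(low, high), rng.randint(low, high)
        variations.append((template.format(a=a, b=b), a + b))
    return variations

# Hypothetical submissions; the bot names and questions are placeholders.
submissions = {
    "bot_a": ["What colour is the sky?", "How many legs does a spider have?"],
    "bot_b": ["Who wrote Hamlet?", "What is the capital of France?"],
}
rng = random.Random(2014)  # fixed seed so a committee could reproduce the draw
selected = select_questions(submissions, rng)
for owner, question in selected.items():
    print(owner, "->", question)

for text, answer in number_variations("What is {a} plus {b}?", rng):
    print(text, "| expected:", answer)
```

Publishing the seed after the round would let every contestant rerun the draw and confirm that one of their own questions was picked fairly. Adjective substitution would need a word list per question, so it is left out here.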

Judge quality- negated during the standard question section. As for the freestyle, I would suggest using individuals from the tech blog sector rather than people from the AI fraternity itself. It is more representative of an actual conversation, you have someone who has some basic tech knowledge, but is less likely to simply hurl “out of bounds” statements at the bot, and again you would have three judges. Again just suggestions.

Time overhead - many hands make light work ;)

@Carl B

The winner and recipient of the coveted Bragging Rights “genuine imitation gold colored digitally compressed medallion shaped prize object” ;)

Have to disagree about the interest. Movies like “Her” and “Iron Man” have the public’s attention, as the lethal increase in traffic after the Eugene Goostman announcement showed.

Regarding participation by major vendors: dual-edged sword there, since not showing could also be a problem. One thing I agree with though, there would be a shitload of attorneys involved. Have to see what would happen. You could always offer something like the “coveted yellow chicken” award for those invited who do not show up. Because… you can never have too many people suing your ass off at one time. ;)

“the spectacle part would have to be compelling.” That, my prize-winning friend, is the whole concept in a nutshell. Honestly, that sums up a great portion of human nature.

Live streaming… could be right, although webcasting isn’t really problematic or costly. But a straight on-demand feed would probably be better.

Now that last paragraph is sheer genius, Carl. I officially nominate you as chairperson for the “Mayhem portrayal development” sub committee. That is exactly what would get and hold public attention. Maybe “live” was a bad choice of words, as no matter what it isn’t live meaning “the judge types, the audience views”, but live as the exchange is recreated. Using the model above, you could merge the two contestants’ responses into a single transcript, then animate the whole thing similar to a JibJab animation, complete with the animated demise of the contestant. In fact, I bet you could get the JibJab folks interested in participating.

I now have this image of SAIL losing and getting eaten by animated lions, and if history has taught us anything, that always makes for good spectacle! Seriously that is exactly what would make it go viral. Great idea Carl.

Steve and Don - The idea we put forth for the Bragging Rights contest was that it would take place every 2 weeks for a period of time. That would allow someone to sit on the committee or judge for one contest, and participate in the next one. If the type of event that is sort of emerging from Carl’s idea actually happened, I think it could go viral and warrant a 2-week recurring schedule. Maybe something like that would work?

Vince

 

 

 
  [ # 8 ]

Or the judges would have to be people completely unrelated to the interested parties. e.g. popular vote by general public, or rather, the Kickstarter pledgers.

The bots would be asked the same questions, just three variations of the same questions by three different judges. There have been some comments in various threads about whether or not a bot is actually parsing variables or regurgitating a string. This would solve that to a degree.

Sounds like a good idea, but I’m not sure, I’ll give my two cents: If only the phrasing were to vary, both narrow-scope patterns and advanced NLP would result in the same answer. If only the subjects varied, they would have to vary semantically to make a template response become obvious. Otherwise, whether you ask my NLP/knowledgebase program “What is the name of your cat?” or “What name does your dog have?”, the program performs the same search routine for the name of whichever subject, and phrases the answer the same, which I couldn’t distinguish from a template response myself.
Unless we’re going full Winograd Schemas :)

 

 
  [ # 9 ]

Heh. Learn something new every day. Thanks for the tip Don, that will be very useful for testing my software.

http://www.cs.nyu.edu/davise/papers/WS.html

A Winograd schema is a pair of sentences that differ in only one or two words and that contain an ambiguity that is resolved in opposite ways in the two sentences and requires the use of world knowledge and reasoning for its resolution. The schema takes its name from a well-known example by Terry Winograd (1972):

The city councilmen refused the demonstrators a permit because they [feared/advocated] violence.

If the word is “feared”, then “they” presumably refers to the city council; if it is “advocated” then “they” presumably refers to the demonstrators.
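
For testing purposes, a schema like the one above could be stored as a template plus the expected referent for each alternative word. The Python below is only a sketch of one possible representation. The class name `WinogradSchema`, the `resolve(sentence, pronoun)` interface, and the baseline resolver are my own assumptions, not part of the linked page.

```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    template: str   # sentence with a {word} slot for the alternative word
    pronoun: str    # the ambiguous pronoun to resolve
    answers: dict   # maps each alternative word to the expected referent

    def questions(self):
        """Yield (sentence, expected referent) for each alternative word."""
        for word, referent in self.answers.items():
            yield self.template.format(word=word), referent

def score(schema, resolve):
    """Count how many variants of the schema the resolver gets right."""
    return sum(
        resolve(sentence, schema.pronoun) == referent
        for sentence, referent in schema.questions()
    )

schema = WinogradSchema(
    template="The city councilmen refused the demonstrators a permit "
             "because they {word} violence.",
    pronoun="they",
    answers={"feared": "the city councilmen", "advocated": "the demonstrators"},
)

def always_subject(sentence, pronoun):
    """Naive baseline (illustration only): always answer with the subject."""
    return "the city councilmen"

print(score(schema, always_subject))
```

A resolver that gives the same answer regardless of the swapped word can get only one of the two variants right, which is why a schema pair separates a canned response from actual reasoning about the sentence.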

 

 

 
  [ # 10 ]
Daniel Burke - Jun 17, 2014:
Vincent Gilbert - Jun 16, 2014:



I would be interested in taking part etc.

Thank you.

 

 