A quick word about the entries in this year’s Loebner Prize
 
 
  [ # 16 ]

That was very kind of you Steve to assist a fellow competitor even though you weren’t obligated to do so. It speaks volumes about you as a person! Very good!!

I just learned that your bot, Mitsuku has won the 2019 Loebner Prize!! Congratulations!!

 

 
  [ # 17 ]

Thanks Art. It’s nice to win the contest but it’s better to win it fairly when the other entrants are performing to the best of their abilities. I’ll post more details about the event when I get a moment.

 

 
  [ # 18 ]

A quick message for Will Rayer.

Hope you got back to Jersey safely. Uberbot finished 2nd in both the Loebner Prize and the award for best overall chatbot. Congratulations.

 

 
  [ # 19 ]

thanks for everything steve. it’s unfortunate that no one was able to talk to my chatbot. but it wouldn’t have been fair to replace the files in the middle of the competition. i’m glad you won. congrats on your fifth win.

 

 
  [ # 20 ]

Your bot still worked. Lots of people talked to it. It was just a bit annoying for the room if someone accidentally closed it.

 

 
  [ # 21 ]

thanks for letting me know. it was nice to be part of the competition. congrats to steve for winning.

 

 
  [ # 22 ]

brian the robot scored 0 points in both categories. this is what that looks like on a typical day at pandorabots. these are from 3 different users. unfortunately, because of the errors that showed during the contest, i have no idea what was entered in the competition. without a contest it’s impossible to show that the robot is real. hopefully it will still be up at pandorabots after i post these photos.

 

Image Attachments
3b.png, 1.png, 3a.png
 

 
  [ # 23 ]

for some reason the photos didn’t upload but here is a typical conversation with the robot recorded in a youtube video. the usernames are obscured. however the video still shows the robot talking with a user. https://youtu.be/Mf9McB1WFqA

 

 
  [ # 24 ]

I composed a clearer scoreboard (In no particular order for equal scores).

I can’t tell what the scoring system was though (points out of 100?), and it might be a fairer representation to know which ones were out of order due to technical difficulties rather than quality.
Mitsuku seems to blow everyone else out of the water, sooooooo… congrats. Monotonous outcome, but deserved.

Image Attachments
loebner2019_scoreboard.png
 

 
  [ # 25 ]

Congrats Steve.
Thanks for the scoreboard Don.

As I understand the scoring, each user votes for their favorite bot. I would assume 1 vote per person.

As I looked at my logs though, I can’t find more than about 20 users that ever tried Skynet-AI (including kids and the guy that turned on the machine). Hard to beat Steve’s 24 votes when you don’t even have that many people who tried the system.

It would be interesting if others’ logs show something similar.

 

 
  [ # 26 ]

A limited number of votes per person would account for the lopsided results. But like you, I only had about 20 users in my logs. Steve did mention to me that attendance was, er… low, to put it mildly. On Friday there were only 4 visitors and I don’t think the weekend was that much better. 200 school children on Thursday though.
I suppose we ought to ask the AISB for details, but I’m too lazy right now.

By the way Brian, I’d say that your chatbot being on Pandorabots already proves that it is real, and I for one do not question that it works as you say it does.

 

 
  [ # 27 ]

Day 1 - 240 school kids. Hardly any voted due to it being too busy
Day 2 - Business Day - 5 people turned up, 2 were from AISB
Day 3 - General public day - 8 people turned up, 6 were Jim Curran’s family
Day 4 - General public day - 8 people turned up, 2 were the organiser’s family. 6 was a family with only 2 adults. They didn’t bother voting.

The turnout was appalling. Hardly anyone turned up due to nobody knowing about the event.
Let me get back home and I’ll tell you more about it

 

 
  [ # 28 ]
Steve Worswick - Sep 16, 2019:

Day 1 - 240 school kids. Hardly any voted due to it being too busy
Day 2 - Business Day - 5 people turned up, 2 were from AISB
Day 3 - General public day - 8 people turned up, 6 were Jim Curran’s family
Day 4 - General public day - 8 people turned up, 2 were the organiser’s family. 6 was a family with only 2 adults. They didn’t bother voting.

So, only a dozen or so total strangers showed up over 4 days?!

I feel bad for those who took the time and effort to submit a bot.

Seems to be a dead “Contest” with the same predictable outcome (“Mitsuku Wins!”).

The esoteric entry requirements (wow, how many bots failed to even function again?!) certainly do not help.

IMHO, nobody but a handful of “botmasters” even pays any attention to this type of competition.

In fact, a Google search of “loebner prize 2019” brings up only 4 hits (!!!):
https://en.wikipedia.org › wiki › Loebner_Prize,
https://medium.com › pandorabots-blog › mitsuku-wins-loebner-prize,
https://www.chatbots.org › ai_zone, and
https://aisb.org.uk.

I assume the Wikipedia entry was done by someone from AISB; otherwise, there are only hits from AISB, Pandorabots, and Chatbots.org.

A News search (for the past 7 days) is even more depressing: only a single BBC article.

 

 

 
  [ # 29 ]

The thing that I was very disappointed and frustrated about was the lack of PR and marketing. Nobody knew about the event and yes, I would say 20 or fewer people visited and voted during the last 3 days. Even the facilities people at the university were unaware of the event as we turned up on the Saturday morning to find all the doors locked!

No media at all visited the event. There’s normally a reporter or a film crew. Also the university itself was quite a way out of Swansea city centre. You wouldn’t have passed there by chance but as there were no posters or banners in town, nobody knew about the event. I felt a little sorry for the owner of one of the entries called Mary who had flown all the way from Vietnam to be at the event. Another guy flew from the USA to be there.

Fortunately, instead of just sitting around, it was good for us to chat with each other to discuss bot techniques and issues we have. If it hadn’t been for the 6 botmasters who attended, we would have gone crazy.

All the bots functioned and worked but I could see that installing some of them was painful. Instructions like, “now download the latest git repository”, “you need to edit this batch file”, “install flash player at this point” or “you now need to grab a copy of Java” were not helpful. If you are planning on entering, at least do the organisers the courtesy of including any software you need. Any standalone bots were installed on Windows 7 laptops, the internet ones ran on Raspberry Pi devices. There was nothing unusual with the kit and the bots should have been tested before submitting them.

In contrast, Arckon was activated by simply clicking on Arckon.exe

Sorry to say that some of the bots were simply not up to the task of being publicly displayed. One needed grammatically correct sentences with punctuation marks at the end or it would freeze and need restarting. Some appeared to be just command prompts which allowed the visitors to have free range on the desktop. One of the internet-based bots had a website that was frequently offline during the first day, but all of them worked fine during the final 3 days after the botmasters who attended guided the public on how to use them.

The voting system worked by each person visiting each of the 17 bots and voting for the one they liked best plus the one they liked second best. They also voted for the one that was most humanlike and the one that was second most humanlike. The top vote in each category on their paper scored 2 points for the bot. The second best scored 1 point. The scores were tallied to give the final result. I didn’t get 24 visitors, just 24 points made up of 2 points and 1 point scores. All the bots got an equal amount of testing from what I could see.
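
For anyone who wants to check the numbers, here is a rough Python sketch of that tally as described above: 2 points for a first choice and 1 point for a second choice, counted separately for each of the two categories. The ballot layout, the category keys and the bot names BotA/BotB/BotC are made up for illustration; this is not the organisers’ actual spreadsheet or code.

```python
from collections import Counter

# Hypothetical category keys; each voting paper had a "best" and a
# "most humanlike" section, each with a first and a second choice.
CATEGORIES = ("best", "most_humanlike")


def tally(ballots):
    """Return a Counter of points per bot for each category.

    Each ballot is a dict mapping a category to a
    (first_choice, second_choice) tuple of bot names.
    """
    scores = {category: Counter() for category in CATEGORIES}
    for ballot in ballots:
        for category in CATEGORIES:
            if category not in ballot:
                continue  # visitor skipped this category
            first, second = ballot[category]
            scores[category][first] += 2   # top vote scores 2 points
            scores[category][second] += 1  # second best scores 1 point
    return scores


# Three made-up voting papers.
ballots = [
    {"best": ("BotA", "BotB"), "most_humanlike": ("BotA", "BotC")},
    {"best": ("BotA", "BotC"), "most_humanlike": ("BotB", "BotA")},
    {"best": ("BotB", "BotA"), "most_humanlike": ("BotA", "BotB")},
]

results = tally(ballots)
for category in CATEGORIES:
    print(category, results[category].most_common())
```

With only two slots per category on each paper, any bot that is a visitor’s third favourite or lower gets nothing from that visitor, which is why a handful of papers can produce a very lopsided table.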

The downside to this method is that each visitor had to spend time testing each bot and even if they only spent 2 minutes with each bot, that equated to 2x17=34 minutes. Not many people seemed willing to spend that much time apart from the jury who spent around 5 minutes with each bot. A mammoth session for each person that lasted nearly an hour and a half. This system needs reviewing as no casual visitor will have that much time to spare.
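
The arithmetic behind those session lengths, as a small Python sketch. The function name is just for illustration; the 17 bots and the 2 and 5 minute figures are the ones quoted above.

```python
NUM_BOTS = 17  # number of bots each visitor was asked to try


def session_minutes(minutes_per_bot, num_bots=NUM_BOTS):
    """Total time for one person to try every bot once."""
    return minutes_per_bot * num_bots


casual_visitor = session_minutes(2)  # 2 minutes with each bot
jury_member = session_minutes(5)     # around 5 minutes with each bot

print(f"casual visitor: {casual_visitor} minutes")
print(f"jury member: {jury_member} minutes")
```

At 5 minutes per bot that works out at 85 minutes, which matches the “nearly an hour and a half” the jury spent.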

Not quite sure what Carl meant by esoteric entry requirements though. I don’t see how it could be any easier than “send us a link to your chatbot” for any internet based ones and if it works offline, to send them a copy of your bot with reasonable instructions on how to install it, complete with any software it requires. The prohibitive LPP has been removed and anyone had free rein in how they entered their work.

Yes, I won again but when visitors were asking things like, “What colour is a red ball” and the bots were replying, “You are!”, “white”, “I am the ancient king Hammurabi” and other such nonsense, or even freezing because there was no question mark at the end, then I make no apologies for winning yet again. Some bots were good but I’m sorry to say that many of them need a lot of work to compete at the highest level. If any individual wants to know how the public reacted to their bot or any comments they made, please post and I will be happy to share what I remember.

My final thought is that 4 days was far too long. The only worthwhile day was the school day where the children were having fun talking to the bots and playing with a hardware robot (Edbot). The business day was pointless (as this is a public forum, I don’t want to comment on the quality of these businesses - suffice to say that Dragons Den/Shark Tank won’t be interested) and the last 2 days could have fitted into 1 day. I hope the event continues but the marketing/PR team really need to step up their game to make this the success it deserves to be. I attend lots of events and conferences and the marketing starts months in advance in order to achieve the biggest turnout with regular email reminders, posts on social media, contacting the press/media etc. This needs to happen with the Loebner Prize or I have to agree that its days are numbered.

 

 
  [ # 30 ]

Thanks for those details, Steve. I’m not envious of how you spent the weekend.

Gonna be frank, that is a terrible score system, and it explains the disproportionate exponential curve in the scores. In principle, not only does the best chatbot stand to get twice as many points as the second-best, but chatbots that are half-decent all get near zero points just because they’re not the two best. I suspect that with a proportionate voting system, scores would have looked more like in past years. I say it is high time the British adopted the metric system!

I do believe the AISB has bitten off more than they can chew by organising this as a bigger event with an art exhibition. The Loebner Prize has always been obscure, but so is the Winograd Schema Challenge, and its second installment that never happened, and the Neukom Turing Tests for Creative Arts that even you may not remember existing. The pattern here is that AI professors are just not fit for promoting events, whether they lack the foresight, connections, or time. Two alternatives could be to host the event at an already established AI convention, or to move the entire competition online and invite internet communities to judge. These options are problematic for different reasons, but a greater turnout would do away with the judging issues. A visitor would not need to test all 17 candidates individually if there were enough other visitors to engage with them, like on the school day.

For reality’s sake I dare also say that the quality of chatbots, whether ours or Microsoft’s, falls short of the general public’s interest if ever you’ve read online opinions. It appears that the only sure way to attract attention is to claim something sensational about passing Turing tests.

The Loebner Prize has historically been an ungrateful guest to many that hosted it, and without constructive criticism our lamentations may be a self-fulfilling prophecy. I did like the notable difference in human vs best scores, and 20 visitors still beats choking on a qualifying round if you ask me, although the scores are nothing to look at.

 
