The final four for 2014 have been announced
 
 

It was close at the top. Looking forward to reading the transcripts.

http://www.aisb.org.uk/events/loebner-prize

 

 
  [ # 1 ]

Congratulations to the winners!


I hoped for the best for my first take at Loebner.
I ended up in 11th place… just an inch above two very good programs, known for having been selected for previous Loebner contests.

I will use this experience to improve the program, and get better for next year.

I suggest that the other contestants who did not end up in the Final Four think about what their programs accomplished; our programs ended up with fair results after all, and rather good answers, and they can only improve.

I will eagerly await the Live in November!

 

 

 
  [ # 2 ]
Steve Worswick - Sep 24, 2014:

Looking forward to reading the transcripts.

Johnny’s transcript should be like this.

It’s a pity because Johnny should be able to respond to these questions:

Which is bigger, a cat or a kitten?
I don’t know how much a kitten is measuring.

How many letters are there in the word perambulate?
Letters are there not in word perambulate.

For the first question, it was just a lack of knowledge.

For the second one, I don’t know why it gave this answer (maybe the word “there”?)


 

 
  [ # 3 ]

Ack, I understand your issue here.

I think I made a mistake in my version of the judge program. In the case of my bot (Isabelle), out of the 20 questions, many are actually well handled by the engine but poorly communicated by my version of the input program. As a result, at least 5 questions were badly connected, and certainly ended in a bad reply instead of a correct one.


Still, the others have or had the same problems as well ;)
I believe this is just to be taken as the “being in our first years” syndrome.
Things will just improve.

 

 
  [ # 4 ]

If there are 20 questions, with 3 being the maximum score per question, how can there be a possible 120 points?

 

 
  [ # 5 ]
Robby Garner - Sep 24, 2014:

If there are 20 questions, with 3 being the maximum score per question, how can there be a possible 120 points?

Scoring: 0 to 2 for each criterion:

- Relevance
- Correctness
- Plausibility & Clarity of Expression/Grammar

So 6 is the maximum for each question:

6 x 20 questions = 120
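
The arithmetic above can be written out as a small Python sketch. The three criteria and the 0 to 2 range come from the post; the identifier names and the `question_score` helper are only illustrative assumptions, not anything published by the organisers.

```python
# Criteria and mark range as listed in the post; the names are illustrative.
CRITERIA = ("relevance", "correctness", "plausibility_and_clarity")
MAX_PER_CRITERION = 2
NUM_QUESTIONS = 20


def question_score(marks):
    """Sum the marks for one answer, checking each criterion is in range."""
    total = 0
    for name in CRITERIA:
        mark = marks[name]
        if not 0 <= mark <= MAX_PER_CRITERION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_CRITERION}")
        total += mark
    return total


# Maximum per question, then the maximum over the whole qualifying round.
max_per_question = len(CRITERIA) * MAX_PER_CRITERION
max_total = max_per_question * NUM_QUESTIONS

print(max_per_question, max_total)
```

So an answer marked 2 for relevance, 1 for correctness and 0 for plausibility would earn 3 of the 6 points available for that question.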

 

 
  [ # 6 ]

The reason why Johnny doesn’t respond to this question is a little funny:

How many letters are there in the word perambulate?
Letters are (there) not in word perambulate.

because:

- Letters are in alphabet.
- Things cannot be at two places at the same time.

 

 

 
  [ # 7 ]

I see there was a Winograd Schema included for the first time in the qualifying questions:

The car couldn’t fit into the parking space because it was too small.  What was too small?

I wonder if anyone got that correct (apart from a 50/50 guess of course)? I know I didn’t. Mitsuku would have replied with something like “A pin is small”.

 

 
  [ # 8 ]

Thanks Denis. Now I get it. Your explanation makes more sense than the one on the AISB web site.

Never give up! There is always next year (for now.)

Regards,

Robby.

 

 
  [ # 9 ]

I wonder if anyone got that correct (apart from a 50/50 guess of course)? I know I didn’t. Mitsuku would have replied with something like “A pin is small”.

In my current test with the LPP, Isabelle seems to have replied:
“- The car is smaller.”


At least she may have got that one right.
Good girl. *pats the chatbot*

But I hope the transcripts will prove that right.

 

 

 
  [ # 10 ]

But Christophe (welcome, by the way), if the car wouldn’t fit into the parking space because it was too small, the “too small” had to refer to the parking space. Otherwise, the car would fit into the (larger) parking space just fine.

Imagine a parking space that’s designed for a car the size of a Citroen C4. Now imagine a vehicle such as an H3 Hummer, which is easily twice the size of said Citroen C4. The H3 Hummer wouldn’t fit into the parking space because it (the parking space) was too small for the H3 Hummer.

 

 
  [ # 11 ]

Thank you Dave! :)

And thank you too for helping me with this “slip of the mind” I had.
Let’s… reword that. *coughs*

In my current test with the LPP, Isabelle seems to have replied:
“- The car is smaller.”

It seems that she also got that one WRONG.
Bad girl. *glances at the chatbot*


I believe I was under too much stress yesterday, sorry about that. :D
Isabelle understood that the car was the subject, instead of the parking space.

In any case, this question will help us make many improvements to our engines for next year.

 

 

 
  [ # 12 ]
Steve Worswick - Sep 25, 2014:

I see there was a Winograd Schema included for the first time in the qualifying questions:

The car couldn’t fit into the parking space because it was too small.  What was too small?

I wonder if anyone got that correct (apart from a 50/50 guess of course)? I know I didn’t. Mitsuku would have replied with something like “A pin is small”.

My bot responded:

The car couldn’t fit into the parking space because it was too small.
“I think people believe what they want to..”

What was too small?
“I am not following you..”

So not good from me either.

Congrats to those that got into the top four.

Dan

 

 
  [ # 13 ]
Denis Robert - Sep 25, 2014:

- Letters are in alphabet.
- Things cannot be at two places at the same time.

I know your program is a “reasoner”, and I am both impressed and amused by this :)

First, congrats to the winners, same as last year but with the addition of the intriguing Uberbot. Tutor and the Professor were also close on their heels as expected.

As for me, on one hand Arckon’s performance was a bummer, I expected more from him (though the transcripts will prove interesting ;) ). On the other hand, thank goodness I didn’t pass! I’ve been having serious doubts whether I wanted to pass if it meant having to interrupt my recent work towards problem-solving A.I., as the finals would have me focus on conversational skills and spelling.

My compliments to the organisers. The questions this year seem a good variation and tested different abilities, with less subjective criteria than “human-like”. There were still questions that I consider to be dumb (counting letters, weather talk), but they were balanced out by several tests of understanding, memory, and even half a Winograd Schema. So colour me satisfied with the thought and effort that went into this test.

As for the Winograd Schema, Arckon got it wrong because he follows a linguistic rule of thumb and thought the user was talking about the car (I didn’t program any Winograd related abilities until after my submission). Ironically, the cruder rule of thumb I used to have would have gotten it right, so I suspect some other grammar parsers among the entries would too.
Any which way, I appreciate its inclusion: Even the wrong answer suggests a degree of context and language understanding.

 

 
  [ # 14 ]

Regarding the Winograd question, I had coded a response to the sample question that someone posted around late July:

The trophy would not fit in the brown suitcase because it was too big

I coded this into Mitsuku:

X would not fit Y because it was Z

Which was nearly a match for:

The car couldn’t fit into the parking space because it was too small

Had the questioner used “would” instead of “could”, I would have nailed this one. Ah well, the disadvantages of pattern matching I guess. :)

 

 
  [ # 15 ]

Yes, that sort of thing should really match all modifying verbs. Can’t AIML incorporate word lists like Chatscript does?
So, you had an outcome for “What was too small? -> Y” and “What was too big? -> X”, then?
Grammar parsing has its problems too, such as the mystery of the space that parks: the park-ing space. And space, as we all know, is really really big.
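
For illustration only, here is a rough Python sketch of the idea discussed in the last two posts: the "X would not fit Y because it was Z" pattern loosened with a small word list of verb forms, plus the "too big -> X, too small -> Y" outcome. This is not how Mitsuku, AIML or Chatscript actually implement it; the regular expression, the word lists and the `resolve` function are all assumptions made up for this example.

```python
import re

# Hypothetical word list of verb forms accepted in front of "fit".
MODALS = (
    r"(?:would\s+not|wouldn't|could\s+not|couldn't|"
    r"did\s+not|didn't|does\s+not|doesn't|cannot|can't)"
)

# X <modal> fit [in|into|inside] Y because it was too Z
PATTERN = re.compile(
    rf"^(?P<x>.+?)\s+{MODALS}\s+fit\s+(?:(?:into|inside|in)\s+)?"
    rf"(?P<y>.+?)\s+because\s+it\s+(?:was|is)\s+too\s+(?P<z>\w+)",
    re.IGNORECASE,
)

# Hypothetical size words: "too big" points at X, "too small" points at Y.
BIG_WORDS = {"big", "large", "wide", "tall", "long"}
SMALL_WORDS = {"small", "little", "narrow", "short", "tiny"}


def resolve(statement):
    """Return the phrase that "it" most likely refers to, or None if unsure."""
    text = statement.replace("\u2019", "'").strip()  # normalise curly apostrophes
    match = PATTERN.search(text)
    if match is None:
        return None
    size = match.group("z").lower()
    if size in BIG_WORDS:
        return match.group("x").strip()
    if size in SMALL_WORDS:
        return match.group("y").strip()
    return None


print(resolve("The trophy would not fit in the brown suitcase because it was too big"))
print(resolve("The car couldn\u2019t fit into the parking space because it was too small."))
```

This only covers the "fit" family of sentences. Any other verb, or a size word missing from the lists, falls through to None, which is the same weakness of pattern matching mentioned above.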

 
