New Annual Contest with $25000 First Prize
 
 
  [ # 31 ]

I may be wrong, but I thought the whole point of the Winograd schemas was the challenge of pronoun resolution. The essence of a schema is that you have a sentence (or two) with a pronoun near the end, and by varying a single word in the schema, the pronoun refers to either the subject or the object of the sentence. The challenge is to know what the pronoun refers to. In your example:

- ‘The doctor refers the patient to a specialist, because he was better. Who does he refer to?’

The AI should answer: ‘He’ refers to the specialist.

And if we ask the AI:

- ‘The doctor refers the patient to a specialist, because he was worse. Who does he refer to?’

The AI should answer: ‘He’ refers to the patient.

As I understood it, this kind of pronoun resolution was the point of the Winograd schemas, and the kind of questions that ask ‘who was better’ or ‘who was worse’ are slightly missing the point. That said, I may not have fully understood the schemas. Also, last year’s Loebner qualifiers did not include any kind of pronoun resolution questions.
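To make the structure concrete, here is a minimal sketch of how such a schema could be represented: one sentence template, one special word that varies, and the intended referent for each variant. The class and field names (`WinogradSchema`, `template`, `answers`) are my own assumptions for illustration, not the contest's format.

```python
from dataclasses import dataclass


@dataclass
class WinogradSchema:
    """One schema: a sentence template plus the referent for each special word."""
    template: str    # sentence with a {word} slot for the special word
    pronoun: str     # the ambiguous pronoun to resolve
    candidates: tuple  # the two possible referents
    answers: dict    # special word -> intended referent (hypothetical layout)

    def render(self, word: str) -> str:
        """Fill the special word into the sentence."""
        return self.template.format(word=word)

    def expected(self, word: str) -> str:
        """Return the referent a solver should give for this variant."""
        return self.answers[word]


doctor = WinogradSchema(
    template="The doctor refers the patient to a specialist, because he was {word}.",
    pronoun="he",
    candidates=("the patient", "the specialist"),
    answers={"better": "the specialist", "worse": "the patient"},
)

for word in doctor.answers:
    print(doctor.render(word), "->", doctor.expected(word))
```

Swapping the single word flips the expected referent while the rest of the sentence stays identical, which is what makes the schema hard to solve from surface statistics alone.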

 

 
  [ # 32 ]

They would certainly be more challenging and my method would fail straight away, as you would need the full sentence to be able to answer what “it” referred to.

The trophy doesn’t fit into the brown suitcase because it’s too [small/large]. What does “it” refer to?

I guess this is maybe a “let’s walk before we can run” type situation?

 

 
  [ # 33 ]
Will Rayer - Jun 19, 2015:

- ‘The doctor refers the patient to a specialist, because he was worse. Who does he refer to?’
The AI should answer: ‘He’ refers to the patient.

I was trying to say that the “he” in this question can not only refer to the patient/specialist, but can simultaneously refer to the doctor who literally “refers to”. In this case the question is ambiguous as well as the premise.

The point of the Winograd Schemas is to stimulate the creation of AI that applies common sense reasoning. Answering the question or resolving the pronoun both require such a process, some more than others. Drawing a fact of size from a spatial relation is such a process, even if it can be simplified.
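As a rough illustration of drawing a fact of size from a spatial relation, a single hand-written rule might look like the sketch below. It assumes the sentence has already been parsed into a contained object, a container, a fit/no-fit flag and an adjective; the function name and word lists are hypothetical, and this is not a description of anyone's actual entry.

```python
# Hypothetical word lists; a real system would need far broader coverage.
TOO_BIG = {"large", "big"}
TOO_SMALL = {"small", "tiny"}


def resolve_fit_pronoun(contained, container, fits, adjective):
    """Resolve "it" in "<contained> doesn't fit into <container> because it is too <adjective>".

    Rule: if something does not fit, either the contained object is too big
    or the container is too small. Returns None when the rule does not apply.
    """
    if fits:
        return None  # this rule only covers a failed fit
    if adjective in TOO_BIG:
        return contained
    if adjective in TOO_SMALL:
        return container
    return None


print(resolve_fit_pronoun("the trophy", "the brown suitcase", False, "large"))
print(resolve_fit_pronoun("the trophy", "the brown suitcase", False, "small"))
```

The rule never looks at the question itself, only at the relation between the two objects, which is why this kind of inference can be simplified as described.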

Personally I do partly rely on the second part, and my AI will flunk on “what does it refer to” because that prompts my AI not only to resolve “it”, but also to look for what “it” was saying that might have been unclear to the user. As in “What are you referring to?”

 

 
  [ # 34 ]

I’ll put it more clearly: You’re creating an AI that resolves pronouns, yes? And you have two pronouns in your input. This is how it will read the input:

The trophy doesn’t fit into the brown suitcase because the trophy is too large. What does the trophy refer to?
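A small sketch of why that happens, assuming a resolver that blindly substitutes every whole-word pronoun in the input. The function names are made up for illustration, and the second variant (skipping a pronoun wrapped in double quotes) is just one possible workaround, not something the contest prescribes.

```python
import re


def substitute_pronoun(text, pronoun, referent):
    """Replace every whole-word occurrence of the pronoun, quoted or not."""
    pattern = re.compile(rf"\b{re.escape(pronoun)}\b", flags=re.IGNORECASE)
    return pattern.sub(referent, text)


def substitute_outside_quotes(text, pronoun, referent):
    """Same, but leave a pronoun wrapped in double quotes alone (it is being mentioned, not used)."""
    pattern = re.compile(rf'(?<!")\b{re.escape(pronoun)}\b(?!")', flags=re.IGNORECASE)
    return pattern.sub(referent, text)


prompt = ("The trophy doesn't fit into the brown suitcase because it is too large. "
          'What does "it" refer to?')

naive = substitute_pronoun(prompt, "it", "the trophy")
careful = substitute_outside_quotes(prompt, "it", "the trophy")
print(naive)
print(careful)
```

The first call also rewrites the pronoun inside the question, while the second keeps the question intact so it can still be answered.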

I recommend reading the paper if you haven’t yet. It’s a good read.

 

 
  [ # 35 ]

New details
Deadline moved to January 2016, input in XML, restricted internet access allowed, 2 rounds of 60 questions, 90% score to win the grand prize.
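For a sense of scale, here is the arithmetic behind that threshold as a tiny sketch. It assumes the 90% applies to a single round of 60 questions, which the announcement does not spell out.

```python
QUESTIONS_PER_ROUND = 60  # from the announced format
WIN_PERCENT = 90          # score needed for the grand prize


def min_correct(questions, percent):
    """Smallest number of correct answers that reaches the given percentage (integer ceiling)."""
    return -(-questions * percent // 100)


needed = min_correct(QUESTIONS_PER_ROUND, WIN_PERCENT)
allowed_errors = QUESTIONS_PER_ROUND - needed
print(needed, "correct needed,", allowed_errors, "mistakes allowed")
```

Under that assumption a program can afford only a handful of wrong answers per round.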

There is one point that makes me reconsider entering. I wouldn’t mind sharing my Winograd methods, but I’m not planning to lay my AI’s internals bare so that a 3000-employee commercial company can reproduce it.

This competition is meant to advance science. A prerequisite for receiving a prize is demonstrating, through sharing code, publishing reproducible algorithms, etc. Further details will be made available.

 

 
  [ # 36 ]

Does anyone have news about this contest?

The deadline is in one month, and there is no registration form at the address given in the rules:

8. Registration form: All entries should fill out the registration form which will be available at www.CommonsenseReasoning.org/winograd.html

I sent an email a long time ago and never received any response.

 

 

 
  [ # 37 ]

I just received word from Charles Ortiz that the website will be updated with some of the missing details in a couple of days.
Meanwhile David Bender has established the human benchmark for solving Winograd Schemas to be around 92% (I think. He throws a lot of numbers around). Although that is not my experience, the volunteers for this experiment had more incentive to think hard because they were paid for each correct answer.

I have made some progress, but it still looks like I can only solve 1/4th of the examples with certainty, while the rest depends on inferences on specific knowledge that’s not likely to be in my program’s database or even its vocabulary. I’ve heard of several universities working on Winograd Schemas but it is unclear whether they will pull through. Who’s still up for it here?

 

 
  [ # 38 ]

News flash: The contest is -again- postponed, this time to IJCAI-16 in New York. I am told this will also be officially announced, but I’ve been told things a few times too many now. The good news is I can relax for the holidays.

 

 
  [ # 39 ]

Now official:

The deadline for registration and submission of executable code is June 15, 2016. The competition itself will be held at IJCAI 2016, July 9-15, 2016, in New York City.

I was planning to reveal a few methods after the submission date, but since that’s been delayed, I instead wrote the more introductory parts down in an article with a few details that may interest you, such as the contest’s vulnerabilities to trickery and the state of the art scores to beat. Discussion is welcome if I overlooked something.

 

 
  [ # 40 ]

Just a heads up for anyone interested that it’s a little less than two months to the deadline.

I myself have been programming about 20 general axioms/inferences that cover half of the initial examples, and I’m going to leave it at that. I found reasoning a bit underused compared to the need for knowledge. Also remarkable is that the examples are, ironically, uncommon cases, and that one might be better served by focusing on more common cases of ambiguity in practice. I’ll blog some of my methods after the submission date, if they finally go through with it.

Who else is still game?

 

 
  [ # 41 ]

Slight adjustment to the deadline of submission. Same date as the Loebner Prize deadline now.

The deadline for registration and submission of executable code is July 1, 2016.

 

 
  [ # 42 ]

http://www.cs.nyu.edu/faculty/davise/papers/WSCExample.xml
They’ve changed the XML input format since the last time I checked, without notice. Gonna have to redo the interface. Most important to mention is that they’ve replaced the usual question at the end of each Winograd Schema with half a statement.
Their example output has also changed but not their description of it on the site, so I’m just going to do whatever.

 

 
  [ # 43 ]

The organisation is a bit messy, so in case you want to participate and haven’t been in touch with them yet, I advise you to email them in reply to the mail I got:

You have indicated an interest in participating in the Winograd Schema Challenge to be held at IJCAI 2016 this summer.  I am contacting you to ask a few questions:

1.  Will you indeed be participating?
2.  Does your software require use of the internet during processing?
3.  Does your software require the use of any special hardware that you would need to bring to IJCAI to run the tests?    If not, can you provide us with an executable ahead of time so that we can run on at least parts of the test beforehand?  What would be your preferred way to get us an executable?  (Note that the tests would probably be run on standard laptops: Macs or PCs).    We have the room for the challenge only for the morning of the 12th and would probably not have time to test everyone during that period.

charles.ortiz -insert symbol here- nuance.com

 

 
  [ # 44 ]

I hope none of the entrants need the internet to work?! Otherwise, I’ll enter myself and have an IM client at the other end with me typing away.

 

 
  [ # 45 ]

A number of universities do make use of Google and WordNet for this. The organisation said they’re limiting the sites you can access, and that they’ll be closely monitoring which sites are being accessed. That strikes me as possible, given some know-how and effort.

 
