https://www.cognilytica.com/2018/07/31/voice-assistant-benchmark-1-0-july-2018-results
I thought this was fairly interesting because it poses many of the questions we’re familiar with from chatbot contests, and tests them on Alexa, Siri, Google Home and Cortana.
There is one thing they aren’t taking into account, though: nobody poses questions like Winograd schemas for any practical purpose. So it’s quite possible that the assistants can resolve pronouns but can’t answer a question that no customer is expected to ask.