Christmas Quiz aiml
 
 

With some help from Dave, here is my Christmas quiz if anybody would like a copy to play with.  Feel free to use.

I did try attaching it to this message, but got an error message saying it could not be written to disk (maybe it is full?). The file can be downloaded from: http://www.nicobloc.com/ajhtemp/xmasquiz.aiml

Enjoy.

 

 

 
  [ # 1 ]

Hi, Alan.

I finally got a chance to download the AIML file, and tried to upload it to a chatbot that I created just to test your smoking quiz chatbot, but I got the following errors:

String could not be parsed as XML at line 165
There was a problem adding file xmasquiz.aiml to the database. Please refer to the message below to correct the problem and try again.
String could not be parsed as XML
Fatal Error 9: Input is not proper UTF-8, indicate encoding ! Bytes: 0x96 0x20 0x74 0x68 on line 990
Fatal Error 4: Start tag expected, ‘<’ not found on line 2

The offending line is here; the source of the error is the dash after “Islands”:

OK. <srai>CORRECTANSWER2</srai> There are 2 Christmas Islands – the one in the Pacific Ocean (Kiritimati) was discovered by Captain James Cook in 1777, the one in the Indian Ocean by Captain William Mynors in 1643. Both on Christmas Day.

It seems odd that a simple hyphen should “break” the AIML code, but not when you consider that that character isn’t a ‘-’ (which has an ASCII value of 45, or 0x2D in hex), but a different character completely. Let me see if I can explain a bit better.

Program O was designed to handle not just English but other languages as well, such as Thai, Russian and Chinese; as such, it uses UTF8 for its character set, along with multi-byte string (mbstring) processing. That particular character (’–’) is stored in the file as a single byte with a value of 150 (0x96 in hexadecimal, the first reported ‘illegal’ byte in the error above). That is how the Windows-1252 code page encodes an en dash, but a lone 0x96 is not valid UTF8 (bytes from 0x80 to 0xBF are only legal as continuation bytes inside a multi-byte sequence), so Program O’s XML validator reported an error over it.
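If it helps to see this in action, here is a small Python sketch (Python is just my choice for illustration; Program O itself is not involved here). It takes the four bytes from the error message and tries to decode them two ways. The idea that the file was saved as Windows-1252 is an assumption based on the 0x96 byte, not something I can confirm.

```python
# The four bytes reported in the error message: 0x96, then " th".
raw = bytes([0x96, 0x20, 0x74, 0x68])


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# A lone 0x96 cannot start a UTF-8 sequence, so this decode fails.
as_utf8 = try_decode(raw, "utf-8")

# Assumption: the file was saved as Windows-1252, where 0x96 is an en dash.
as_cp1252 = try_decode(raw, "cp1252")

# The same en dash (U+2013) written as proper UTF-8 takes three bytes.
proper_utf8 = "\u2013".encode("utf-8")

print(as_utf8, repr(as_cp1252), proper_utf8)
```

So the dash itself is a perfectly good character; it is only the single-byte way it was stored that a UTF8 parser cannot accept.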

The “fix” is simple, in that the offending character just needs to be replaced with a simple dash character (UTF8 value 0x2D, or ASCII 45 - the key just to the right of the 0 key on an English US keyboard) to correct the problem. The characters look identical to us humans, but to computers they’re completely different.
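For anyone who would rather script the replacement than hunt for the character by hand, here is a rough Python sketch. The function name `replace_stray_dash_bytes` is made up for this example, and it assumes the stray bytes are the Windows-1252 en dash (0x96) and em dash (0x97). It only touches bytes that are not valid UTF8, so any properly encoded characters are left alone.

```python
def replace_stray_dash_bytes(data: bytes) -> bytes:
    """Replace stray 0x96/0x97 bytes (not valid UTF-8 on their own) with '-'."""
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN instead of raising.
    text = data.decode("utf-8", errors="surrogateescape")
    text = text.replace("\udc96", "-").replace("\udc97", "-")
    # Any other undecodable bytes are written back unchanged.
    return text.encode("utf-8", errors="surrogateescape")


sample = b"There are 2 Christmas Islands \x96 the one in the Pacific Ocean"
fixed = replace_stray_dash_bytes(sample)

# The repaired bytes now decode cleanly as UTF-8.
print(fixed.decode("utf-8"))
```

To use it on a real file, read the file in binary mode, pass the bytes through the function, and write the result back out, ideally to a new file so the original is kept.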

All of this probably sounds like complete gibberish, not to mention being a bit overly technical and potentially confusing. It’s not my intention to sound condescending, or to “talk down” to anyone here. To be completely honest, I don’t fully comprehend some of the nuances of character sets and such, and I spent well over a year trying to work all this stuff out in order to get Program O to just support Spanish! Luckily, getting the project to work in Spanish was the hard part. From there, getting it to work with Thai and Chinese was easy.

Ok, off topic a bit; sorry. Long story short, replacing that one character in the file fixed the validation problem and allowed me to upload the file to the new chatbot, and I’m testing it now. I’ll report back later today with my results.

 

 
  [ # 2 ]

Thanks Dave. I’m not sure how that happened, as I only typed a simple dash on my keyboard, and I’ve been using the Pandorabots server to test it for quite a few hours without any issues. It may have got in there from where I sourced the Q & As.

I’ve changed the offending dash to a comma to avoid the problem. I have also added a few more alternative ‘right’ answers, such as accepting ‘Cranberry’ as well as ‘Cranberry Sauce’ in answer to one of the questions.

 

 
  [ # 3 ]

Which text editor are you using when you work with your AIML files? I ask because I’ve seen similar issues when people use Microsoft Word (or similar, “full featured” word processor apps), which “auto-magically” convert single quotation marks to their “left” and “right” versions. This almost sounds like a similar issue, though I’ve never seen it for a hyphen before. It could also be a “copy/paste” problem, since the two characters in question look identical, even though they aren’t.
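One rough way to catch pasted-in “smart” quotes and dashes before uploading is to list every byte above 0x7F in the file. This is only a sketch in Python, and `find_non_ascii` is a name invented for the example. It suits an English-only AIML file; a file with genuine UTF8 text (Thai, Russian and so on) would be flagged all over, and quite legitimately.

```python
def find_non_ascii(data: bytes):
    """Return (line_number, column, byte_value) for every byte above 0x7F."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


# Hypothetical sample: 0x92 is a Windows-1252 right single quote, 0x96 an en dash.
sample = (
    b"<category>\n"
    b"<pattern>HI</pattern>\n"
    b"<template>It\x92s here \x96 ok</template>\n"
    b"</category>\n"
)

hits = find_non_ascii(sample)
for line_no, col, byte in hits:
    print(f"line {line_no}, column {col}: byte 0x{byte:02X}")
```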

I’ve written an AIML validator at http://www.geekcavecreations.com/pgo/validateAIML.php that performs the same validation as the Program O upload script, but it provides a bit more detail about any errors found. I have plans to expand the validator at some point, to provide recommendations for correcting errors of this sort, but finding the time to do so (especially at this time of year) is a bit tricky.

Anyway, please feel free to use the validator, and if it throws an error that you need help unraveling, just give me a holler, and I’ll do what I can to assist.
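For a quick local check before uploading, something like the Python sketch below will at least tell you whether a file parses as XML. To be clear, this is not how the validator above or Program O works internally; it simply uses Python’s standard `xml.etree.ElementTree` parser, and `check_xml` is a name made up for the example.

```python
import xml.etree.ElementTree as ET


def check_xml(data: bytes):
    """Return None if the bytes parse as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(data)
        return None
    except ET.ParseError as exc:
        # The message includes the line and column where parsing stopped.
        return str(exc)


good = b"<aiml><category><pattern>HI</pattern><template>Hello - there</template></category></aiml>"
bad = b"<aiml><category><pattern>HI</pattern><template>Hello \x96 there</template></category></aiml>"

print(check_xml(good))
print(check_xml(bad))
```

With no encoding declared, the parser assumes UTF8, so the stray 0x96 byte in the second sample is rejected.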

 

 
  [ # 4 ]

For the Christmas Quiz, I did use Microsoft Excel to collect and sort the questions and answers, and then cut and pasted from there, so that could be where the offending dash came from (I caught all the odd quote marks). Generally I use editplus3 as my editor for all HTML and AIML editing, since it nicely highlights tags in different colours and the search tools are very useful when trying to find words or phrases across all AIML files at once. Thanks for sharing the validator.

 

 
  [ # 5 ]

My pleasure. I love helping out whenever I can.

 

 