I was asked this question by email because the author is still awaiting membership confirmation:
“How easy (or how hard) to port a new language to ChatScript? specially if this language is right-to-left language? (Arabic, Hebrew, Persian, Urdu, ...)
Do I need wordnet or deep NLP programming or what?
Do you suggest any resources illustrating adding new languages to ChatScript ?”
There are various aspects to language support in ChatScript. The system supports UTF8, so you can input and match patterns in any language.
The system comes with a set of predefined concepts. For whichever of those you want to use, you would presumably write your own equivalents in the other language.
Most users don’t need the existing POS-tagging and parsing abilities. If you want them for another language, the easiest way is to query a server that provides them for that language and read the answers back into CS.
The dictionary has a variety of uses. Spell checking is one. Sometimes it is used to enumerate a list of words to build a concept, but one can also build a concept manually.
The biggest issue for foreign languages is getting the canonical form (stemming or whatever), since one usually wants to match the canonical form of a word rather than the original form. There is a file (canonical.txt) that can explicitly list words and their canonical forms, so you could simply list all the words you want to use, with their canonical forms, in there. Otherwise you would need to write code for your language that corresponds to the system’s stemmers.
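As a rough illustration of the "just list the words" route, here is a minimal Python sketch that turns a word list into lines you could paste into canonical.txt. The one-pair-per-line "word canonical" layout, the Spanish sample words, and the function names are all assumptions for illustration; check the canonical.txt shipped with your CS release for the exact format before using anything like this.

```python
import sys

# Hypothetical inflected-form -> canonical-form pairs (Spanish, illustration only).
# A real list would come from your own lexicon for the target language.
PAIRS = {
    "gatos": "gato",
    "corriendo": "correr",
    "libros": "libro",
}


def format_canonical_lines(pairs):
    """Return one 'word canonical' line per entry, sorted by word.

    The 'word canonical' layout is an assumption, not a confirmed file format.
    """
    return [f"{word} {canon}" for word, canon in sorted(pairs.items())]


def write_canonical(pairs, stream):
    """Write the formatted lines to any text stream (a file, stdout, ...)."""
    for line in format_canonical_lines(pairs):
        stream.write(line + "\n")


if __name__ == "__main__":
    write_canonical(PAIRS, sys.stdout)
```

Keeping the pairs in a plain dictionary means the same list can be regenerated whenever you add words, which is likely easier than hand-editing the file.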
As for right to left languages, I’m not an expert in that.
Output from CS will be whatever the output is, so it isn’t aware of right to left.
Patterns normally match left to right, but it is possible to make them match backwards. Then again, presumably if the input is right to left, you can just match it left to right as usual, matching the end of the pattern before the start. I don’t know.
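One thing that may help here: Unicode text (UTF-8 included) is stored in logical, i.e. reading, order, and the right-to-left layout is applied only when the text is displayed. The small Python sketch below shows this with a Hebrew string written as escapes. It says nothing about CS internals. The assumption, which I have not verified, is that CS tokenizes input in stored order, in which case ordinary patterns would see the first word read as the first token.

```python
def tokens_in_stored_order(text):
    """Split on whitespace; tokens come back in the order they are stored."""
    return text.split()


# "shalom olam" (hello world), written with escapes so an editor's
# right-to-left rendering cannot visually reorder the literal.
SHALOM = "\u05e9\u05dc\u05d5\u05dd"
OLAM = "\u05e2\u05d5\u05dc\u05dd"
sentence = SHALOM + " " + OLAM

words = tokens_in_stored_order(sentence)

if __name__ == "__main__":
    # The first word a reader reads is also the first token in memory,
    # even though a terminal may draw it on the right-hand side.
    print(words[0] == SHALOM)
    print(words[1] == OLAM)
```

If that assumption holds, a pattern written in the order the words are read should line up with the input without any backwards matching.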