Let’s see if I can provide some answers for you:
1.) The “best” AIML set for you is rather subjective. The most recent AIML sets that I know of are the ones at the Google Code Repository. They can be found at http://code.google.com/p/aiml-en-us-foundation-alice/. The AIML set that I use for my bot Morti is based on the Annotated ALICE AIML (AAA) set that can be found at http://www.alicebot.org/aiml/aaa/. That should give you plenty of options for choosing the best set for your purposes.
2.) Unlike Program O or Pandorabots, Program E has no “training” interface, so the short answer is “you can’t”.
3.) I’m pretty sure that Program E isn’t designed to use UTF-8 character encoding. That’s not to say that you can’t, mind you. You could always try it and see if it “breaks” the script. I’ve never tried, to be honest, so I don’t really know one way or the other.
4.) Again, “better” is somewhat subjective here. Program O has fewer bugs and more features, such as a well-designed Admin section where you can easily add AIML files, train your bot, alter your bot’s personality variables (e.g. Favorite (whatever)) and a host of other things. I’m part of the dev team for Program O, and I’m the primary guy who gives support over at the Program O Forums, so if you decide to use it instead of Program E, you’ll be hearing from me a lot. As to Program O being usable with apps like IMified, the short answer is “not out of the box, no”. That being said, I can probably help you find a way to make it work. Morti uses a custom interface that I wrote for Program O, built on jQuery AJAX calls that are similar to what you would need, so it is possible to do. It’s just not easy.
5.) If there are other AIML interpreters written in PHP, I haven’t heard of them. I’ve never heard of Cl0ne, either, so I can’t really answer that question fully. I’m also not the world’s greatest Pandorabots fan, but for a different reason: I don’t have enough control over my bot there. Maybe I’m just a control freak, but to me, Pandorabots is just too limited for my tastes.
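On point 3, if you do decide to try UTF-8, it helps to know what “broken” tends to look like. The little Python sketch below is not Program E code, and I can’t say this is how Program E would actually fail. It only simulates the classic symptom you get when UTF-8 bytes are read back as Latin-1 by a script that isn’t encoding-aware. The function names are made up for illustration.

```python
def simulate_mojibake(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as Latin-1.

    This mimics what a script that is not UTF-8-aware might do to input.
    It is only a simulation, not a claim about Program E's behavior.
    """
    return text.encode("utf-8").decode("latin-1")


def survived_round_trip(sent: str, received: str) -> bool:
    """Return True if the text came back exactly as it was sent."""
    return sent == received


if __name__ == "__main__":
    sample = "café"
    mangled = simulate_mojibake(sample)
    print(mangled)                               # cafÃ©
    print(survived_round_trip(sample, mangled))  # False
    # Plain ASCII is untouched, so English-only tests will not reveal a problem.
    print(simulate_mojibake("hello"))            # hello
```

So if you feed the bot a few accented words and the replies come back with pairs of odd characters where one accented letter should be, the encoding is being mishandled somewhere along the way.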
Don’t worry about asking a lot of questions all at once. Well, at least don’t worry about it if the number of questions at one time is less than twenty.
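Going back to point 4 for a moment, here is a rough idea of the shape a bridge between a messaging service and a bot script could take. This is a minimal Python sketch, not Morti’s actual interface and not Program O’s real API: the endpoint URL, the "say" and "convo_id" parameter names, and the "reply" field in the JSON response are all hypothetical placeholders that you would swap for whatever your bot actually expects and returns.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; replace with wherever your bot script actually lives.
BOT_URL = "http://example.com/bot/talk.php"


def build_request_url(message: str, convo_id: str, base_url: str = BOT_URL) -> str:
    """Build a GET URL carrying the user's message and a conversation id.

    The parameter names "say" and "convo_id" are assumptions for this sketch.
    """
    query = urlencode({"say": message, "convo_id": convo_id})
    return f"{base_url}?{query}"


def extract_reply(body: str) -> str:
    """Pull the bot's answer out of a JSON response body.

    The "reply" field name is an assumption; returns "" if it is missing.
    """
    data = json.loads(body)
    return data.get("reply", "")


def relay(message: str, convo_id: str) -> str:
    """Send one message to the bot over HTTP and return its reply text."""
    with urlopen(build_request_url(message, convo_id), timeout=10) as resp:
        return extract_reply(resp.read().decode("utf-8"))
```

The messaging side would call something like relay("hello there", "user-123") for each incoming message and send the returned text back to the user. Keeping the URL building and the reply parsing in separate functions means you can check both without a live bot.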