Dear CR, Merlin, Jan and Erwin - thank you all for your kind replies.
To reply in turn:
@CR: my gold standard would be to have a pair of learning chatbots with rich language processing, and an uncanny resemblance to Person A and B when they ‘speak’. I suspect, though, that reaching that standard would take more time than I can afford at this point, so I’m trying to understand what I could hack up over a week of evenings or so on the basis of these conversation feeds.
I began to experiment with ChatScript, but had the sense that it required a lot of flow-control and structuring effort. Thus, I wanted to know whether there was a less labour-intensive approach to reading in chat text.
@Merlin: FakeKirk sounds perfect: thank you. I’ve looked through the download files at http://code.google.com/p/aiml-en-us-foundation-fakekirk/. My impression was that they can be divided into three categories:
1. content-independent files that would require minimal modification to reverse engineer (e.g. charname.aiml, dialog history.aiml)
2. marked-up content files that could be generated from the SMS feeds in a fairly mechanical way (e.g. kirk.aiml)
3. flow-control files that would require careful structuring (e.g. kirk-update.aiml, kirk-update1.aiml, update.aiml).
In your second item, what did you mean by “programmatic”? And by “the current format”, did you mean the format of the original text inputs, or the version of AIML used?
In conclusion, though, would you recommend AIML over ChatScript for this sort of task, if only because I could follow FakeKirk? (n.b. following the real Kirk always seemed a fairly perilous task; I’m a bit concerned that following a fake one may be just as dangerous.)
As to background, I have rusty C skills, and basic HTML. Thus, for a first project, I’ll be limited to fairly simple reverse engineering. A typical text input file would be a CSV file, with each row representing a text message, and cells for sender, receiver, date, time, and message contents.
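To make the "fairly mechanical" conversion concrete, here is a minimal sketch of how such a CSV might be turned into AIML categories. It rests on several assumptions that are not settled above: the columns appear in the order sender, receiver, date, time, message, with no header row; a message from Person A immediately followed by one from Person B is treated as a prompt/reply pair; and the function names (rows_to_pairs, normalise_pattern, pairs_to_aiml) and the sample rows are made up for illustration. It is written in Python rather than C purely to keep it short.

```python
import csv
import io
from xml.sax.saxutils import escape


def rows_to_pairs(rows, person_a, person_b):
    """Yield (prompt, reply) whenever person_a's message is directly followed by person_b's."""
    previous = None
    for row in rows:
        if len(row) != 5:  # assumed layout: sender, receiver, date, time, message
            continue
        sender, _receiver, _date, _time, message = row
        if previous is not None and previous[0] == person_a and sender == person_b:
            yield previous[1], message
        previous = (sender, message)


def normalise_pattern(text):
    """Upper-case the text and replace punctuation with spaces, as AIML patterns usually expect."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(cleaned.upper().split())


def pairs_to_aiml(pairs):
    """Render prompt/reply pairs as one AIML category each."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<aiml version="1.0.1">']
    for prompt, reply in pairs:
        pattern = normalise_pattern(prompt)
        if not pattern:  # skip prompts that were only punctuation
            continue
        lines.append("  <category>")
        lines.append(f"    <pattern>{escape(pattern)}</pattern>")
        lines.append(f"    <template>{escape(reply)}</template>")
        lines.append("  </category>")
    lines.append("</aiml>")
    return "\n".join(lines)


# Invented sample rows standing in for a real SMS export.
sample = io.StringIO(
    'A,B,2012-01-05,18:02,"Are you free tonight?"\n'
    'B,A,2012-01-05,18:05,"Yes, after 8 & not before"\n'
    'B,A,2012-01-05,18:06,"Bring food"\n'
)
aiml = pairs_to_aiml(rows_to_pairs(csv.reader(sample), "A", "B"))
print(aiml)
```

This only captures single-turn exchanges: B's second consecutive message is dropped, and a prompt that recurs in the feed would produce duplicate patterns that would need merging by hand (for instance into a `<random>` element).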
@Jan: I can run this under Linux, Windows or Android - my main criterion will be ease of implementation. Thus, I’d prefer to start with something as off the shelf as possible.
@Erwin: thank you - I’ll try to contribute once I feel that my signal-to-noise ratio has improved.
Thank you all again,
Colin