Please take a look at https://github.com/Program-O/Program-O/blob/master/chatbot/core/aiml/find_aiml.php for the functions written to perform the search and score the potential matches. That should paint a more complete picture of how it’s done than I could by explaining it in detail. Still, here’s a brief overview.
The entire process begins within the function get_aiml_to_parse() (line 843). This function is more or less the heart of the script’s task of finding the correct AIML category to parse. It starts off with getting the current topic, seeing if there are any user defined AIML categories that could be used, searching the DB for potential matches, weeding out any irrelevant matches, scoring the remaining entries, then selecting the best of what’s left.
The function find_aiml_matches() (line 886) builds a comprehensive SQL query to search the database and create an array of potential matches. It first “collects” the current chatbot’s ID, the current topic (if set), the last response from the chatbot (if set), and the user’s current input, then crafts a SQL query to return the smallest number of potential matches that would result in a guaranteed match, based on the rules in the AIML 1.0 specification. This function returns a numerically indexed array of “synopses” of AIML categories (though not the categories themselves, in order to reduce memory use and maximize performance). This array is then passed down the line for processing.
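To give a rough feel for that kind of search, here is a small Python sketch. It is not the PHP from find_aiml.php, and the table name, column names, and WHERE clause are all assumptions made for illustration; the real query is considerably more elaborate. The idea is simply to fetch lightweight rows that either equal the input exactly or contain a wildcard, limited to the current bot and topic.

```python
import sqlite3

def find_candidates(conn, bot_id, user_input, topic=""):
    """Return lightweight 'synopsis' rows that might match the input.

    The schema (table aiml with bot_id, pattern, thatpattern, topic)
    is hypothetical and exists only for this sketch.
    """
    # Normalize the input: upper-case, single spaces between words.
    text = " ".join(user_input.upper().split())
    sql = """
        SELECT id, pattern, thatpattern, topic
        FROM aiml
        WHERE bot_id = ?
          AND (pattern = ?
               OR instr(pattern, '*') > 0
               OR instr(pattern, '_') > 0)
          AND (topic = '' OR topic = ?)
        ORDER BY id
    """
    rows = conn.execute(sql, (bot_id, text, topic)).fetchall()
    # Only the fields needed for filtering and scoring are returned,
    # not the full category (template and so on).
    keys = ("id", "pattern", "thatpattern", "topic")
    return [dict(zip(keys, row)) for row in rows]
```

Wildcard rows are fetched broadly here on purpose; narrowing them down is left to the next step.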
The next step occurs in the function unset_all_bad_pattern_matches() (line 120), which searches through the array for entries that are irrelevant to the current topic, last bot response, current chatbot, and user’s input. Each irrelevant entry is then removed from the array, leaving only those entries that stand a good chance of being a “best match”. The resulting array is then sent along for scoring.
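As an illustration of this kind of weeding out, here is a minimal Python sketch (again, not the actual PHP). It assumes each candidate is a dict with a "pattern" key and optional "thatpattern" and "topic" keys, which are names I made up for the example, and it treats the AIML wildcards * and _ as "one or more words". Punctuation stripping and wildcards inside topics are ignored to keep it short.

```python
import re

def pattern_to_regex(pattern):
    """Turn an AIML-style pattern into a compiled regular expression."""
    parts = []
    for word in pattern.upper().split():
        if word in ("*", "_"):
            # A wildcard stands for one or more whitespace-separated words.
            parts.append(r"\S+(?: \S+)*")
        else:
            parts.append(re.escape(word))
    return re.compile("^" + " ".join(parts) + "$")

def normalize(text):
    """Upper-case the text and collapse runs of whitespace."""
    return " ".join(text.upper().split())

def drop_bad_matches(candidates, user_input, that="", topic=""):
    """Keep only candidates consistent with the input, last reply, and topic."""
    text = normalize(user_input)
    last_reply = normalize(that)
    kept = []
    for cand in candidates:
        if not pattern_to_regex(cand["pattern"]).match(text):
            continue  # the pattern cannot match the user's input
        if cand.get("topic") and cand["topic"].upper() != topic.upper():
            continue  # the category belongs to a different topic
        that_pattern = cand.get("thatpattern")
        if that_pattern and not pattern_to_regex(that_pattern).match(last_reply):
            continue  # the bot's last response does not fit
        kept.append(cand)
    return kept
```

For example, with the input "hello there bot" and no topic set, a candidate with the pattern "HELLO *" survives, while "GOODBYE" and any category tied to another topic are removed.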
The scoring function, score_matches() (line 315), then takes the array of potential matches and starts adding point values to each element, based on a somewhat complex set of rules. I won’t go into how this works, since it’s a very detailed explanation, and I simply don’t have the time to go into it right now. Suffice it to say that several criteria are checked, and each element of the array of potential matches is awarded points depending on how many criteria are met. The array is then passed back for sorting and selection of the highest scoring element.
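Just to show the general shape of point-based scoring, here is a toy Python sketch. The criteria and point values below are invented for the example and are not the rules used by score_matches(); both wildcards are also treated alike, which the real code may not do. Candidates are assumed to be dicts with "pattern", "thatpattern", and "topic" keys, as hypothetical stand-ins for the array elements.

```python
# Made-up weights, chosen only so that more specific categories win.
POINTS_DIRECT_WORD = 5   # a literal word in the pattern
POINTS_WILDCARD = 1      # a * or _ in the pattern
POINTS_TOPIC = 50        # the category's topic equals the current topic
POINTS_THAT = 20         # the category is tied to the bot's last response

def score_candidate(cand, topic=""):
    """Add up points for every criterion the candidate meets."""
    score = 0
    for word in cand["pattern"].upper().split():
        if word in ("*", "_"):
            score += POINTS_WILDCARD
        else:
            score += POINTS_DIRECT_WORD
    if cand.get("topic") and cand["topic"].upper() == topic.upper():
        score += POINTS_TOPIC
    if cand.get("thatpattern"):
        score += POINTS_THAT
    return score

def pick_best(candidates, topic=""):
    """Return the highest-scoring candidate, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=lambda cand: score_candidate(cand, topic))
```

With these weights, "HELLO THERE *" beats "HELLO *", which in turn beats a bare "*", so the catch-all category only wins when nothing more specific is left.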
That’s about it. After the “winning” AIML category is selected, it’s parsed to generate the chatbot’s output. I hope this serves to clarify things. If it doesn’t, then I humbly suggest that you study the code for more insights.