|
Posted: Oct 12, 2013 |
[ # 16 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
It all comes down _to_ham
Steve Worswick - Oct 12, 2013: What do you do with the results once parsed?
Mostly, the same things you do with whatever a “*” wildcard captures in a pattern.
<pattern>MY NAME IS *</pattern>
A tagger would mark a word “John” here as being the object of the sentence just as a pattern matcher would. But for “my name is awesome”, a tagger might recognise “awesome” as being an adjective, which prevents the program from going “Hello awesome!”
This is a frail example, and you can certainly add more patterns specifically to deal with “* is awesome.” or “* is nice.” or “* is really really difficult to spell.”, but that’s pretty much the point: Grammar taggers cover equal ground without having to spell out as many patterns as there are ways to ask a question. Personally I find it very convenient to have the subject-verb-object presented to me consistently no matter how the sentence is formulated.
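To make the “Hello awesome!” case concrete, here is a minimal Python sketch. It assumes a tiny hand-written adjective list in place of a real tagger; `ADJECTIVES`, `tag` and `reply` are made-up names for illustration, not part of AIML or of any tagger discussed here.

```python
# Toy lexicon standing in for a real POS tagger (hypothetical data).
ADJECTIVES = {"awesome", "nice", "happy", "difficult"}

def tag(word):
    """Return a coarse tag: 'ADJ' for known adjectives, 'NOUN' otherwise."""
    return "ADJ" if word.lower() in ADJECTIVES else "NOUN"

def reply(sentence):
    """Greet the user by name only if the captured text is not an adjective."""
    prefix = "my name is "
    text = sentence.strip().rstrip(".!?")
    if not text.lower().startswith(prefix):
        return None                      # some other pattern would handle this
    captured = text[len(prefix):]        # what the "*" wildcard would capture
    words = captured.split()
    if not words or any(tag(w) == "ADJ" for w in words):
        return "That is quite a name."   # avoid greeting an adjective
    return f"Hello {captured}!"

print(reply("My name is John"))
print(reply("my name is awesome"))
```

The single `tag` check replaces the extra patterns you would otherwise write for “* is awesome”, “* is nice” and so on, which is the trade-off I mean.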
I agree with Bruce, it’s not as useful a method for chatbots as it is for e.g. AI created to extract knowledge from large amounts of text. The military loves this sort of thing though.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 17 ]
|
|
Administrator
Total posts: 2048
Joined: Jun 25, 2010
|
I assume most people get round the above problem by having a list of names and checking them, so “I am Steve” will call a different template to “I am happy”. I do see what you mean though and am not being deliberately obtuse but just struggle to see any practical use for POS tagging that cannot easily be worked around.
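As a rough Python sketch of the name-list workaround (the names in `KNOWN_NAMES` and the replies are invented for illustration, and a real bot would do this with separate categories rather than a function):

```python
# Hypothetical name list; a real bot would load a much larger one.
KNOWN_NAMES = {"steve", "bruce", "john"}

def respond_i_am(sentence):
    """Route 'I am X' to a name reply or a mood reply depending on the list."""
    words = sentence.strip().rstrip(".!?").split()
    if len(words) < 3 or [w.lower() for w in words[:2]] != ["i", "am"]:
        return None                       # not an "I am ..." input
    rest = " ".join(words[2:])
    if rest.lower() in KNOWN_NAMES:
        return f"Nice to meet you, {rest}."
    return f"Why are you {rest}?"

print(respond_i_am("I am Steve"))
print(respond_i_am("I am happy"))
```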
I asked a few months ago if anyone could show me ANY bot that didn’t use pattern matching (and was half decent). I got no replies.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 18 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
Ah, alright. I’m glad you understand. I don’t think that there is anything that can’t be worked around; you’ve shown on more than one occasion that AIML can pretty much be used to the extent of a programming language. I think both systems are just easier to implement for one task than another.
Wikipedia says the best application of grammar parsers is detecting ambiguous meaning, so I offer you the phrase “My program’s program programs programs.”
Come to test it, apparently I overlooked a grammar rule for the plural possessive “My programs’ program”. So alas, I’ve no half decent bot to show either, but this talk has at least been of use.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 19 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
First off, chatbots don’t have to deal with your phrase example. Second, the more difficult one is:
Fish fish fish fish fish fish fish.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 20 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
And the question of a “workaround” involves how many rules you have to write to do the workaround. Yes, one can work around, but sometimes the labor is a lot. That’s why it’s BRILLIANT that AIML is finally getting 0-width wildcards and simple table mapping.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 21 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
I’m not sure what to make of “fish fish fish fish fish fish fish” myself, but my program understood “fish fish fish” as a command to go fish for fish-fish, which he identified as a compound word for a particular kind of fish... I’m not sure if he’s wrong.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 22 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
(SUBJECT) fish [that] fish fish (VERB) fish (OBJECT) fish [that] fish fish
omitted clause starters, and fish that eat fish-eating fish. Clear when written as
Perch sharks fish fish guppies sharks eat
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 23 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
Shouldn’t that be “fish fishing fish - fish - fish fish fish”? Either that or this form wasn’t taught in my English classes.
No wait, I get it. Challenge accepted.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 24 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
The omitted “that” means one can say “fish fish”, since it is equivalent to “that fish for food” or whatever.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 25 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
So it’s the kind of fish that fish are fishing upon, who fish for the kind of fish that other fish… are fishing… upon. Lesson learned: A great NLP program would throw up its hands and ask for clarification. Mine would go fishing, and that’s not a bad idea either.
|
|
|
|
|
Posted: Oct 12, 2013 |
[ # 26 ]
|
|
Guru
Total posts: 1081
Joined: Dec 17, 2010
|
I believe the classic line is:
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
http://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo_buffalo_buffalo_Buffalo_buffalo
I have yet to see a decent conversational chatbot/AI that uses POS tagging as its foundation. Although POS tagging can help in disambiguation, it is a very limited tool in conversational AI.
|
|
|
|
|
Posted: Oct 13, 2013 |
[ # 27 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
Agreed. Also, I keep forgetting that many taggers rely heavily on statistical probability, which I now see Andres might be referring to as a hell of resources (the data files that record which word meanings coincide with which other words must be huge). My earlier comments about efficiency and accuracy don’t apply there; I don’t think statistics add that much accuracy.
|
|
|
|
|
Posted: Oct 13, 2013 |
[ # 28 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
If you are on a server, the memory needed by a statistical POS tagger doesn’t matter. Their BIG advantage is the ease of getting one in another language after training. But they are weak in that they degrade with changes in the kind of text they deal with, and chat is usually NOT their native domain. The best of the lot, at maybe 97.5%, are not as good as the best rule-based one, at 99.something% for English.
|
|
|
|
|
Posted: Oct 15, 2013 |
[ # 29 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
Meanwhile, Steve, having said that patterns don’t desperately need POS tagging, ChatScript actually has a use for it (which is why it’s built in). When the user says something, we hunt for a rule to match it. When we fail, the bot has control of the conversation and can issue a “gambit”, a voluntary message on some subject. Topics have keywords associated with them, and ChatScript tries to issue the gambit from the most relevant topic. So if the user says “I had a home run yesterday”, the bot could steer to the baseball topic and offer “I was never good at baseball.” as a gambit. The keyword list can designate POS for words, so topic: ~baseball (bat ball glove umpire base strike run~n) only reacts to run used as a noun, and not to the user saying “I like to run on weekends”. Similarly, the system can restrict patterns on rules by POS, but that is generally less important: a rule like (I * run) in the baseball topic wouldn’t get matched if you are outside the topic and the topic keywords have run~n, because no rules in the topic will be considered.
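For anyone who wants to see the keyword idea in miniature, here is a rough Python sketch. It is not how ChatScript is implemented: `toy_tag` is a deliberately crude stand-in for a real POS tagger (it guesses “noun” when a determiner appears within the two preceding words), and all function names are made up for illustration. Only the keyword list and the `run~n` notation come from the topic line above.

```python
DETERMINERS = {"a", "an", "the"}

def toy_tag(tokens, i):
    """Crude guess: 'n' if a determiner is among the two preceding tokens, else 'v'."""
    return "n" if DETERMINERS & set(tokens[max(0, i - 2):i]) else "v"

def parse_keywords(spec):
    """Split 'run~n' into ('run', 'n'); plain words become (word, None)."""
    pairs = []
    for item in spec.split():
        word, _, pos = item.partition("~")
        pairs.append((word, pos or None))
    return pairs

def topic_matches(keywords, sentence):
    """True if any keyword occurs in the sentence with an acceptable POS."""
    tokens = sentence.lower().rstrip(".!?").split()
    for i, token in enumerate(tokens):
        for word, pos in keywords:
            if token == word and (pos is None or toy_tag(tokens, i) == pos):
                return True
    return False

BASEBALL = parse_keywords("bat ball glove umpire base strike run~n")

print(topic_matches(BASEBALL, "I had a home run yesterday"))
print(topic_matches(BASEBALL, "I like to run on weekends"))
```

With a plain `run` keyword both sentences would trigger the baseball topic; the `~n` restriction is what keeps the jogging sentence out.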
|
|
|
|