Hi,
I have a problem with two rules: the first pattern should catch the first word in the input, and the second should catch the second or a later word.
The problem arises when the first and second words carry the same marks, so the only way to distinguish them is that the first word was already matched (caught / extracted, in my terminology) by the first rule.
Example:
u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
$main_subject = _0
# the main subject is processed and should be ignored from now on
u: EXTRACT_OBJECT (_~noun) # catch any noun AFTER the main subject
$object = _0
For input like “John loves Mary”,
it creates the variables:
$object = John
$main_subject = John
This happens because the second rule also matches the first word; I would expect $object = Mary.
I’m not sure whether my approach is the best way to solve the main issue, or whether the rules should be reorganized in some other way.
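To make the second option concrete, here is a minimal, untested sketch of the kind of reorganization I have in mind: a single rule whose pattern memorizes the main subject and then, after a * wildcard, a later noun. The rule name EXTRACT_SUBJECT_AND_OBJECT is made up for illustration, and I am assuming the two memorized words land in _0 and _1 in pattern order.

```
# hypothetical combined rule; assumes _0 and _1 are filled in pattern order
u: EXTRACT_SUBJECT_AND_OBJECT (_~mainsubject * _~noun)
    $main_subject = _0   # word matched as ~mainsubject
    $object = _1         # a ~noun found somewhere after the main subject
```

For “John loves Mary” I would expect this to give $main_subject = John and $object = Mary, but it ties the two extractions into one rule, which is why I am asking whether there is a cleaner way.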
Secondly, I have some questions about POS tags and other concept marks:
After a rule has been applied and has matched some word in the sentence, what’s the best way to get the marks relevant to my needs into variables?
For example:
u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
$main_subject = _0
$main_subject_is_noun_proper_singular = ???
$main_subject_is_geographical_areas = ???
When I run :prepare on the sentence, I see that all the relevant information is already there, and I would rather not ^query for membership in concepts to get the data I need.
Lastly, out of curiosity: I see that the POS tagger is aware of context.
For example:
“I’ll email you the details” (email=verb)
“Send me an email” (email=noun)
Is the parser rule-based or statistics-based, and how much of the context does it take into consideration?
Thanks for any response,
Sam