Andres Hohendahl - Dec 1, 2012:
I am building a more sophisticated patterns matching, based on a combination of common EBNF and some special operators, which are capable of targeting sintagmátic [syntactic?] segments,
Interesting. My initial impression is that BNF/EBNF is poorly suited to parsing natural language. For *representing* natural language it would be a reasonable choice, but it seems to me that the goal of parsing is to determine the writer’s meaning as quickly as possible, and the essence of that task is identifying the subject, verb, and object(s). I suppose BNF/EBNF would be useful for structuring the many different possible parsings, though, say in a list that would then be analyzed for the correct interpretation. That’s a representation I would not have considered for that application, but it just might work.
http://en.wikipedia.org/wiki/Extended_Backus–Naur_Form
http://english.stackexchange.com/questions/32447/is-there-an-ebnf-that-covers-all-of-english
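To make the "list of possible parsings" idea concrete, here is a minimal sketch in Python. The toy grammar, the lexicon, and the function name `parses` are all my own assumptions for illustration, not anything from Andres's system. It writes a handful of BNF-style rules as a dictionary and enumerates every parse of a classically ambiguous sentence:

```python
from functools import lru_cache

# Hypothetical toy grammar: each rule rewrites a symbol as exactly two symbols.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}

# Hypothetical lexicon: which categories each word can belong to.
LEXICON = {
    "I": {"NP"},
    "saw": {"V"},
    "the": {"Det"},
    "man": {"N"},
    "telescope": {"N"},
    "with": {"P"},
}


def parses(words):
    """Return every parse tree of `words` as nested tuples."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def span(symbol, start, end):
        trees = []
        # A single word may be tagged directly with the symbol.
        if end - start == 1 and symbol in LEXICON.get(words[start], ()):
            trees.append((symbol, words[start]))
        # Otherwise try every rule and every split point inside the span.
        for left, right in GRAMMAR.get(symbol, ()):
            for mid in range(start + 1, end):
                for left_tree in span(left, start, mid):
                    for right_tree in span(right, mid, end):
                        trees.append((symbol, left_tree, right_tree))
        return tuple(trees)

    return list(span("S", 0, len(words)))


if __name__ == "__main__":
    for tree in parses("I saw the man with the telescope".split()):
        print(tree)
```

For this sentence the sketch finds two trees: one where "with the telescope" attaches to "the man" and one where it attaches to "saw". Choosing between them is the separate interpretation step I mentioned.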
Andres Hohendahl - Dec 1, 2012:
coupled with good spell corrector phonetically enhanced
I’m pretty sure I know what you’re doing there. I once had a job where I sometimes looked up toys in a database for customers, and I often experienced the frustration of trying to find a toy whose name a customer had given without knowing its spelling. One example was “high tech”: high tech? hi-tech? high-tek? hi-tek? Just one pronunciation generated at least 4 possibilities, and the database didn’t recognize any of the spellings I tried. I suggested to the company that they hire me to program their system to handle such queries, but evidently they decided I was better suited to doing such trivial work than to doing skilled programming work. :-(
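Here is roughly what I had in mind for that toy database, as a minimal Python sketch. The rewrite rules, the catalog entries, and the names `phonetic_key`, `build_index`, and `lookup` are my own illustrative assumptions. The rules are deliberately crude (treating every "ch" as "k" is wrong for many words), and this is not a claim about how Andres's phonetic corrector works:

```python
import re

# Hypothetical, deliberately crude sound-alike rewrites, applied in order.
RULES = [("gh", ""), ("ph", "f"), ("ck", "k"), ("ch", "k"), ("c", "k")]


def phonetic_key(text):
    """Reduce a name to a rough sound-alike key."""
    # Drop case, spaces, hyphens, digits, and punctuation.
    key = re.sub(r"[^a-z]", "", text.lower())
    for old, new in RULES:
        key = key.replace(old, new)
    # Collapse runs of the same letter ("teddy" -> "tedy").
    return re.sub(r"(.)\1+", r"\1", key)


def build_index(names):
    """Map each phonetic key to the catalog names that produce it."""
    index = {}
    for name in names:
        index.setdefault(phonetic_key(name), []).append(name)
    return index


# Hypothetical catalog entries, for illustration only.
CATALOG = ["Hi-Tech Robot", "Teddy Bear"]
INDEX = build_index(CATALOG)


def lookup(query):
    """Return catalog names that sound like the query."""
    return INDEX.get(phonetic_key(query), [])


if __name__ == "__main__":
    for spelling in ["high tech", "hi-tech", "high-tek", "hi-tek"]:
        print(spelling, "->", phonetic_key(spelling))
    print(lookup("high tek robot"))
```

All four spellings of "high tech" collapse to the same key, so a lookup by key finds the toy no matter which spelling the customer had in mind.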
Andres Hohendahl - Dec 1, 2012:
they generate an AST TREE
An Abstract Syntax Tree (AST) is another structure that strikes me as more suitable for computer languages than natural language.
http://en.wikipedia.org/wiki/Abstract_syntax_tree
“For the trees used in computer science engineering, see Abstract syntax tree.”
http://en.wikipedia.org/wiki/Parse_tree
Andres Hohendahl - Dec 1, 2012:
For example it can evaluate a complex math expression, in the middle of a sentence, finding the exact position of the mostly logical math piece.
If you mean your system can distinguish between a math expression versus natural language and can encapsulate such embedded math expressions, I can see where that would be very useful when parsing technical documents.
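If I understand the idea correctly, a much simpler stand-in would look something like the Python sketch below. It uses a regular expression rather than an EBNF grammar, handles only plain numbers joined by + - * /, and the names `MATH_SPAN`, `evaluate`, and `find_math` are my own assumptions, so treat it as an illustration of the goal and not of Andres's method:

```python
import ast
import operator
import re

# Assumption: a "math piece" is two or more plain numbers joined by + - * /.
NUMBER = r"\d+(?:\.\d+)?"
MATH_SPAN = re.compile(rf"{NUMBER}(?:\s*[-+*/]\s*{NUMBER})+")

# Only these binary operators are evaluated; anything else is rejected.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")


def find_math(sentence):
    """Return (start, end, text, value) for each math span in the sentence."""
    results = []
    for match in MATH_SPAN.finditer(sentence):
        try:
            value = evaluate(ast.parse(match.group(), mode="eval"))
        except (SyntaxError, ValueError, ZeroDivisionError):
            continue  # looked like math but could not be evaluated
        results.append((match.start(), match.end(), match.group(), value))
    return results


if __name__ == "__main__":
    print(find_math("The total of 3 + 4 * 2 apples is small."))
```

A pattern this naive will also fire on things like hyphenated number ranges, which is exactly where a real grammar with context would earn its keep.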
Yes, your approach is definitely at the “sophisticated” end of parsing! Good luck, and let’s hope it pays off in better machine understanding! Thanks for letting us know that there are some serious parsing folks out there.
[P.S.—I changed some of your spellings to what I believed you meant.]