No edit summary |
(add normalizeArgument) |
||
Line 12: | Line 12: | ||
# split words/arguments + case markers | # split words/arguments + case markers | ||
# pick possible verbs | # pick possible verbs | ||
# | # pick possible clitics | ||
# group into arguments (argument structure parsing) | # group into arguments (argument structure parsing) | ||
# anaphora (magic word) substitution | # anaphora (magic word) substitution | ||
# | # suggest normalized arguments | ||
# suggest verbs for parses without one | |||
# noun type detection | # noun type detection | ||
# | # replace arguments with their nountype suggestions | ||
# | # rank | ||
==parser files== | ==parser files== | ||
Line 88: | Line 89: | ||
Each language has a set of "anaphora" or "magic words", like the English <code>["this", "that", "it", "selection", "him", "her", "them"]</code>. This step will search for any occurrences of these in the parses' arguments and make substituted alternatives, if there is a selection text. | Each language has a set of "anaphora" or "magic words", like the English <code>["this", "that", "it", "selection", "him", "her", "them"]</code>. This step will search for any occurrences of these in the parses' arguments and make substituted alternatives, if there is a selection text. | ||
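The substitution step described above can be sketched roughly as follows. This is a minimal illustration, not the parser's actual code: the function name <code>substituteAnaphora</code> and the shape of the parse object (a plain <code>args</code> array of strings) are assumptions made for the example.

```javascript
// Hypothetical sketch: the parse shape and function name are illustrative only.
const ANAPHORA = ["this", "that", "it", "selection", "him", "her", "them"];

// Returns the original parse plus one substituted alternative
// for each (argument, magic word) match.
function substituteAnaphora(parse, selection) {
  const results = [parse];
  if (!selection) return results; // no selection text: nothing to substitute

  parse.args.forEach((arg, i) => {
    for (const word of ANAPHORA) {
      const pattern = new RegExp("\\b" + word + "\\b", "i");
      if (pattern.test(arg)) {
        const args = parse.args.slice();
        // Use a replacer function so "$" in the selection is taken literally.
        args[i] = arg.replace(pattern, () => selection);
        results.push({ ...parse, args });
      }
    }
  });
  return results;
}
```

The original parse is kept alongside the substituted copies, so a later ranking step can still prefer the literal reading.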
=step 6: noun type detection= | =step 6: suggest normalized arguments= | ||
[http://mitcho.com/blog/projects/solving-another-romantic-problem/ > see blog post on argument normalization] and its use cases | |||
For languages with a <code>normalizeArgument()</code> method, this method is applied to each argument. If any normalized alternatives are returned, a copy of the parse is made for each suggestion. Prefixes and suffixes stripped off through argument normalization are put in the <code>inactivePrefix</code> and <code>inactiveSuffix</code> properties of the argument. | |||
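As a rough sketch of this step: the code below assumes that <code>normalizeArgument()</code> returns a list of objects with <code>prefix</code>, <code>newInput</code> and <code>suffix</code> fields, and that each argument is an object with an <code>input</code> string. Both shapes, the toy language, and the helper name <code>suggestNormalizedArguments</code> are assumptions for illustration; only <code>normalizeArgument</code>, <code>inactivePrefix</code> and <code>inactiveSuffix</code> come from the description above.

```javascript
// Toy language: strips a leading English article. The return shape
// ({prefix, newInput, suffix}) is an assumption for this sketch.
const toyLanguage = {
  normalizeArgument(input) {
    const match = /^(the\s+)(.+)$/i.exec(input);
    return match ? [{ prefix: match[1], newInput: match[2], suffix: "" }] : [];
  },
};

// Returns the original parse plus one copy per normalized alternative.
function suggestNormalizedArguments(parse, language) {
  const parses = [parse];
  // Languages without normalizeArgument() skip this step entirely.
  if (typeof language.normalizeArgument !== "function") return parses;

  parse.args.forEach((arg, i) => {
    for (const alt of language.normalizeArgument(arg.input)) {
      const args = parse.args.slice();
      args[i] = {
        ...arg,
        input: alt.newInput,
        inactivePrefix: alt.prefix, // stripped text is kept, not discarded
        inactiveSuffix: alt.suffix,
      };
      parses.push({ ...parse, args });
    }
  });
  return parses;
}
```

Keeping the stripped text on the argument means the full original input can still be reconstructed from the normalized parse.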
=step 7: noun type detection= | |||
For each parse, send each argument string to the noun type detector. The noun type detector will cache detection results, so it only checks each string once. This returns a list of possible noun types with their "scores". | For each parse, send each argument string to the noun type detector. The noun type detector will cache detection results, so it only checks each string once. This returns a list of possible noun types with their "scores". | ||
Line 95: | Line 102: | ||
'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}] | 'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}] | ||
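The caching behaviour of the noun type detector could look something like the sketch below. The factory name <code>makeNounTypeDetector</code>, the <code>name</code>/<code>score</code> interface of each noun type, and the two toy noun types are all hypothetical; they are only there to reproduce the shape of the example output above.

```javascript
// Hypothetical sketch of a caching detector; not the actual parser code.
function makeNounTypeDetector(nounTypes) {
  const cache = new Map(); // argument string -> list of {type, score}

  return function detect(text) {
    if (!cache.has(text)) {
      const suggestions = [];
      for (const nounType of nounTypes) {
        const score = nounType.score(text);
        if (score > 0) suggestions.push({ type: nounType.name, score });
      }
      cache.set(text, suggestions);
    }
    return cache.get(text); // repeated strings never hit the noun types again
  };
}

// Toy noun types, assumed for illustration.
let scoreCalls = 0;
const toyNounTypes = [
  { name: "service", score: (t) => { scoreCalls++; return /calendar/.test(t) ? 1 : 0; } },
  { name: "arb", score: (t) => { scoreCalls++; return 0.7; } },
];

const detect = makeNounTypeDetector(toyNounTypes);
const first = detect("my calendar");
const second = detect("my calendar"); // served from the cache
```

Because the cache is keyed on the argument string, the same string appearing in many parses costs only one round of detection.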
=step | =step 9: replace arguments with nountype suggestions= | ||
=step 10: ranking= | |||
foreach parse (w/o V) | foreach parse (w/o V) |