Labs/Ubiquity/Parser 2: Difference between revisions

Revision as of 04:33, 13 May 2009

637 bytes added, 13 May 2009

add normalizeArgument

Mitcho

@@ Line 12: / Line 12: @@
 # split words/arguments + case markers
 # pick possible verbs
-# (pick possible clitics - for the (near) future)
+# pick possible clitics
 # group into arguments (argument structure parsing)
 # anaphora (magic word) substitution
-# verb suggestion
+# suggest normalized arguments
+# suggest verbs for parses without one
 # noun type detection
-# argument noun suggestion
+# replace arguments with their nountype suggestions
-# score + rank
+# rank
 ==parser files==
@@ Line 88: / Line 89: @@
 Each language has a set of "anaphora" or "magic words", like the English <code>["this", "that", "it", "selection", "him", "her", "them"]</code>. This step searches the parses' arguments for any occurrences of these words and, if there is selected text, creates alternative parses with the selection substituted in.
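The substitution step can be sketched roughly as follows. This is only an illustration, not the parser's actual code: the parse shape (<code>{args: [{input: ...}]}</code>) and the function name <code>substituteAnaphora</code> are assumptions made for the example. The original parse is always kept, and one extra copy is produced for each magic word found.

```javascript
// English magic words, as listed in the text above.
const ANAPHORA = ["this", "that", "it", "selection", "him", "her", "them"];

// Hypothetical helper: parses are modelled as { args: [{ input: string }] }.
// Returns the original parses plus one substituted copy per match.
function substituteAnaphora(parses, selection) {
  if (!selection) return parses; // no selected text: nothing to substitute
  const result = [];
  for (const parse of parses) {
    result.push(parse); // the unsubstituted parse stays a candidate
    parse.args.forEach((arg, i) => {
      for (const word of ANAPHORA) {
        const re = new RegExp("\\b" + word + "\\b", "i");
        if (!re.test(arg.input)) continue;
        // Copy the arguments so the original parse is left untouched.
        const args = parse.args.map((a) => ({ ...a }));
        // A replacer function avoids "$" in the selection being
        // interpreted as a replacement pattern.
        args[i].input = arg.input.replace(re, () => selection);
        result.push({ ...parse, args });
      }
    });
  }
  return result;
}
```

With a selection of "hello world", a parse whose only argument is "this" would yield two parses: the original, and one whose argument is "hello world".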
-=step 6: noun type detection=
+=step 6: suggest normalized arguments=
+See the [http://mitcho.com/blog/projects/solving-another-romantic-problem/ blog post on argument normalization] and its use cases.
+For languages with a <code>normalizeArgument()</code> method, this method is applied to each argument. For each normalized alternative returned, a copy of the parse is made with that suggestion. Any prefixes and suffixes stripped off through argument normalization are put in the <code>inactivePrefix</code> and <code>inactiveSuffix</code> properties of the argument.
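A minimal sketch of this step, under stated assumptions: the return shape of <code>normalizeArgument()</code> (a list of <code>{prefix, newInput, suffix}</code> objects), the parse shape, and the helper name <code>suggestNormalizedArguments</code> are all hypothetical and chosen only for illustration. The example normalizer strips an elided article such as <code>l'</code>.

```javascript
// Hypothetical normalizer for a language with elided articles.
// Assumed return shape: a list of { prefix, newInput, suffix }.
function normalizeArgument(input) {
  const m = input.match(/^(l')(.+)$/i);
  return m ? [{ prefix: m[1], newInput: m[2], suffix: "" }] : [];
}

// Hypothetical helper: parses are modelled as { args: [{ input: string }] }.
// Keeps each original parse and adds one copy per normalized alternative.
function suggestNormalizedArguments(parses, normalize) {
  const result = [];
  for (const parse of parses) {
    result.push(parse);
    parse.args.forEach((arg, i) => {
      for (const alt of normalize(arg.input)) {
        const args = parse.args.map((a) => ({ ...a }));
        args[i] = {
          ...arg,
          input: alt.newInput,
          // The stripped material is kept on the argument, not discarded.
          inactivePrefix: alt.prefix,
          inactiveSuffix: alt.suffix,
        };
        result.push({ ...parse, args });
      }
    });
  }
  return result;
}
```

For an argument like "l'anglais", this produces a second parse whose argument is "anglais" with <code>inactivePrefix</code> set to "l'", so later steps can match the bare noun while the original text remains recoverable.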
+=step 7: noun type detection=
 For each parse, send each argument string to the noun type detector. The noun type detector will cache detection results, so it only checks each string once. This returns a list of possible noun types with their "scores".
@@ Line 95: / Line 102: @@
   'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}]
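The caching behaviour described for the noun type detector could look something like the sketch below. The noun types, their scoring rules, and the name <code>makeNounTypeDetector</code> are made up for the example; only the idea of checking each string once and returning a scored list comes from the text.

```javascript
// Hypothetical noun types: each has a name and a scoring function.
const exampleNounTypes = [
  { name: "service", score: (s) => (/calendar|mail/.test(s) ? 1 : 0) },
  { name: "arb", score: (s) => (s.length > 0 ? 0.7 : 0) },
];

// Builds a detector that remembers results per argument string,
// so each distinct string is run through the noun types only once.
function makeNounTypeDetector(types) {
  const cache = new Map();
  return function detect(text) {
    if (!cache.has(text)) {
      const found = types
        .map((t) => ({ type: t.name, score: t.score(text) }))
        .filter((r) => r.score > 0);
      cache.set(text, found);
    }
    return cache.get(text);
  };
}
```

Calling the detector twice with the same string returns the same cached list, which matters when many parses share an argument.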
-=step 7: ranking=
+=step 9: replace arguments with nountype suggestions=
+=step 10: ranking=
   foreach parse (w/o V)