Labs/Ubiquity/Parser Documentation: Difference between revisions

Revision as of 23:20, 5 February 2009

1,673 bytes added , 5 February 2009

→‎Scoring and Sorting Suggestions

Jdicarlo

1,007

edits

@@ Line 294: / Line 294: @@
 The algorithm doesn't currently recognize disjoint or out-of-order matches.  E.g. if the user typed "add cal", they might mean "add-to-calendar".  Disjoint matching could detect that, but since we don't do it, this input gets a score of 0.  It might be worth experimenting with matches like this, though it is an open question how to rank them.
-=== Scoring the Frequency of the Verb Match ===
+=== Scoring the Frequency of the Verb Choice ===
 ( [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/parser.js#l127 NLParser.Parser._sortSuggestionList() in parser.js] )
@@ Line 316: / Line 316: @@
 This is an extremely simplistic noun match quality ranking.  Replacing it with a real measurement, one that takes into account string distance or past choice frequency of noun suggestions, is a major opportunity for future evolution of the parser.
+=== Sorting the suggestions and returning them ===
+NLParser.FullyParsedSentence.getMatchScores() in parser.js
+NLParser.Parser._sortSuggestionList() in parser.js
+The relative importance of the various scores depends on whether the suggestions were generated by the verb-first or by the noun-first parsing strategy.  (It's guaranteed that all the suggestions in the suggestion list at any given time were either all generated verb-first or all generated noun-first.)
+Verb-first suggestions are sorted thus:
+# First by the frequency of the verb choice.
+# Ties on frequency are broken by verb match quality score.
+# Ties on verb match quality score are broken by noun match quality score.
+Multiple suggestions based on the same verb always tie on frequency and verb match quality, since both scores depend only on the verb and the input.  Suggestions that share a verb therefore end up being sorted only by noun match quality.
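The verb-first ordering above can be sketched as a comparator with two levels of tie-breaking. This is only an illustration, not the code in parser.js: the field names <tt>frequency</tt>, <tt>verbMatch</tt> and <tt>nounMatch</tt>, the function name and the sample values are all assumptions made for the example, and it assumes that a higher score is better.

```javascript
// Illustrative sketch only. The field names (frequency, verbMatch, nounMatch)
// and the sample scores are hypothetical; they are not taken from parser.js.
function compareVerbFirst(a, b) {
  // Higher scores sort earlier, so each step subtracts a from b.
  // A difference of 0 is falsy, which hands the tie to the next score.
  return (b.frequency - a.frequency) ||
         (b.verbMatch - a.verbMatch) ||
         (b.nounMatch - a.nounMatch);
}

const verbFirstSuggestions = [
  { id: "D", frequency: 2, verbMatch: 1.0,  nounMatch: 1 },
  { id: "C", frequency: 5, verbMatch: 0.5,  nounMatch: 0 },
  { id: "B", frequency: 5, verbMatch: 0.5,  nounMatch: 1 },
  { id: "A", frequency: 5, verbMatch: 0.75, nounMatch: 0 },
];

// Copy before sorting so the input list is left untouched.
const verbFirstSorted = verbFirstSuggestions.slice().sort(compareVerbFirst);
const verbFirstOrder = verbFirstSorted.map((s) => s.id).join(",");
console.log(verbFirstOrder);
```

In this sample, B and C stand in for two suggestions built on the same verb: they tie on the first two scores, so only the noun match quality separates them.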
+Noun-first suggestions have no verb match quality score.  They have a verb choice frequency score, but this has to be calculated differently because there is no verb substring in the input to compare with the suggestion memory.  Noun-first suggestions are sorted thus:
+# First by noun match quality score.
+# Ties on noun match quality are broken by frequency of the verb choice.
+Because suggestions only get noun match quality from '''specific''' noun types (i.e. not <tt>noun_arb_text</tt>), this sorting order ensures that matches to specific data types in the input or selection get bumped to the top.  For instance, try entering this:
+ today
+You'll see that verbs that take dates as arguments (e.g. "check-calendar") appear at the top of the list, with your most commonly used generic verbs (google, wikipedia, etc.) below them.
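The noun-first ordering can be sketched the same way, using the "today" example. Again this is a hedged illustration rather than the parser's own code: the field names <tt>nounMatch</tt> and <tt>frequency</tt>, the function name and the scores are invented for the example, and a higher score is assumed to be better.

```javascript
// Illustrative sketch only. Field names and scores are hypothetical and are
// not taken from parser.js.
function compareNounFirst(a, b) {
  // Noun match quality dominates; verb choice frequency only breaks ties.
  return (b.nounMatch - a.nounMatch) || (b.frequency - a.frequency);
}

const nounFirstSuggestions = [
  { verb: "wikipedia",      nounMatch: 0, frequency: 4 },
  { verb: "google",         nounMatch: 0, frequency: 9 },
  // Assumed to take a date argument, so "today" earns it a noun match score.
  { verb: "check-calendar", nounMatch: 1, frequency: 1 },
];

const nounFirstSorted = nounFirstSuggestions.slice().sort(compareNounFirst);
const nounFirstOrder = nounFirstSorted.map((s) => s.verb).join(",");
console.log(nounFirstOrder);
```

Even though "check-calendar" has the lowest frequency in this sample, its noun match score puts it first; the generic verbs then fall back to frequency order.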
 === Binding arguments to the FullyParsedSentences ===
@@ Line 337: / Line 365: @@
 See ubiquity/modules/cmdmanager.js to see an example of client code using the external API of the FullyParsedSentence objects returned as suggestions from NLParser.Parser.getSuggestionList() and NLParser.Parser.getSentence().
-=== Sorting the suggestions and returning them ===
