Labs/Ubiquity/Parser Documentation

The parser architecture, as depicted by Jono DiCarlo.

The Epic Journey

We will now follow the journey of a string through Ubiquity's parser.

Suppose that the user enters the following into the parser:

 trans hello world to ja

Finding the Verb

First, the parser finds the verb in the sentence.

To make internationalization easier, all operations that depend on the human language in use are handled by language-specific plugins. Finding the verb is one such operation: in some languages the verb comes before the object, in others after it, and so on.

So if Ubiquity is in English mode, the sentence is dispatched to the English-specific parser plugin for verb detection. Currently the plugin simply takes the part of the string that occurs before the first space character: in other words, "trans" is our verb, and "hello world to ja" is the rest of our sentence.

(Note: this simplistic way of getting the verb currently limits us to having only single-word command names. It needs to be changed to support multi-word commands.)
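The first-space rule described above can be sketched in a few lines. This is only an illustration in Python, not Ubiquity's actual plugin code; the function name split_verb_english is hypothetical.

```python
from typing import Tuple


def split_verb_english(sentence: str) -> Tuple[str, str]:
    """Hypothetical helper: split input into (verb, rest) at the first space.

    Mirrors the rule described in the text: everything before the first
    space character is the verb, everything after it is the rest.
    """
    verb, _, rest = sentence.partition(" ")
    return verb, rest


verb, rest = split_verb_english("trans hello world to ja")
print(verb)  # trans
print(rest)  # hello world to ja
```

Because the split happens at the first space, a command name containing a space could never be recognized as a verb, which is the single-word limitation noted above.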

Looking for a Verb Match

The parser maintains a list of every command that's installed in Ubiquity. The next step in the parsing process is to compare the input verb to the name of each command, figure out which commands are potential matches, rank them by match quality, and eliminate the rest.

When attempting to match an input verb to a command name, the parser tries the following types of matches:

Direct match of input verb to the beginning of the command name, e.g. matching "trans" to "translate". This is considered the best type of match.
Match in the middle or end of the command name, e.g. matching "ans" to "translate". This type of match is considered not as good: that is, if you type "ans" then the parser assumes you are more likely to mean "answer" than "translate". But a mid-word match is still better than nothing.
Match to one of the command's synonyms. A command can define any number of synonyms. For example, "tweet" is a synonym for "twitter", so if the input is "twee" then "twitter" will go in this category. A match to a synonym is not as good as a match to the primary command name.
Match in the middle or end of a synonym. For example, matching "eet" to "twitter" because it matches the end of "tweet". This type of match is unlikely to be what the user is looking for, so we try to suggest it only if there are no better matches.
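The four match types above can be read as an ordered set of tiers. The following Python sketch illustrates that ordering only; the Command class, the tier constants, and the function names are hypothetical, and the integer tiers are an assumption standing in for whatever scores the real parser assigns.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Command:
    # Hypothetical stand-in for an installed Ubiquity command.
    name: str
    synonyms: List[str] = field(default_factory=list)


# Lower number = better match, in the order the match types are listed above.
NAME_PREFIX, NAME_INNER, SYNONYM_PREFIX, SYNONYM_INNER = range(4)


def match_tier(verb: str, command: Command) -> Optional[int]:
    """Return the best tier at which verb matches command, or None."""
    if command.name.startswith(verb):
        return NAME_PREFIX
    if verb in command.name:
        return NAME_INNER
    if any(s.startswith(verb) for s in command.synonyms):
        return SYNONYM_PREFIX
    if any(verb in s for s in command.synonyms):
        return SYNONYM_INNER
    return None


def rank_commands(verb: str, commands: List[Command]) -> List[str]:
    """Drop non-matching commands and order the rest from best tier to worst."""
    scored = [(match_tier(verb, c), c) for c in commands]
    matches = [(tier, c) for tier, c in scored if tier is not None]
    matches.sort(key=lambda pair: pair[0])  # stable: ties keep input order
    return [c.name for _, c in matches]


commands = [
    Command("translate"),
    Command("answer"),
    Command("twitter", synonyms=["tweet"]),
]
print(rank_commands("ans", commands))   # ['answer', 'translate']
print(rank_commands("twee", commands))  # ['twitter']
```

With the input "ans", "answer" is a beginning-of-name match and "translate" is a mid-word match, so "answer" ranks first, as in the example above.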

If there is no verb match: Noun-First Suggestions

If the input string is empty

(Noun-first suggestion based on the selection)

Scoring the Quality of the Verb Match

Scoring the Frequency of the Verb Match

Meanwhile: Parsing the Arguments

(Also language-dependent)

Assigning input substrings to command arguments

To create PartiallyParsedSentences

Interpolating magic pronouns like "this" (or not)

Getting suggestions from NounTypes

This can be asynchronous

Contents

The Epic Journey

Finding the Verb

Looking for a Verb Match

If there is no verb match: Noun-First Suggestions

If the input string is empty

Scoring the Quality of the Verb Match

Scoring the Frequency of the Verb Match

Meanwhile: Parsing the Arguments

Assigning input substrings to command arguments

Interpolating magic pronouns like "this" (or not)

Getting suggestions from NounTypes

Rejecting parsings with invalid argument assignments

Turning PartiallyParsedSentences into FullyParsedSentences

Filling missing arguments with the selection

Filling missing arguments with the NounType's defaults

Scoring the Quality of the Noun Matches

Binding arguments to the FullyParsedSentences

Sorting the suggestions and returning them
