trans hello world to ja
=== Language-specific parser plugins ===
To make internationalization easier, all operations which are (human)-language-dependent are handled by language-specific plugins. For instance, in some languages the verb comes before the object, in other languages after the object, etc., so figuring out which part of the sentence is the verb is language-dependent.
See [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/parser.js#l65 NLParser.makeParserForLanguage() in parser.js] for how the parser chooses a particular language plugin to use; this is called on initialization. See [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/parser.js#l188 NLParser.Parser.updateSuggestionList() in parser.js] for how the parser dispatches the user input to the parser plugin.
So if Ubiquity is in English mode, then the sentence is dispatched to the English-specific parser plugin for the first stage of parsing. The entry point here is the function [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/locale_en.js#l129 EnParser.parseSentence()] in file [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/locale_en.js#l1 locale_en.js].
For an example of how parsing works differently in other languages, see file [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/locale_jp.js#l1 locale_jp.js], the Japanese parser plugin, which would be called instead of the English one if the language had been set to "jp".
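The dispatch described above can be pictured with a minimal sketch. It is not the code in parser.js: the plugin registry and the stub plugins are hypothetical, and the function names are borrowed from the text only to show where each step would sit.

```javascript
// Hypothetical registry of language plugins, keyed by language code.
// The real plugins live in locale_en.js and locale_jp.js; these stubs
// only echo their input so the dispatch itself is visible.
const parserPlugins = new Map([
  ["en", { parseSentence: (input) => ({ lang: "en", input }) }],
  ["jp", { parseSentence: (input) => ({ lang: "jp", input }) }],
]);

// Choose a plugin once, at initialization, and return a parser object
// that forwards every user input to that plugin.
function makeParserForLanguage(langCode) {
  const plugin = parserPlugins.get(langCode);
  if (!plugin) {
    throw new Error("No parser plugin for language: " + langCode);
  }
  return {
    updateSuggestionList(input) {
      return plugin.parseSentence(input);
    },
  };
}

const parser = makeParserForLanguage("en");
const dispatched = parser.updateSuggestionList("trans hello world to ja");
```

Choosing the plugin once at initialization means the per-keystroke path only has to forward the input, which matches the description of makeParserForLanguage() being called on initialization.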
The job of the language-specific plugin is twofold:
# To find the verb in the input, so that it can be matched against command names
# To generate every valid assignment of input substrings to arguments of the command.
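To make the second job more concrete, here is a simplified sketch of enumerating assignments. It assumes that arguments other than the direct object are introduced by a preposition such as <tt>to</tt>, and that each preposition is used at most once; the function names and the <tt>object</tt> key are hypothetical, and the actual plugin may work quite differently.

```javascript
// Build one assignment from the word positions chosen to act as prepositions.
// Returns null when the choice is invalid (an empty argument or a repeated
// preposition).
function buildAssignment(words, chosen) {
  const firstPrep = chosen.length > 0 ? chosen[0] : words.length;
  // Everything before the first chosen preposition is the direct object.
  const assignment = { object: words.slice(0, firstPrep).join(" ") };
  for (let k = 0; k < chosen.length; k++) {
    const prep = words[chosen[k]];
    const end = k + 1 < chosen.length ? chosen[k + 1] : words.length;
    const value = words.slice(chosen[k] + 1, end).join(" ");
    const alreadyUsed = Object.prototype.hasOwnProperty.call(assignment, prep);
    if (value === "" || alreadyUsed) return null;
    assignment[prep] = value;
  }
  return assignment;
}

// Enumerate every subset of the words that could be read as prepositions,
// keeping only the subsets that yield a valid assignment.
function generateAssignments(words, prepositions) {
  const candidates = [];
  words.forEach((word, i) => {
    if (prepositions.includes(word)) candidates.push(i);
  });
  const results = [];
  for (let mask = 0; mask < (1 << candidates.length); mask++) {
    const chosen = candidates.filter((_, bit) => (mask & (1 << bit)) !== 0);
    const assignment = buildAssignment(words, chosen);
    if (assignment !== null) results.push(assignment);
  }
  return results;
}

const assignments = generateAssignments(
  ["hello", "world", "to", "ja"],
  ["to", "from"]
);
// assignments holds two readings: one where the whole string is the object,
// and one where "hello world" is the object and "ja" is the "to" argument.
```

Keeping the reading in which <tt>to</tt> is part of the object matters because the parser cannot know in advance whether the user meant the word literally.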
=== Where's the Verb? ===
Once inside the language-specific plugin, things might happen in a different order depending on the language, but we'll keep focusing on the English case for now.
The English parser starts by finding the verb. Currently it does this by [https://ubiquity.mozilla.com/hg/ubiquity-firefox/file/71b710040206/ubiquity/modules/parser/locale_en.js#l135 splitting the input string on spaces], and taking the first word (i.e. everything up to the first space) as the verb. In our example, <tt>trans</tt> is our verb, and <tt>hello world to ja</tt> is the rest of our sentence.
(Note: this simplistic way of getting the verb currently limits us to having only single-word command names. It needs to be changed to support multi-word commands. We also need to consider whether we want to support the case of the user entering the object first and the verb afterward.)
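The verb-finding step described above amounts to something like the following sketch. The function name <tt>splitVerb</tt> is hypothetical and is not taken from locale_en.js.

```javascript
// Split the input on spaces; the first word is treated as the verb and
// everything after it as the rest of the sentence.
function splitVerb(input) {
  const words = input.split(" ");
  return {
    verb: words[0],
    rest: words.slice(1).join(" "),
  };
}

const parsed = splitVerb("trans hello world to ja");
// parsed.verb is "trans" and parsed.rest is "hello world to ja"
```

Because only <tt>words[0]</tt> is ever considered, a command whose name contains a space could never match, which is the limitation the note above points out.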
=== Looking for a Verb Match ===