Labs/Ubiquity/Parser 2/Semantic Roles

From MozillaWiki

Supported Semantic Roles

Ubiquity commands under the new Parser 2 API use semantic roles, together with noun types, to specify the kinds of arguments they take.

  • object: direct object (the default or unmarked argument)
  • goal: the goal or end point of (metaphorical) movement or transition
    • example: in English, arguments marked by “to”, “into”, “toward”, etc.
  • source: the source or starting point of (metaphorical) movement or transition
    • example: in English, arguments marked by “from”, “by”, etc.
  • location: refers to a physical location where an action occurs or that an action is related to, in contrast to goal and source.
    • example: in English, arguments marked by “near”, “in”, “at”, etc.
  • time: a time when the action takes place
    • example: in English, arguments marked by “in”, “at”, etc.
    • Note: in English (as well as in many other languages) the markers for time and location overlap greatly. It is thus very important to use an appropriate nountype which will reinforce the choice between temporal and physical arguments.
  • instrument: a tool or intermediary to be used
    • example: in English, arguments marked by “using” or “with”, as in “bookmark this with delicious.”
  • format: describes the intended or expected form of the result
    • example: in English, arguments marked by “in” as in “in PDF form” or “in German”
  • alias: a name or reference used to refer to something (i.e., an identity) or to describe something
    • example: in English, arguments marked by “as” as in “tag this as new” or “login to mail as aza.”
  • modifier: describes a noun or a type of noun, normally one that is part of the command.
    • example: in English, things marked by “of” as in “get email address of Aza” or by “with” as in “close tabs with IE8”.
    • Note: modifier is fundamentally different from all the other semantic roles, in that the other roles each correspond to an argument of the verb, while the modifier marks what is essentially an argument of a noun.
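
The overlap between markers noted above (for example, “in” and “at” for both time and location) can be illustrated with a small sketch. This is not Ubiquity's parser: the marker table, the `candidateRoles` and `rolesFor` helpers, and the two toy noun types are hypothetical names made up for this example. It only shows how one marker can map to several candidate roles, and how a noun type's acceptance test can narrow the choice.

```javascript
// Hypothetical English marker table, built from the examples in the list above.
const ENGLISH_MARKERS = new Map(Object.entries({
  to: ["goal"], into: ["goal"], toward: ["goal"],
  from: ["source"], by: ["source"],
  near: ["location"],
  in: ["location", "time", "format"],
  at: ["location", "time"],
  using: ["instrument"],
  with: ["instrument", "modifier"],
  as: ["alias"],
  of: ["modifier"],
}));

// Toy noun types (assumptions): each one only knows whether it accepts a text.
const toyTimeType = { accepts: (text) => /^\d{1,2}(:\d{2})?\s*(am|pm)?$/i.test(text) };
const toyPlaceType = { accepts: (text) => ["paris", "london"].includes(text.toLowerCase()) };

// Every role a marker could introduce, before looking at the argument text.
function candidateRoles(marker) {
  return ENGLISH_MARKERS.get(marker.toLowerCase()) || [];
}

// Keep only roles that the command declares and whose noun type accepts the text.
function rolesFor(marker, text, commandArgs) {
  return candidateRoles(marker).filter((role) => {
    const arg = commandArgs.find((a) => a.role === role);
    return arg !== undefined && arg.nountype.accepts(text);
  });
}

// A command that declares both a time and a location argument.
const commandArgs = [
  { role: "time", nountype: toyTimeType },
  { role: "location", nountype: toyPlaceType },
];

console.log(candidateRoles("at"));                 // the marker alone is ambiguous
console.log(rolesFor("at", "5pm", commandArgs));   // narrowed by the time noun type
console.log(rolesFor("at", "Paris", commandArgs)); // narrowed by the location noun type
```

With a loose noun type that accepts any text, both roles would survive the filter, which is why a specific noun type matters for time and location arguments.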

Looking at some of the other language parsers that have been written so far will give you a sense of what these roles correspond to in other languages you may know.

Note: Not all language parsers support all of the roles above. In particular, modifier is difficult to support in some languages (e.g., Chinese). In the future we may consider adding a mechanism by which localizations can affect the roles of arguments in commands to deal with such cases.

Why Semantic Roles?

The previous (Parser 1) API required that arguments of verbs be specified by "modifiers", i.e., English prepositions or Japanese postpositions (the only two languages that had parsers in Parser 1).

This meant that commands (including the modifiers and the associated execute and preview code) had to be rewritten for each language. When a command refers to abstract semantic roles instead, Ubiquity's parser takes care of identifying these different kinds of arguments for you, regardless of the language, and simply passes the argument data to your command's execute and preview code.

For example, suppose you want to write a command to move something. It'll take three arguments:

  • object (noun_arb_text)
  • goal (noun_type_geolocation)
  • source (noun_type_geolocation)
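
As a sketch, the three arguments above might be declared as follows. The shape used here (`CmdUtils.CreateCommand` with an `arguments` list of role and nountype pairs, and `execute`/`preview` receiving arguments keyed by role) is an assumption about the Parser 2 command API and should be checked against the command authoring documentation. The `label` values, the `describeMove` helper, and the stand-in definitions at the top are hypothetical and exist only so the example runs outside Ubiquity.

```javascript
// Stand-ins so this sketch runs outside Ubiquity (assumptions, not the real objects).
// Inside Ubiquity, CmdUtils and the noun types are provided and these are dropped.
const registeredCommands = [];
const CmdUtils = { CreateCommand: (cmd) => { registeredCommands.push(cmd); return cmd; } };
const noun_arb_text = { name: "arbitrary text" };
const noun_type_geolocation = { name: "geolocation" };

// Hypothetical helper: turn the parsed arguments into a one-line summary.
// Each parsed argument is assumed to expose its text as `.text`.
function describeMove(args) {
  return "Move " + args.object.text +
         " from " + args.source.text +
         " to " + args.goal.text;
}

CmdUtils.CreateCommand({
  names: ["move"],
  // One entry per argument: a semantic role plus the noun type it must match.
  arguments: [
    { role: "object", nountype: noun_arb_text, label: "thing" },
    { role: "goal", nountype: noun_type_geolocation, label: "destination" },
    { role: "source", nountype: noun_type_geolocation, label: "origin" },
  ],
  // Arguments arrive keyed by role, not by preposition.
  preview: function (pblock, args) {
    pblock.textContent = describeMove(args);
  },
  execute: function (args) {
    console.log(describeMove(args));
  },
});

// Simulate what a parser would hand over for "move truck from Paris to London".
const parsed = {
  object: { text: "truck" },
  source: { text: "Paris" },
  goal: { text: "London" },
};
registeredCommands[0].execute(parsed);
```

Note that the command body never mentions “from” or “to” as input markers; it only reads `args.source` and `args.goal`.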

Now you can enter something like "move truck from Paris to London" and, assuming the parser recognizes Paris and London as locations, it will parse this input properly in English and find all three arguments.

The magic happens when we then try to use this command in another language, like French: "move truck de Paris à London". Parser 2 will identify the French goal and the French source and hand them to your command. You don't even need to know any French to get the basic command working in French. The same goes for other languages, including languages that look very different, like Japanese ("truckをParisからLondonへmove").
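
A toy version of this idea is sketched below. It is not how Parser 2 is implemented: the `MARKERS` tables and the `toyParse` function are hypothetical, and the sketch only handles space-separated prepositions, so the Japanese example (postpositions, no spaces) is out of its scope. It shows that once each language maps its own markers to the same roles, the English and French inputs produce identical role-keyed arguments.

```javascript
// Hypothetical per-language tables mapping prepositions to semantic roles.
const MARKERS = {
  en: new Map([["from", "source"], ["to", "goal"]]),
  fr: new Map([["de", "source"], ["à", "goal"]]),
};

// Toy parser: the first word is the verb; words before any marker form the
// object; each marker opens a new role that collects the words after it.
function toyParse(input, lang) {
  const markers = MARKERS[lang];
  if (!(markers instanceof Map)) {
    throw new Error("No marker table for language: " + lang);
  }
  const [verb, ...words] = input.trim().split(/\s+/);
  const collected = { object: [] };
  let role = "object";
  for (const word of words) {
    const marked = markers.get(word.toLowerCase());
    if (marked !== undefined) {
      role = marked;          // a marker switches the current role
      collected[role] = [];
    } else {
      collected[role].push(word);
    }
  }
  // Shape each argument as { text: ... }, keyed by role.
  const args = {};
  for (const [name, parts] of Object.entries(collected)) {
    args[name] = { text: parts.join(" ") };
  }
  return { verb, args };
}

const english = toyParse("move truck from Paris to London", "en");
const french = toyParse("move truck de Paris à London", "fr");

// Both inputs yield the same role-keyed arguments, so one command body serves both.
console.log(english.args);
console.log(french.args);
console.log(JSON.stringify(english.args) === JSON.stringify(french.args));
```

The verb is left untranslated here on purpose, matching the example above, where only the argument markers change between languages.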

By using semantic roles to identify arguments, your commands will automatically be functional in other languages. All that remains to be localized, then, is the name of your verb and other metadata, as well as strings in the preview and execute code. Read more about localizing commands and making commands localizable.

Reference