Labs/Ubiquity/Parser 2/Localization Tutorial: Difference between revisions

no edit summary
No edit summary
Line 18: Line 18:
As you read, you may find it helpful to follow along in some of the more complete language settings files included in Parser 2: [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/en.js English], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/ja.js Japanese], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/da.js Danish].


== The structure of the language file ==
== Writing your language settings ==
 
=== The structure of the language file ===


Each language in Parser 2 gets its own settings file. You'll need to look up the [http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes ISO 639-1 code for your language]... Here we'll use English (code <code>en</code>) as an example; its language settings file would then be called <code>en.js</code> and go in the <code>/ubiquity/modules/parser/new/</code> directory of the repository.
Line 34: Line 36:
Now let's walk through some of the parameters you must set to get your language working. For reference, the language parser object is required to have these properties: <code>branching</code>, <code>anaphora</code>, and <code>roles</code>.
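
As a rough sketch of how the three required properties fit together, the snippet below fills in a bare settings object and reports any property that is missing. The <code>checkRequired</code> helper is hypothetical and not part of Parser 2, the plain <code>en = {}</code> object only stands in for however the real file creates its parser object, and the <code>delimiter</code> key name and the role entries are assumptions for illustration.

```javascript
// Stand-in for the language parser object (assumption: the real
// settings file creates this object in its own way).
var en = {};

en.branching = 'right';
en.anaphora = ["this", "that", "it", "selection", "him", "her", "them"];
en.roles = [
  {role: 'goal', delimiter: 'to'},      // illustrative entries only
  {role: 'source', delimiter: 'from'}
];

// Hypothetical helper: return the names of required properties
// that the given settings object does not define.
function checkRequired(lang) {
  var required = ['branching', 'anaphora', 'roles'];
  return required.filter(function (name) {
    return !(name in lang);
  });
}
```

Calling <code>checkRequired(en)</code> on the object above returns an empty list; an object that sets only <code>branching</code> would get back the other two names.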


== Identifying your branching parameter ==
=== Identifying your branching parameter ===


   en.branching = 'right'; // or 'left'
Line 58: Line 60:
In general, if your language has prepositions, you should use <code>.branching = 'right'</code> and if your language has postpositions, you can use <code>.branching = 'left'</code>.


=== For more info ===
==== For more info ====


* see [http://en.wikipedia.org/wiki/Branching_%28linguistics%29 branching] on Wikipedia.


== Defining your roles ==
=== Defining your roles ===


   en.roles = [
Line 76: Line 78:
The second required property is the inventory of semantic roles and their corresponding delimiters. Each entry has a <code>role</code> from the [[Labs/Ubiquity/Parser_2/Semantic_Roles|inventory of semantic roles]] and a corresponding delimiter. Note that this mapping can be many-to-many: each role can have multiple possible delimiters, and different roles can share delimiters. Try to cover all of the roles in the inventory.
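
The many-to-many mapping can be sketched as follows. This is an illustration under assumptions, not an excerpt from <code>en.js</code>: the <code>delimiter</code> key name, the particular role and delimiter pairs, and the two lookup helpers are all hypothetical.

```javascript
// Illustrative role/delimiter entries (assumed, not copied from en.js).
var roles = [
  {role: 'goal', delimiter: 'to'},
  {role: 'goal', delimiter: 'into'},     // one role, several delimiters
  {role: 'source', delimiter: 'from'},
  {role: 'time', delimiter: 'at'},
  {role: 'location', delimiter: 'at'}    // one delimiter, several roles
];

// Hypothetical helper: all delimiters listed for a given role.
function delimitersFor(role) {
  return roles
    .filter(function (entry) { return entry.role === role; })
    .map(function (entry) { return entry.delimiter; });
}

// Hypothetical helper: all roles that list a given delimiter.
function rolesFor(delimiter) {
  return roles
    .filter(function (entry) { return entry.delimiter === delimiter; })
    .map(function (entry) { return entry.role; });
}
```

Here <code>delimitersFor('goal')</code> yields both <code>to</code> and <code>into</code>, while <code>rolesFor('at')</code> yields both <code>time</code> and <code>location</code>, which is the ambiguity the parser has to live with when roles share a delimiter.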


=== For more info ===
==== For more info ====


* [http://mitcho.com/blog/projects/writing-commands-with-semantic-roles/ Writing commands with semantic roles], the original proposal
Line 82: Line 84:
* Wikipedia entry on [http://en.wikipedia.org/wiki/Thematic_relation thematic relations]


== Entering your anaphora ("magic words") ==
=== Entering your anaphora ("magic words") ===


   en.anaphora = ["this", "that", "it", "selection", "him", "her", "them"];


The final required property is the <code>anaphora</code> property, which takes a list of "magic words". Currently the parser makes no distinction between the different [http://en.wikipedia.org/wiki/Deixis deictic] [http://en.wikipedia.org/wiki/Anaphora_%28linguistics%29 anaphora], even though they might refer to different things.
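
To illustrate what a flat list of magic words implies, here is a minimal sketch of checking whether a word is one of them. The <code>isAnaphor</code> function is hypothetical, and exact string matching is an assumption; how Parser 2 actually matches and substitutes magic words is not covered here.

```javascript
var anaphora = ["this", "that", "it", "selection", "him", "her", "them"];

// Hypothetical check: every magic word is treated alike, so "it" and
// "them" are indistinguishable as far as this lookup is concerned.
function isAnaphor(word) {
  return anaphora.indexOf(word) !== -1;
}
```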
== Register your language ==
Before testing out your new language settings file, you must register the language with the parser. There is a parser registry file at <code>ubiquity/modules/parser/new/parser_registry.json</code>. Open it and add a new line to the JSON object mapping your language code to the native name of your language or locale. For example, to add Danish (language code <code>da</code>), we could add the following line:
  da: "Dansk",
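
For orientation, the sketch below shows how such a mapping might look once the Danish entry is added, and how an entry can be read back. The other entries, the <code>registryText</code> string, and the <code>nativeName</code> helper are assumptions for illustration, not the actual contents of <code>parser_registry.json</code>. The keys are quoted here because strict JSON requires it; follow whatever style the existing entries in the file use.

```javascript
// Assumed shape of the registry: language code -> native language name.
var registryText = '{ "en": "English", "ja": "日本語", "da": "Dansk" }';
var registry = JSON.parse(registryText);

// Hypothetical helper: look up the native name for a language code,
// or null if the language has not been registered.
function nativeName(code) {
  return Object.prototype.hasOwnProperty.call(registry, code)
    ? registry[code]
    : null;
}
```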


== Special cases ==