No edit summary |
|||
Line 18: | Line 18: | ||
As you read along, you may find it beneficial to follow along in some of the more complete language settings files included in Parser 2: [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/en.js English], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/ja.js Japanese], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/da.js Danish]. | As you read along, you may find it beneficial to follow along in some of the more complete language settings files included in Parser 2: [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/en.js English], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/ja.js Japanese], [https://ubiquity.mozilla.com/hg/ubiquity-firefox/raw-file/tip/ubiquity/modules/parser/new/da.js Danish]. | ||
== The structure of the language file == | == Writing your language settings == | ||
=== The structure of the language file === | |||
Each language in Parser 2 gets its own settings file. You'll need to look up the [http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes ISO 639-1 code for your language]... Here we'll use English (code <code>en</code>) as an example: the language settings file would then be called <code>en.js</code> and go in the <code>/ubiquity/modules/parser/new/</code> directory of the repository. | Each language in Parser 2 gets its own settings file. You'll need to look up the [http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes ISO 639-1 code for your language]... Here we'll use English (code <code>en</code>) as an example: the language settings file would then be called <code>en.js</code> and go in the <code>/ubiquity/modules/parser/new/</code> directory of the repository. | ||
Line 34: | Line 36: | ||
Now let's walk through some of the parameters you must set to get your language working. For reference, the properties the language parser object is required to have are: <code>branching</code>, <code>anaphora</code>, and <code>roles</code>. | Now let's walk through some of the parameters you must set to get your language working. For reference, the properties the language parser object is required to have are: <code>branching</code>, <code>anaphora</code>, and <code>roles</code>. | ||
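As a rough sketch of how the three required properties fit together, the following builds a plain object and checks it. The plain <code>{}</code> object, the <code>delimiter</code> key name, the two sample role entries, and the <code>missingProperties</code> helper are assumptions made for illustration; the real <code>en.js</code> creates its object through the parser's own code, which is not shown here.

```javascript
// Hypothetical stand-in for a language settings object. The real en.js
// obtains its object from the parser; a plain object is assumed here.
var en = {};
en.branching = 'right'; // or 'left'
en.anaphora = ["this", "that", "it", "selection", "him", "her", "them"];
// Illustrative entries only; the 'delimiter' key name is an assumption.
en.roles = [
  {role: 'goal', delimiter: 'to'},
  {role: 'source', delimiter: 'from'}
];

// Hypothetical helper: returns the names of any required properties
// that the given settings object does not define.
function missingProperties(lang) {
  var required = ['branching', 'anaphora', 'roles'];
  return required.filter(function (name) {
    return !(name in lang);
  });
}
```

Calling <code>missingProperties(en)</code> on a complete object should return an empty list, which makes it a quick sanity check before testing a new language file.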
== Identifying your branching parameter == | === Identifying your branching parameter === | ||
en.branching = 'right'; // or 'left' | en.branching = 'right'; // or 'left' | ||
Line 58: | Line 60: | ||
In general, if your language has prepositions, you should use <code>.branching = 'right'</code> and if your language has postpositions, you can use <code>.branching = 'left'</code>. | In general, if your language has prepositions, you should use <code>.branching = 'right'</code> and if your language has postpositions, you can use <code>.branching = 'left'</code>. | ||
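To make the rule of thumb concrete, here is a minimal sketch of what the branching direction implies for where an argument sits relative to its delimiter. The <code>argumentFor</code> function is hypothetical and is not how Parser 2 itself finds arguments; it only illustrates that with prepositions the argument follows the delimiter, and with postpositions it precedes it.

```javascript
// Hypothetical illustration, not the parser's actual algorithm.
// Given a list of words, returns the words on the argument side
// of the delimiter, depending on the branching direction.
function argumentFor(words, delimiter, branching) {
  var i = words.indexOf(delimiter);
  if (i === -1) return null; // delimiter not present
  if (branching === 'right') {
    // Prepositions: the argument follows the delimiter ("to Mary").
    return words.slice(i + 1).join(' ');
  }
  if (branching === 'left') {
    // Postpositions: the argument precedes the delimiter ("メアリー に").
    return words.slice(0, i).join(' ');
  }
  throw new Error("branching must be 'right' or 'left'");
}
```

For example, <code>argumentFor(['to', 'Mary'], 'to', 'right')</code> picks out the word after the preposition, while <code>argumentFor(['メアリー', 'に'], 'に', 'left')</code> picks out the word before the postposition.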
=== For more info === | ==== For more info ==== | ||
* see [http://en.wikipedia.org/wiki/Branching_%28linguistics%29 branching] on Wikipedia. | * see [http://en.wikipedia.org/wiki/Branching_%28linguistics%29 branching] on Wikipedia. | ||
== Defining your roles == | === Defining your roles === | ||
en.roles = [ | en.roles = [ | ||
Line 76: | Line 78: | ||
The second required property is the inventory of semantic roles and their corresponding delimiters. Each entry has a <code>role</code> from the [[Labs/Ubiquity/Parser_2/Semantic_Roles|inventory of semantic roles]] and a corresponding delimiter. Note that this mapping can be many-to-many, i.e., each role can have multiple possible delimiters and different roles can share delimiters. Try to cover all of the roles in that inventory. | The second required property is the inventory of semantic roles and their corresponding delimiters. Each entry has a <code>role</code> from the [[Labs/Ubiquity/Parser_2/Semantic_Roles|inventory of semantic roles]] and a corresponding delimiter. Note that this mapping can be many-to-many, i.e., each role can have multiple possible delimiters and different roles can share delimiters. Try to cover all of the roles in that inventory. | ||
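The many-to-many mapping can be illustrated with a small sketch. The entries below are sample data chosen for illustration rather than a copy of <code>en.js</code>, and the <code>delimiter</code> key name and both lookup helpers are assumptions.

```javascript
// Sample role inventory (illustrative; the 'delimiter' key is assumed).
// 'goal' has two delimiters, and 'at' is shared by two roles.
var roles = [
  {role: 'goal', delimiter: 'to'},
  {role: 'goal', delimiter: 'into'},
  {role: 'time', delimiter: 'at'},
  {role: 'location', delimiter: 'at'}
];

// Hypothetical helper: all roles that a given delimiter can mark.
function rolesForDelimiter(roles, delimiter) {
  return roles
    .filter(function (entry) { return entry.delimiter === delimiter; })
    .map(function (entry) { return entry.role; });
}

// Hypothetical helper: all delimiters that can mark a given role.
function delimitersForRole(roles, role) {
  return roles
    .filter(function (entry) { return entry.role === role; })
    .map(function (entry) { return entry.delimiter; });
}
```

With this data, looking up the delimiter <code>at</code> yields both <code>time</code> and <code>location</code>, and looking up the role <code>goal</code> yields both <code>to</code> and <code>into</code>.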
=== For more info === | ==== For more info ==== | ||
* [http://mitcho.com/blog/projects/writing-commands-with-semantic-roles/ Writing commands with semantic roles], the original proposal | * [http://mitcho.com/blog/projects/writing-commands-with-semantic-roles/ Writing commands with semantic roles], the original proposal | ||
Line 82: | Line 84: | ||
* Wikipedia entry on [http://en.wikipedia.org/wiki/Thematic_relation thematic relations] | * Wikipedia entry on [http://en.wikipedia.org/wiki/Thematic_relation thematic relations] | ||
== Entering your anaphora ("magic words") == | === Entering your anaphora ("magic words") === | ||
en.anaphora = ["this", "that", "it", "selection", "him", "her", "them"]; | en.anaphora = ["this", "that", "it", "selection", "him", "her", "them"]; | ||
The final required property is the <code>anaphora</code> property which takes a list of "magic words". Currently there is no distinction between all the different [http://en.wikipedia.org/wiki/Deixis deictic] [http://en.wikipedia.org/wiki/Anaphora_%28linguistics%29 anaphora] which might refer to different things. | The final required property is the <code>anaphora</code> property which takes a list of "magic words". Currently there is no distinction between all the different [http://en.wikipedia.org/wiki/Deixis deictic] [http://en.wikipedia.org/wiki/Anaphora_%28linguistics%29 anaphora] which might refer to different things. | ||
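Because the list is flat, every magic word is treated the same way. The sketch below shows one way such a list could be used; both helpers are hypothetical, and how Parser 2 actually resolves magic words is not described here. It assumes every magic word stands for the same current selection.

```javascript
var anaphora = ["this", "that", "it", "selection", "him", "her", "them"];

// Hypothetical helper: is this word one of the magic words?
// The case-insensitive match is an assumption of this sketch.
function isAnaphor(word) {
  return anaphora.indexOf(word.toLowerCase()) !== -1;
}

// Hypothetical helper: replace every magic word with the same
// selection text, reflecting that the list makes no distinction
// between the different words.
function substituteAnaphora(words, selection) {
  return words.map(function (word) {
    return isAnaphor(word) ? selection : word;
  });
}
```

For instance, <code>substituteAnaphora(['translate', 'this', 'to', 'French'], 'hola')</code> swaps <code>this</code> for the selection text and leaves the other words alone.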
== Register your language == | |||
Before testing out your new language settings file, you must register that language with the parser. There is a parser registry file at <code>ubiquity/modules/parser/new/parser_registry.json</code>. Open it up and add a new line to the JSON object mapping your language code to the native name of your language or locale. For example, if we wanted to add Danish (language code <code>da</code>), we could add the following line: | |||
da: "Dansk", | |||
== Special cases == | == Special cases == |