*Data for machine translation corpora is often collected via web crawling, or from text that users unknowingly hand over to these engines by agreeing to the obscure terms and conditions of an MT service. Open data collection for MT corpora is either non-existent or an obscure practice.
=Research questions=
===How does machine translation work?===
There are four general approaches to machine translation. Most of the early work, before massive corpora were available, was done with rule-based machine translation ([http://en.wikipedia.org/wiki/Rule-based_machine_translation http://en.wikipedia.org/wiki/Rule-based_machine_translation]). However, most current work uses statistical machine translation ([http://en.wikipedia.org/wiki/Statistical_machine_translation http://en.wikipedia.org/wiki/Statistical_machine_translation]). A brief description of each is given below.
====Rule-Based Machine Translation====
====Statistical Machine Translation====
Uses statistical information to choose the "best" translation from the possible translations of a text. As far as I know, all work with statistical machine translation requires a bilingual corpus for calculating the necessary probabilities.
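As an illustration of the statistical approach, the sketch below ranks candidate translations with the classic noisy-channel decomposition: pick the candidate e that maximizes P(e) · P(f | e). All the probabilities here are invented stand-ins; a real system would estimate the translation model from a bilingual corpus and the language model from monolingual text.

```python
# Toy illustration of noisy-channel scoring in statistical MT: choose the
# candidate translation e that maximizes P(e) * P(f | e).
# All probabilities below are made up for the German source
# "das Haus ist klein"; a real system estimates them from corpora.

language_model = {            # P(e): how fluent each English candidate is
    "the house is small": 0.4,
    "the house is little": 0.3,
    "small the house is": 0.001,
}

translation_model = {         # P(f | e): how well each candidate matches the source
    "the house is small": 0.5,
    "the house is little": 0.4,
    "small the house is": 0.5,
}

def best_translation(candidates):
    """Return the candidate with the highest combined model score."""
    return max(candidates, key=lambda e: language_model[e] * translation_model[e])

print(best_translation(language_model))  # -> the house is small
```

Note how the fluent-but-less-literal candidate still wins: the language model penalizes the ungrammatical word order even though its translation-model score is just as high.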
====Example-based Machine Translation====
Uses cases and analogies, along with a parallel corpus, to determine the best translation. Somewhat similar to Rule-Based ([http://en.wikipedia.org/wiki/Example-based_machine_translation http://en.wikipedia.org/wiki/Example-based_machine_translation]).
====Hybrid Machine Translation====
A combination of the previously mentioned approaches.
===What are the benefits and drawbacks to each methodology?===
===How do you measure the output quality of a machine translation engine?===
;Automated evaluation
* BLEU Score - http://en.wikipedia.org/wiki/BLEU
** Compares MT output against reference translations produced by professional human translators, assigning a score (based on n-gram precision) that measures how close the MT output comes to the human translation.
* NIST - http://en.wikipedia.org/wiki/NIST_(metric)
** Similar to BLEU; however, not all correct n-grams are treated equally: they are weighted according to how rarely they occur.
* METEOR - http://en.wikipedia.org/wiki/METEOR
** Evaluation based on the harmonic mean of unigram precision and recall, weighted toward recall (unlike BLEU and NIST, which are precision-based).
* LEPOR - http://en.wikipedia.org/wiki/LEPOR
** A newer MT evaluation metric that combines precision, recall, a sentence-length penalty, and n-gram-based word order.
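To make the BLEU idea concrete, here is a simplified single-sentence sketch of its core computation: clipped n-gram precision combined in a geometric mean, times a brevity penalty. The real metric is corpus-level, supports multiple references, and uses different smoothing, so treat this only as an illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count at how often it appears in the reference,
        # so repeating a correct word cannot inflate the score.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values()) or 1
        precisions.append(max(overlap, 1e-9) / total)  # crude zero smoothing
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 3))  # -> 1.0
```

A perfect match scores 1.0, and any divergence from the reference lowers the n-gram overlap and hence the score; this is also why BLEU can unfairly punish valid translations that merely use different wording than the reference.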
===What prominent machine translation engines are out there and what are they known for?===
{| class="wikitable sortable" border="1"
! Open/Closed
! # of supported languages
! Web hosted?
|-
| Google Translate
| Closed
| 70+
| translate.google.com
|-
| Microsoft Translator
|-
| Babelfish
| Yahoo!
|
| Closed
|
|
| MosesMT
|
| Statistical
| Open
|
|
|-
| Apertium
|
| Rule-based
| Open
|
|
|}
See also [https://en.wikipedia.org/wiki/Comparison_of_machine_translation_applications https://en.wikipedia.org/wiki/Comparison_of_machine_translation_applications] & [http://www.computing.dcu.ie/~mforcada/fosmt.html http://www.computing.dcu.ie/~mforcada/fosmt.html].
===What prominent corpora are currently available?===
===What human resources would be needed to build our own MT engine?===
===What partnership opportunities could be available for this project?===
See [https://www.taus.net/taus-machine-translation-showcase https://www.taus.net/taus-machine-translation-showcase].
=User stories=
==Firefox end-users==