Intellego/Meetings/Status/2014-01-31

From MozillaWiki

https://intellego.etherpad.mozilla.org/ep/pad/view/ro.8xafZ2WPK5AlVPYWfo0Do/rev.1158

Meeting Details

  • Friday, January 31, 2014; 9 AM PST / 12 PM EST / 17:00 GMT
    • rescheduled from Thursday, January 30, 2014; 9 AM PST / 12 PM EST / 17:00 GMT
  • Vidyo Room: https://v.mozilla.com/flex.html?roomdirect.html&key=BTvYSZyJA2lW
  • Telephone Conference Bridge: +1 800-707-2533 (USA/CAN toll-free); password 369; conference number 99625
  • IRC Backchannel: #intellego
  • Attendees: Gordon & Jeff

Talking Points

  • Action item follow-up
  • Possible Phase 1 milestones

Previous Action Items

Action Items

  • [Kensie] Flesh it out and put it onto the wiki
  • Determine Phase 1 milestones
    • Research scope of proposed milestones
      • [Gordon]:
        • Read a few of the more notable papers on MT output evaluation.
        • Contact mitcho about leveraging hyperlink constituencies for use in Intellego.
        • Look into what it would take to translate snippets, including investigating existing open source machine translation engines and tying them into Pontoon.
      • [Jeff]:
        • Research what it would entail to translate only specific terminology.
  • [All] Rework research questions within the travel metaphor paradigm

Research

Possible Phase 1

  • Terminology output on the web
    • Scrape the translatable content of a given web page, find 100% terminology matches against a TBX file, retrieve each term's translation equivalent, and inject it as a parenthetical after the term.
    • Extracting the text from the DOM is probably easy; determining which text should be translated is perhaps a little more difficult.
  • MT of Firefox snippets within Pontoon tool
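
A rough sketch of the terminology idea above, for discussion only. It assumes the older TBX layout (martif > text > body > termEntry > langSet > tig > term), assumes the page text has already been pulled out of the DOM as a plain string, and treats a "100% match" as a case-insensitive whole-word match. The sample glossary and the function names load_glossary and annotate are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-entry glossary, not a real Mozilla termbase.
SAMPLE_TBX = """<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="e1">
      <langSet xml:lang="en"><tig><term>bookmark</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>marque-page</term></tig></langSet>
    </termEntry>
    <termEntry id="e2">
      <langSet xml:lang="en"><tig><term>private browsing</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>navigation privée</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def load_glossary(tbx_xml, source_lang, target_lang):
    """Map each lowercased source term to its target-language equivalent."""
    root = ET.fromstring(tbx_xml)
    glossary = {}
    for entry in root.iter("termEntry"):
        terms = {}
        for lang_set in entry.iter("langSet"):
            term = lang_set.find(".//term")
            if term is not None and term.text:
                terms[lang_set.get(XML_LANG)] = term.text.strip()
        if source_lang in terms and target_lang in terms:
            glossary[terms[source_lang].lower()] = terms[target_lang]
    return glossary


def annotate(text, glossary):
    """Append the translation in parentheses after each exact term match."""
    if not glossary:
        return text
    # Try longer terms first so multi-word terms win over their sub-terms.
    alternatives = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: f"{m.group(0)} ({glossary[m.group(0).lower()]})", text
    )


glossary = load_glossary(SAMPLE_TBX, "en", "fr")
annotated = annotate("Add a bookmark in Private Browsing mode.", glossary)
```

With the sample glossary, annotated comes out as "Add a bookmark (marque-page) in Private Browsing (navigation privée) mode." Inflected forms such as "bookmarks" are deliberately left alone, since only exact matches count here; deciding which DOM text nodes to feed in is the harder open question noted above.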

Google Results for "University of Maryland Machine Translation"

(Gordon may be able to leverage connections from the University of Delaware to the University of Maryland to get people involved. These are just some of the top results from a random Google search.)

"The University of Maryland Statistical Machine Translation System for the Third Workshop on Machine Translation"
http://www.umiacs.umd.edu/~ymarton/pub/wmt09/DyerSetiawanMartonResnik_wmt09_UMD-SMT-sys.pdf
"Domain Adaptation for Machine Translation by Mining Unseen Words"
http://www.umiacs.umd.edu/~jags/pdfs/lexical-adapt_short.pdf
"Language Model and Grammar Extraction Variation in Machine Translation"
http://www.cs.umd.edu/grad/scholarlypapers/papers/vladimir_eidelman_ms_scholarly_paper.pdf
"The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation"
http://aclweb.org/anthology//W/W10/W10-1707.pdf
"Fast, Easy, and Cheap: Construction of Statistical Machine Translation Models with MapReduce"
http://aclweb.org/anthology/W/W08/W08-0333.pdf
Proceedings of the Eighth Workshop on Statistical Machine Translation (2013)
http://aclweb.org/anthology/W/W13/#2200
List of research laboratories for machine translation
https://en.wikipedia.org/wiki/List_of_research_laboratories_for_machine_translation

Hyperlink constituencies

mitcho did some research into hyperlink constituencies which turned into a full-on project at MIT: