Intellego/Meetings/Status/2014-01-31

From MozillaWiki
Revision as of 05:56, 6 February 2014 by GPHemsley (talk | contribs) (c&p)

Meeting Details

  • 17:00 GMT/UTC

https://v.mozilla.com/flex.html?roomdirect.html&key=BTvYSZyJA2lW

  • +1 800 707 2533, pin 369 - then 99625 (US/INTL)
  • irc.mozilla.org #intellego backchannel
  • Attendees: Gordon & Jeff

Talking Points

  • Action item follow-up
  • Possible Phase 1 milestones

Previous Action Items

   https://intellego.etherpad.mozilla.org/sprint-2014-01-02
  • [Jeff] Email Bill with studies and ask for next steps.


Action Items

  • [Kensie] Flesh out the backgrounder and put it on the wiki: https://intellego.etherpad.mozilla.org/backgrounder
  • Determine Phase 1 milestones
    • Research scope of proposed milestones
      • [Gordon]:
        • Read a few of the more notable papers on MT output evaluation.
        • Contact mitcho about leveraging hyperlink constituencies for use in Intellego.
        • Look into what it would take to translate snippets, including investigating existing open source machine translation engines and tying them into Pontoon.
      • [Jeff]:
        • Research what it would entail to translate only specific terminology.
  • [All] Rework research questions within the travel metaphor paradigm
   https://intellego.etherpad.mozilla.org/sprint-2014-01-02


Research

Possible Phase 1

  • Terminology output on the web
    • Using a TBX file, scrape the translatable content on a given web page, find 100% terminology matches, retrieve each translation equivalent, and inject it after the matched term as an added parenthetical.
    • Extracting the text from the DOM is probably easy; determining which text should be translated is perhaps a little more difficult.
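
The terminology idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposed implementation: the sample TBX snippet, its two entries, and the names `VisibleText`, `load_glossary`, and `annotate` are all hypothetical, and the TBX layout assumed here is the common `termEntry` / `langSet` / `term` nesting. Skipping `script` and `style` is a stand-in heuristic for the harder question of which text should be translated.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample glossary; a real TBX file would be read from disk.
SAMPLE_TBX = """<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="t1">
      <langSet xml:lang="en"><tig><term>bookmark</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>marque-page</term></tig></langSet>
    </termEntry>
    <termEntry id="t2">
      <langSet xml:lang="en"><tig><term>private browsing</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>navigation privée</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


class VisibleText(HTMLParser):
    """Collect text nodes, skipping script and style content (a crude heuristic)."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def load_glossary(tbx_xml, source_lang, target_lang):
    """Map each lowercased source-language term to its target-language term."""
    root = ET.fromstring(tbx_xml)
    glossary = {}
    for entry in root.iter("termEntry"):
        terms = {}
        for lang_set in entry.iter("langSet"):
            term = lang_set.find(".//term")
            if term is not None and term.text:
                terms[lang_set.get(XML_LANG)] = term.text.strip()
        if source_lang in terms and target_lang in terms:
            glossary[terms[source_lang].lower()] = terms[target_lang]
    return glossary


def annotate(text, glossary):
    """Append the translation in parentheses after each whole-term match."""
    if not glossary:
        return text
    # Longest terms first, so multi-word terms win over shorter overlaps.
    alternatives = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: f"{m.group(0)} ({glossary[m.group(0).lower()]})", text
    )


if __name__ == "__main__":
    parser = VisibleText()
    parser.feed("<p>Add a Bookmark in private browsing mode.</p>"
                "<script>var bookmark = 1;</script>")
    glossary = load_glossary(SAMPLE_TBX, "en", "fr")
    for chunk in parser.chunks:
        print(annotate(chunk, glossary))
```

The word-boundary match treats only the exact term as a hit, so an inflected form such as "bookmarks" is left untouched; whether that is the right reading of "100% match" is an open question for the milestone research.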

Google Results for "University of Maryland Machine Translation"

(Gordon may be able to leverage connections from the University of Delaware to the University of Maryland to get people involved. These are just some of the top results from a random Google search.)

"The University of Maryland Statistical Machine Translation System for the Third Workshop on Machine Translation" http://www.umiacs.umd.edu/~ymarton/pub/wmt09/DyerSetiawanMartonResnik_wmt09_UMD-SMT-sys.pdf

"Domain Adaptation for Machine Translation by Mining Unseen Words" http://www.umiacs.umd.edu/~jags/pdfs/lexical-adapt_short.pdf

"Language Model and Grammar Extraction Variation in Machine Translation" http://www.cs.umd.edu/grad/scholarlypapers/papers/vladimir_eidelman_ms_scholarly_paper.pdf

"The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation" http://aclweb.org/anthology//W/W10/W10-1707.pdf

"Fast, Easy, and Cheap: Construction of Statistical Machine Translation Models with MapReduce" http://aclweb.org/anthology/W/W08/W08-0333.pdf

Proceedings of the Eighth Workshop on Statistical Machine Translation (2013) http://aclweb.org/anthology/W/W13/#2200

List of research laboratories for machine translation https://en.wikipedia.org/wiki/List_of_research_laboratories_for_machine_translation

Hyperlink constituencies

mitcho did some research into hyperlink constituencies, which turned into a full-on project at MIT:

  • http://mitcho.com/academic/erlewine-le2012-slides.pdf
  • http://constituency.mit.edu/
  • https://github.com/mitcho/constituency
  • http://constituency.mit.edu/docs/overview/
  • http://constituency.mit.edu/docs/coding-guidelines/