Intellego/Meetings/Status/2014-01-31
Meeting Details
- 17:00 GMT/UTC
https://v.mozilla.com/flex.html?roomdirect.html&key=BTvYSZyJA2lW
- +1 800 707 2533, pin 369 - then 99625 (US/INTL)
- irc.mozilla.org #intellego backchannel
- Attendees: Gordon & Jeff
Talking Points
- Action item follow-up
- Possible Phase 1 milestones
Previous Action Items
- Rework research questions within the travel metaphor paradigm
- [Kensie] Flesh out the backgrounder and put it on the wiki: https://intellego.etherpad.mozilla.org/backgrounder
- Determine Phase 1 milestones
- Research scope of proposed milestones
- [All] Rework research questions within the travel metaphor paradigm
https://intellego.etherpad.mozilla.org/sprint-2014-01-02
- [Jeff] Email Bill with studies and ask for next steps.
Action Items
- [Kensie] Flesh out the backgrounder and put it on the wiki: https://intellego.etherpad.mozilla.org/backgrounder
- Determine Phase 1 milestones
- Research scope of proposed milestones
- [Gordon]:
- Read a few of the more notable papers on MT output evaluation.
- Contact mitcho about leveraging hyperlink constituencies for use in Intellego.
- Look into what it would take to translate snippets, including investigating existing open source machine translation engines and tying them into Pontoon.
- [Jeff]:
- Research what it would entail to translate only specific terminology.
- [Gordon]:
- Research scope of proposed milestones
- [All] Rework research questions within the travel metaphor paradigm
https://intellego.etherpad.mozilla.org/sprint-2014-01-02
Research
Possible Phase 1
- Terminology output on the web
- Scrape the translatable content from a given web page, find 100% terminology matches against a TBX file, retrieve each translation equivalent, and inject it after the term as a parenthetical.
- It's probably easy to extract the text from the DOM; perhaps a little more difficult to determine what text should be translated.
- MT of Firefox snippets within the Pontoon tool
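A minimal sketch of the terminology milestone above, assuming the page text has already been extracted from the DOM and that the TBX file uses the un-namespaced martif/termEntry/langSet/term layout. The function names (load_terms, annotate), the sample terms, and the rule of case-insensitive whole-word matching are illustrative assumptions, not decisions from the meeting.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-entry termbase, used only for illustration.
SAMPLE_TBX = """
<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="t1">
      <langSet xml:lang="en"><tig><term>bookmark</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>marque-page</term></tig></langSet>
    </termEntry>
    <termEntry id="t2">
      <langSet xml:lang="en"><tig><term>private browsing</term></tig></langSet>
      <langSet xml:lang="fr"><tig><term>navigation privée</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>
"""


def load_terms(tbx_xml, source_lang, target_lang):
    """Map each source-language term to its target-language equivalent."""
    terms = {}
    root = ET.fromstring(tbx_xml.strip())
    for entry in root.iter("termEntry"):
        by_lang = {}
        for lang_set in entry.iter("langSet"):
            term = lang_set.find(".//term")  # first term of the language set
            if term is not None and term.text:
                by_lang[lang_set.get(XML_LANG)] = term.text.strip()
        if source_lang in by_lang and target_lang in by_lang:
            terms[by_lang[source_lang]] = by_lang[target_lang]
    return terms


def annotate(text, terms):
    """Append the translation in parentheses after every exact term match."""
    if not terms:
        return text
    # Longest terms first, so a multi-word term wins over a shorter one
    # that starts at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {source.lower(): target for source, target in terms.items()}

    def add_parenthetical(match):
        found = match.group(0)
        translation = lookup.get(found.lower())
        if translation is None:  # leave the text untouched if lookup fails
            return found
        return f"{found} ({translation})"

    return pattern.sub(add_parenthetical, text)


terms = load_terms(SAMPLE_TBX, "en", "fr")
print(annotate("Open a bookmark in private browsing.", terms))
```

This sketch leaves out the harder part noted above, deciding which text on the page should be translated; it only shows the match-and-inject step once a string has been selected.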
Google Results for "University of Maryland Machine Translation"
(Gordon may be able to leverage connections from the University of Delaware to the University of Maryland to get people involved. These are just some of the top results from a random Google search.)
"The University of Maryland Statistical Machine Translation System for the Third Workshop on Machine Translation" http://www.umiacs.umd.edu/~ymarton/pub/wmt09/DyerSetiawanMartonResnik_wmt09_UMD-SMT-sys.pdf
"Domain Adaptation for Machine Translation by Mining Unseen Words" http://www.umiacs.umd.edu/~jags/pdfs/lexical-adapt_short.pdf
"Language Model and Grammar Extraction Variation in Machine Translation" http://www.cs.umd.edu/grad/scholarlypapers/papers/vladimir_eidelman_ms_scholarly_paper.pdf
"The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation" http://aclweb.org/anthology//W/W10/W10-1707.pdf
"Fast, Easy, and Cheap: Construction of Statistical Machine Translation Models with MapReduce" http://aclweb.org/anthology/W/W08/W08-0333.pdf
Proceedings of the Eighth Workshop on Statistical Machine Translation (2013) http://aclweb.org/anthology/W/W13/#2200
List of research laboratories for machine translation https://en.wikipedia.org/wiki/List_of_research_laboratories_for_machine_translation
Hyperlink constituencies
mitcho did some research into hyperlink constituencies which turned into a full-on project at MIT:
- http://mitcho.com/academic/erlewine-le2012-slides.pdf
- http://constituency.mit.edu/
- https://github.com/mitcho/constituency
- http://constituency.mit.edu/docs/overview/
- http://constituency.mit.edu/docs/coding-guidelines/