Intellego/GSoC/2014: Difference between revisions

Clean up formatting and wikimarkup.
m (GPHemsley moved page Intellego/GSoC to Intellego/GSoC/2014: In case we have more GSoC projects in the future.)
(Clean up formatting and wikimarkup.)
Line 1: Line 1:
== GSoC Project Outline ==
Intellego is participating in the Google Summer of Code program for 2014.
 
== Project outline ==
Intellego is an initiative to develop a machine translation platform from open corpus data, open corpus gathering techniques, and open web services APIs to lower the linguistic accessibility barrier for users and websites and further promote the exploration of freedom of linguistic expression on the web.


Line 6: Line 8:
If the student can accomplish the basic scope of the project before the allotted eight weeks are up, the stretch aim would be to add context-sensitive retrieval of target terminology.


=== Skills Needed ===
=== Skills needed ===
* DOM manipulation (JavaScript)
* Information retrieval
Line 15: Line 17:


=== Timeline ===
=== Timeline ===
8-week project timeline (three months for whole thing beginning to end but only 8 weeks allowed to be allocated to actual code work--See timeline link below):
The GSoC program has allotted 8 weeks for coding. Here is the timeline we have come up with for that period:


;Week 1
==== Week 1 ====
* Create a bilingual termbase of Mozilla-specific terminology drawn from Mozilla l10n resources.
;Week 2
 
==== Week 2 ====
* Create a front-end web portal UI in which the user simply enters a URL and clicks a button to run the machine translation (MT) and view the results.
* Create a back-end, Python-based program that will, given a URL, extract the DOM text nodes from the associated webpage.
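
The text-node extraction step could look something like the following. This is only a minimal sketch using Python's standard-library <code>html.parser</code>; the class name <code>TextNodeExtractor</code> and the function <code>extract_text_nodes</code> are hypothetical, and fetching the page from the URL is left out so that the example stays self-contained.

```python
from html.parser import HTMLParser


class TextNodeExtractor(HTMLParser):
    """Collect the text nodes of an HTML document in document order."""

    # Text inside these elements is code or styling, not page content.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_nodes = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script> and <style>.
        if self._skip_depth == 0 and data.strip():
            self.text_nodes.append(data.strip())


def extract_text_nodes(html_source):
    """Return the visible text nodes found in an HTML string."""
    parser = TextNodeExtractor()
    parser.feed(html_source)
    parser.close()
    return parser.text_nodes
```

In a real back end, the HTML string would first be downloaded from the URL the user entered in the web portal.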
;Week 3
 
==== Week 3 ====
* Filter out DOM text nodes whose text is not translatable.
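
What counts as non-translatable text is not defined here. As an illustration only, the sketch below assumes that empty strings, bare URLs, and strings without any letters (numbers, punctuation) are non-translatable. The names <code>is_translatable</code> and <code>filter_translatable</code> are hypothetical, and the real criteria may well differ.

```python
import re

# Assumption: a node consisting of a single URL is not translatable.
URL_PATTERN = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def is_translatable(text):
    """Heuristically decide whether a text node is worth translating."""
    stripped = text.strip()
    if not stripped:
        return False
    if URL_PATTERN.match(stripped):
        return False
    # Require at least one letter; this drops numbers and punctuation.
    return any(ch.isalpha() for ch in stripped)


def filter_translatable(text_nodes):
    """Keep only the text nodes that pass the heuristic above."""
    return [node for node in text_nodes if is_translatable(node)]
```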
;Week 4
 
==== Week 4 ====
* Search the translatable DOM text nodes (the source) for source terminology matches in the bilingual termbase.
;Week 5
 
==== Week 5 ====
* Map the source terminology to the matching target terminology from the termbase.
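
The search (Week 4) and mapping (Week 5) steps might be sketched as follows. The termbase here is a plain dictionary with made-up English-to-Spanish entries, not real project data, and the function names are hypothetical. Matching is case-insensitive and whole-word only, which is an assumption; inflected forms such as plurals are not matched.

```python
import re

# Hypothetical termbase: lowercase source term -> target term.
TERMBASE = {
    "bookmark": "marcador",
    "private browsing": "navegación privada",
    "add-on": "complemento",
}


def build_term_pattern(termbase):
    """Compile one regex that matches any source term as a whole word."""
    # Longest terms first, so multi-word terms win over their sub-terms.
    terms = sorted(termbase, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def find_term_matches(text, termbase):
    """Return (start, end, source_term, target_term) for each match."""
    pattern = build_term_pattern(termbase)
    matches = []
    for match in pattern.finditer(text):
        source = match.group(0)
        target = termbase[source.lower()]
        matches.append((match.start(), match.end(), source, target))
    return matches
```

Returning character offsets along with the term pair keeps the search step separate from the replacement step that follows in later weeks.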
;Week 6
 
==== Week 6 ====
* All-At-Once Replacement Method: Regenerate the DOM with the replaced terminology, output to a new webpage, and render it.
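
One possible shape for the all-at-once method is to re-serialize the parsed page while swapping terms inside text nodes. The sketch below is an assumption-laden illustration, not the project's implementation: it uses the standard-library <code>html.parser</code>, a tiny made-up termbase, and hypothetical names (<code>TermReplacingSerializer</code>, <code>translate_page</code>). It leaves tags, attributes, scripts, and styles untouched, and it ignores processing instructions and the casing of the replaced term.

```python
import html
import re
from html.parser import HTMLParser

# Hypothetical termbase: lowercase source term -> target term.
TERMBASE = {"bookmark": "marcador", "tab": "pestaña"}
TERM_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(t) for t in sorted(TERMBASE, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def replace_terms(text):
    """Replace every termbase match in a piece of text."""
    return TERM_PATTERN.sub(lambda m: TERMBASE[m.group(0).lower()], text)


class TermReplacingSerializer(HTMLParser):
    """Re-emit the HTML it parses, translating terms in text nodes only."""

    RAW_TEXT_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.output = []
        self._raw_depth = 0

    def handle_starttag(self, tag, attrs):
        # Emit the start tag exactly as written, attributes included.
        self.output.append(self.get_starttag_text())
        if tag in self.RAW_TEXT_TAGS:
            self._raw_depth += 1

    def handle_startendtag(self, tag, attrs):
        self.output.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.output.append("</" + tag + ">")
        if tag in self.RAW_TEXT_TAGS and self._raw_depth > 0:
            self._raw_depth -= 1

    def handle_data(self, data):
        if self._raw_depth:
            self.output.append(data)  # script/style content stays as-is
        else:
            # Text was unescaped by the parser, so escape it again.
            self.output.append(html.escape(replace_terms(data), quote=False))

    def handle_comment(self, data):
        self.output.append("<!--" + data + "-->")

    def handle_decl(self, decl):
        self.output.append("<!" + decl + ">")


def translate_page(html_source):
    """Return the page's HTML with termbase terms replaced."""
    serializer = TermReplacingSerializer()
    serializer.feed(html_source)
    serializer.close()
    return "".join(serializer.output)
```

The returned string can then be written out as the new webpage and rendered.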
;Week 7
 
==== Week 7 ====
* On-the-Fly Replacement Method: Perform the terminology replacement operation on the DOM segment by segment, instead of extracting all text nodes from the DOM at once.
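
The on-the-fly method could be approximated with a Python generator that handles one segment at a time, so that no full list of text nodes is ever built. This is a rough sketch under stated assumptions: the function name <code>iter_replaced_segments</code> is hypothetical, the termbase is a non-empty dictionary with lowercase keys, and a "segment" is simply a string supplied by some upstream DOM walker.

```python
import re


def iter_replaced_segments(segments, termbase):
    """Lazily yield each segment with its termbase terms replaced."""
    # Longest terms first, so multi-word terms win over their sub-terms.
    terms = sorted(termbase, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    for segment in segments:
        # Each segment is processed only when the consumer asks for it.
        yield pattern.sub(lambda m: termbase[m.group(0).lower()], segment)
```

Because the generator pulls segments on demand, the first translated segment is available before the rest of the page has been read, which is the property the Week 8 comparison would weigh against the all-at-once method.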
;Week 8
 
==== Week 8 ====
* Evaluate each method (all-at-once or on-the-fly) for efficiency and analyze whether it would be beneficial to use one method over the other, or whether it would be better to offer a choice of either.
;Final deliverable:
 
=== Final deliverable ===
* An automatic terminology translation tool consisting of a web interface and a server-side tool. A user enters the URL of a source-language website, and the tool returns the rendered target-language website containing partially translated content.


 
== Interested students ==
== Google Summer of Code Interested Parties ==
If you are a student interested in participating with Intellego for the Google Summer of Code program, please add your information to the table below.
Please add a row below the header row containing the appropriate information if you are interested in this project.
{| class="wikitable"
{| class="standard-table" border="1" style="border-collapse: collapse"
! Name
|-
! Contact information
! Name  
! Website
! Contact info
! Open source experience
! Profile/site
! Description of interest  
! Description of interest  
|-
|-
| [http://mozillians.org/en/akshayaurora Akshay Aurora] (:system64)
| {{Mozillian|akshayaurora|Akshay Aurora}} (:system64)
| akshayaurora[at]yahoo.com
| akshayaurora[at]yahoo.com
| [http://iakshay.net Website] // [http://github.com/iakshay Github] // [http://linkedin.com/in/akshayaurora LinkedIn]
| [http://iakshay.net Website] // [http://linkedin.com/in/akshayaurora LinkedIn]
| [http://github.com/iakshay Github]
| Full stack developer passionate about open technologies
| Full stack developer passionate about open technologies
|-
|-
|-
| {{Mozillian|haseeb|Abdul Rauf}} (:haseeb)
| [http://mozillians.org/en/haseeb Abdul Rauf] (:haseeb)
| abdulraufhaseeb[at]gmail.com
| abdulraufhaseeb[at]gmail.com
| [http://www.haseeb.info Website] // [http://github.com/haseebgit Github] // [http://bitbucket.org/haseebbit BitBucket]
| [http://www.haseeb.info Website]
| [http://github.com/haseebgit Github] // [http://bitbucket.org/haseebbit BitBucket]
|
|
|-
|-
|-
| Tharshan Muthulingam
| Tharshan Muthulingam
| tharshan09[at]gmail.com
| tharshan09[at]gmail.com
| [http://tharshan.me Website] // [http://github.com/viperfx Github]
| [http://tharshan.me Website]
| [http://github.com/viperfx Github]
|  
|  
|-
|}
|}


== Contact ==
== Team liaisons ==
;Mentor & Reporter
; Mentor
* [https://mozillians.org/en-US/u/gueroJeff/ Jeff Beatty]
; Reporter
: {{Mozillian|gueroJeff|Jeff Beatty}}