MDN/Archives/Kuma/Scripting: Difference between revisions

(Created page with "== Overview == The mission of the Kuma wiki is to replace MindTouch. One of the advanced features of MindTouch is [http://developer.mindtouch.com/en/docs/DekiScript DekiScript]...")
 
 
* Local API with access to Wiki data queries (eg. custom tables of contents, etc)
* Network API with access to external web services (eg. RSS feeds, bugzilla, etc)
* L10N-aware constructs and alternate content
** Example: [https://developer.mozilla.org/Template:XULAttrInheritedWide Template:XULAttrInheritedWide]
** eg. span lang="en" class="lang lang-en", span lang="de" class="lang lang-de", span lang="*" class="lang lang-*"
* Safe for use by wiki content authors
** (Not entirely sure what makes it safe, need more research)
*** Normal content editors can make calls to Template: scripts, but not employ all scripting features?


== Proposed high-level infrastructure ==
=== Code samples ===
 
Here's some inline script excerpted from [https://developer.mozilla.org/en/Firefox_9_for_developers en/Firefox_9_for_developers]:<pre>
    <li>The <code>value</code> attribute of {{HTMLElement("li")}} now can be
    negative as specified in HTML5. Previously negative values were converted
    to 0.</li>
</pre>
 
Here's the source of [http://developer-dev.mozilla.org/Template:HTMLElement Template:HTMLElement], a commonly used script:<pre>
    /* accepts as input one required parameter: HTML element to create a xref to */
    var uri = uri.parts(Page.uri);
    var lang = string.tolower(uri.path[0]);
    if (string.contains(lang, "project") || string.contains(lang, "Project")) {
      let lang = string.substr(lang, 8);
    }
    /* fall back to page.language on a user page */
    else if (string.StartsWith(lang, "user:")) {
      let lang = page.language;
    }
    var name = $0;
    var sectionname = "Element";
 
    if (!string.compare("es", string.tolower(lang))) {
      let sectionname = "Elemento";
    }
 
    if (args.title) {
      let name = args.title;
    }
    var dest = lang .. '/' .. 'HTML/' .. sectionname .. '/' .. name;
    var destEng = 'en/HTML/Element/' .. name;
    if (wiki.pageExists(dest)) { /* the page exists */
      <code>(web.link(wiki.uri(dest), '<' .. name .. '>'))</code>;
    } else if (lang == 'zh_tw' && wiki.pageExists(destEng)){
      /* the MozTW community consider links to English pages better than red ones.
        I'll write about this to mozilla.dev.mdc later */ 
      <code>(web.link(wiki.uri(destEng), '<' .. name .. '>'))</code>;
    } else { /* the page doesn't exist */
      var targeturi = "https://developer.mozilla.org/Article_not_found?uri=" .. dest;
      <code><a rel=('internal') href=(targeturi) class=('new')>web.text('<' .. name .. '>')</a></code>
    }
</pre>
 
== Miscellaneous notes ==
 
* Would like a relatively boring solution that matches core competencies
** Components should have active existing communities
** Try to stay close to dev team core-competencies
** Try to make it sensible to host for care & feeding by IT
 
* Switch from Lua to JavaScript basis
** JavaScript is more of a webdev core-competency
** JS is pretty close to Lua
** node.js has an active community, works out-of-the-box
*** Maybe consider [https://github.com/zpao/spidernode SpiderNode], but only if we get benefits beyond not-invented-here purity
 
* Or, do something ''with'' Lua?
** [http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2012-01-30/Technology_report Wikipedia seems fond of Lua for ''their'' templates]
 
* HTTP proxy service vs embedding JS
** Embedding JS in a Python app seems experimental, and outside webdev team core-competency
** HTTP is a webdev core-competency, and has loose-coupling and RESTafarian benefits
** Taming a code execution environment in its own process seems safer and more easily scaled
 
* Taming semi- or untrusted JavaScript code
** [http://nodejs.org/docs/v0.3.1/api/vm.html#vm.runInNewContext Built-in node.js sandboxing features]
** [http://gf3.github.com/sandbox/ Sandbox] to wrap JS execution in node.js?
*** Seems to have some more features than built-in node.js sandboxing
*** timeouts, restricted method access, graceful errors
 
* Use standalone [https://github.com/visionmedia/ejs embedded JS templates] instead of literals embedded in script?
** Employs code-in-markup instead of markup-in-code, and avoids the need for fancy compiler hijinx
** This is being used in other projects
 
* Might [https://github.com/laverdet/js-xml-literal js-xml-literals] come in handy?
** [https://github.com/laverdet/js-xml-literal/issues/2 I failed on my first attempt to install], so haven't seen this in action.
** This seems experimental; no one else really seems to be using it yet
 
* Should we get a compiler / language expert in on this for opinions?
** This smells like dangerous excitement
 
* Could we port DekiScript directly? (It's open source, right?)
 
== Proposed solution #1 ==
 
The basic concept:
 
* Kuma (Django/Python) serves as the wiki/CMS for both general content and template scripts
* Document views are proxied through a node.js-based service that evaluates and executes inline macros.
* Support inline parameterized '''macros''' that invoke template scripts.
** Macros are usable by all content editors
** Macros are neutered function calls, basically, and not full JS expressions
** Macros are handled by a custom parser, and definitely not just <code>eval()</code>'ed
* Support '''template scripts''' that are evaluated in a node.js sandbox as [https://github.com/visionmedia/ejs EJS templates].
** '''Template scripts''' could be made editable only by elevated-privilege editors
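A minimal sketch of what "a custom parser, not <code>eval()</code>" could look like. This is an illustration only, not the prototype's code: the names <code>parseArgs</code> and <code>expandMacros</code>, the accepted argument grammar (quoted strings and plain numbers), and the idea of passing templates as a map of functions are all assumptions.

```javascript
// Hypothetical sketch: expand {{Name("arg", 'arg', 123)}} macros without eval().
// Only a template name plus literal string/number arguments are accepted.
const MACRO_RE = /\{\{\s*([A-Za-z_][\w:.-]*)\s*(?:\(([^{}]*)\))?\s*\}\}/g;
const ARG_RE = /\s*(?:"((?:[^"\\]|\\.)*)"|'((?:[^'\\]|\\.)*)'|(-?\d+(?:\.\d+)?))\s*(?:,|$)/y;

// Turn backslash escapes inside a quoted argument into the escaped character.
function unquote(body) {
  return body.replace(/\\(.)/g, "$1");
}

// Parse the text between the parentheses into an array of literal values.
function parseArgs(raw) {
  const args = [];
  if (raw === undefined || raw.trim() === "") return args;
  ARG_RE.lastIndex = 0;
  while (ARG_RE.lastIndex < raw.length) {
    const m = ARG_RE.exec(raw);
    if (m === null) {
      // Anything that is not a literal (eg. a nested call) is rejected outright.
      throw new SyntaxError("unsupported macro argument list: " + raw);
    }
    if (m[1] !== undefined) args.push(unquote(m[1]));
    else if (m[2] !== undefined) args.push(unquote(m[2]));
    else args.push(Number(m[3]));
  }
  return args;
}

// Replace each known macro with the output of its template function.
// Unknown macro names are left in the page source untouched.
function expandMacros(source, templates) {
  return source.replace(MACRO_RE, (whole, name, rawArgs) => {
    if (!Object.prototype.hasOwnProperty.call(templates, name)) return whole;
    return templates[name](parseArgs(rawArgs));
  });
}
```

Because arguments are restricted to literals, something like <code><nowiki>{{HTMLElement(alert(1))}}</nowiki></code> fails to parse rather than being executed.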


Considering this from a high-level view of off-the-shelf parts that could be glued together:


* Kuma wiki CMS (Django/Python)
* Sandboxed server-side JavaScript filter proxy (node.js)
* [https://github.com/visionmedia/ejs EJS templates]
* Convenience methods to perform wiki content queries and utility functions
** Something like what [http://developer.mindtouch.com/en/docs/DekiScript/Reference/Wiki_Functions_and_Variables DekiScript offers for wiki functions and variables]
* Plentiful and intelligent HTTP-based caching


=== Prototype ===
 
lorchard has started work on a prototype in node.js:
* https://github.com/lmorchard/kumascript
 
=== Flow sketch ===
 
Here is a rough sketch of the process and network flows involved in this proposed solution:
 
[[File:Kumascript.png|1000px|Kuma script flow]]
 
More notes:
* If the caching works right, we should be able to almost never load & compile templates unless they're edited. (Cache invalidation? HEAD requests before GET requests?)
* When a Kuma wiki page is edited, the last-cached KumaScript response is discarded.
* To support preview functionality, we can accept HTTP POST of arbitrary page source at the start of the KumaScript flow. Not cached, which is a feature.
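The three notes above can be sketched as a small in-memory cache. This is a rough illustration under assumptions: the class name <code>RenderCache</code> and its methods are hypothetical, <code>fetchSource</code> stands in for a GET to Kuma, and <code>render</code> stands in for macro evaluation. A real deployment would more likely lean on HTTP caching headers than on a process-local map.

```javascript
// Hypothetical sketch of the caching rules: cache rendered pages,
// discard an entry when the page is edited, never cache previews.
class RenderCache {
  constructor(fetchSource, render) {
    this.fetchSource = fetchSource; // path -> raw wiki source
    this.render = render;           // raw wiki source -> rendered HTML
    this.rendered = new Map();      // path -> last rendered HTML
  }

  // Normal page view: render once, then serve the cached result.
  view(path) {
    if (!this.rendered.has(path)) {
      this.rendered.set(path, this.render(this.fetchSource(path)));
    }
    return this.rendered.get(path);
  }

  // Called when a page is edited: drop the last-cached response.
  invalidate(path) {
    this.rendered.delete(path);
  }

  // Preview of arbitrary POSTed source: rendered every time, never stored.
  preview(source) {
    return this.render(source);
  }
}
```

The same shape could apply to compiled templates, keyed by template name instead of page path.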
 
=== Code samples (imaginary) ===
 
We can limit inline expressions to invoking long-form templates, rather than free-form scripting. Usage still looks familiar with respect to content "in the wild", though:<pre>
    <li>The <code>value</code> attribute of {{HTMLElement("li")}} now can be
    negative as specified in HTML5. Previously negative values were converted
    to 0.</li>
</pre>
 
Long-form templates become [https://github.com/visionmedia/ejs embedded JS templates], something like this:<pre>
<%
    /* accepts as input one required parameter: HTML element to create a xref to */
    var parts = uri.parts(Page.uri);
    var lang = string.tolower(parts.path[0]);
    if (string.contains(lang, "project") || string.contains(lang, "Project")) {
      lang = string.substr(lang, 8);
    }
    /* fall back to page.language on a user page */
    else if (string.StartsWith(lang, "user:")) {
      lang = page.language;
    }
    var name = arguments[0];
    var sectionname = "Element";
 
    if (!string.compare("es", string.tolower(lang))) {
      sectionname = "Elemento";
    }
 
    if (args.title) {
      name = args.title;
    }
    var dest = lang + '/' + 'HTML/' + sectionname + '/' + name;
    var destEng = 'en/HTML/Element/' + name;
    if (wiki.pageExists(dest)) { /* the page exists */
      %> <code><%- web.link(wiki.uri(dest), '<' + name + '>') %></code> <%
    } else if (lang == 'zh_tw' && wiki.pageExists(destEng)){
      /* the MozTW community consider links to English pages better than red ones.
        I'll write about this to mozilla.dev.mdc later */ 
      %> <code><%- web.link(wiki.uri(destEng), '<' + name + '>') %></code> <%
    } else { /* the page doesn't exist */
      var targeturi = "https://developer.mozilla.org/Article_not_found?uri=" + dest;
      %> <code><a rel="internal" href="<%= targeturi %>" class="new"><%- web.text('<' + name + '>') %></a></code> <%
    }
%>
</pre>
 
The above presupposes an API similar to what's exposed to DekiScript, is
non-functional, and mainly serves as a thought experiment in comparative
syntax.
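
For comparison, the link-resolution rules in that template can be restated as a plain JavaScript function that runs without any wiki API. This is only a sketch: <code>resolveElementLink</code> is a hypothetical name, and <code>pageExists</code> is a stand-in for <code>wiki.pageExists()</code> that the caller supplies.

```javascript
// Hypothetical restatement of the template's link-resolution rules.
// pageExists(path) stands in for wiki.pageExists() and is passed in by the caller.
function resolveElementLink(lang, name, pageExists) {
  // Spanish pages live under "Elemento"; every other language uses "Element".
  const section = lang.toLowerCase() === "es" ? "Elemento" : "Element";
  const dest = lang + "/HTML/" + section + "/" + name;
  const destEng = "en/HTML/Element/" + name;

  if (pageExists(dest)) {
    return { target: dest, missing: false };
  }
  if (lang === "zh_tw" && pageExists(destEng)) {
    // zh_tw prefers a link to the English page over a red link.
    return { target: destEng, missing: false };
  }
  return {
    target: "https://developer.mozilla.org/Article_not_found?uri=" + dest,
    missing: true,
  };
}
```

Passing the existence check in as a function keeps the rules testable with a simple set of known page paths.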
 
=== Sandboxed JavaScript execution ===
 
* Can this be done in a way that restricts file, network, memory, and CPU usage?
** Anything else dangerous and in need of restriction?
** How restrictive do we need to be? (ie. is trusting a certain set of authors enough?)
** How restrictive is DekiScript?
 
* Options inside node.js
** [http://nodejs.org/docs/v0.3.1/api/vm.html#vm.runInNewContext node.js has sandboxing out-of-the-box]
** There's also [http://gf3.github.com/sandbox/ Sandbox]
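
A minimal sketch of the built-in option, using the <code>vm</code> module from a current node.js release (the linked v0.3.1 API predates the <code>timeout</code> option used here). The helper name <code>runSandboxed</code> and its return shape are assumptions. Note that the node.js documentation itself states that <code>vm</code> is not a security mechanism, so this would need to be combined with process-level limits.

```javascript
const vm = require("vm");

// Hypothetical helper: evaluate `code` with only the names in `api` visible
// as globals, and give up if it runs longer than `timeoutMs`.
function runSandboxed(code, api, timeoutMs) {
  // Copy the API so the script cannot replace properties on the caller's object.
  const context = vm.createContext({ ...api });
  try {
    const value = vm.runInContext(code, context, { timeout: timeoutMs });
    return { ok: true, value: value };
  } catch (err) {
    // Covers both script errors and the timeout error raised by vm.
    return { ok: false, error: String(err && err.message) };
  }
}
```

Inside the new context, node.js globals such as <code>require</code> and <code>process</code> are not defined unless they are explicitly passed in through <code>api</code>.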


* Options for host running node.js
** No filesystem access at all (chroot?)
*** and/or [http://lxc.sourceforge.net/ LXC]
** Whitelisted network access (eg. firewall rules? limit base URLs of services?)
** Limited execution time (eg. kill the process after 30000 msec?)
** Limited memory usage (eg. kill the process after 10MB consumed?)
** Auto-disable script if abuse detected (eg. penalty box for X minutes?)
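
The execution-time limit can be sketched with a separate node.js process that is killed on timeout. This is an assumption-laden illustration: <code>runWithLimits</code> is a hypothetical name, and the sketch covers only the time limit. Filesystem, network, and memory restrictions would have to come from the host (chroot, LXC, firewall rules) and are not shown.

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: run `script` in a separate node process and kill it
// if it has not finished within `timeoutMs`.
function runWithLimits(script, timeoutMs) {
  const result = spawnSync(process.execPath, ["-e", script], {
    timeout: timeoutMs,    // child is killed once this many msec have elapsed
    killSignal: "SIGKILL", // not catchable by the child
    encoding: "utf8",
  });
  // spawnSync reports a timeout as an error with code ETIMEDOUT.
  const timedOut = Boolean(result.error && result.error.code === "ETIMEDOUT");
  return { timedOut: timedOut, status: result.status, stdout: result.stdout };
}
```

A synchronous call keeps the sketch short. A real service would use the asynchronous <code>spawn</code> so that one slow script does not block other requests.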