MDN/Archives/Kuma/Scripting: Difference between revisions
LesOrchard (talk | contribs) No edit summary
m (Jswisher moved page MDN/Kuma/Scripting to MDN/Archives/Kuma/Scripting)
(33 intermediate revisions by 2 users not shown)
Latest revision as of 02:59, 17 December 2014
Overview
The mission of the Kuma wiki is to replace MindTouch. One of the advanced features of MindTouch is DekiScript, a Lua-based scripting environment for wiki authors.
We'd like to replace this feature with something secure, capable, modern, and convenient. We'd also like it if we could carry over the existing body of DekiScript with minimal manual changes. Let's see what's feasible.
DekiScript highlights
- Lua-based dynamic scripting language
- XML/HTML literals for easier construction of well-formed markup
- Inline {{ script expressions }} in wiki page content
- Long-form Template: scripts callable with parameters from inline expressions
- Local API with access to Wiki data queries (eg. custom tables of contents, etc)
- Network API with access to external web services (eg. RSS feeds, bugzilla, etc)
- L10N-aware constructs and alternate content
  - Example: Template:XULAttrInheritedWide
  - eg. span lang="en" class="lang lang-en", span lang="de" class="lang lang-de", span lang="*" class="lang lang-*"
- Safe for use by wiki content authors
  - (Not entirely sure what makes it safe, need more research)
  - Seems like long-form Template: scripts are editable only by elevated-permission users?
  - Normal content editors can make calls to Template: scripts, but not employ all scripting features?
Code samples
Here's some inline script excerpted from en/Firefox_9_for_developers:
<li>The <code>value</code> attribute of {{HTMLElement("li")}} now can be negative as specified in HTML5. Previously negative values were converted to 0.</li>
Here's the source of Template:HTMLElement, a commonly used script:
/* accepts as input one required parameter: HTML element to create a xref to */
var uri = uri.parts(Page.uri);
var lang = string.tolower(uri.path[0]);
if (string.contains(lang, "project") || string.contains(lang, "Project")) {
  let lang = string.substr(lang, 8);
}
/* fall back to page.language on a user page */
else if (string.StartsWith(lang, "user:")) {
  let lang = page.language;
}
var name = $0;
var sectionname = "Element";
if (!string.compare("es", string.tolower(lang))) {
  let sectionname = "Elemento";
}
if (args.title) {
  let name = args.title;
}
var dest = lang .. '/' .. 'HTML/' .. sectionname .. '/' .. name;
var destEng = 'en/HTML/Element/' .. name;
if (wiki.pageExists(dest)) { /* the page exists */
  <code>(web.link(wiki.uri(dest), '<' .. name .. '>'))</code>;
} else if (lang == 'zh_tw' && wiki.pageExists(destEng)) {
  /* the MozTW community consider links to English pages better than red ones. I'll write about this to mozilla.dev.mdc later */
  <code>(web.link(wiki.uri(destEng), '<' .. name .. '>'))</code>;
} else { /* the page doesn't exist */
  var targeturi = "https://developer.mozilla.org/Article_not_found?uri=" .. dest;
  <code><a rel=('internal') href=(targeturi) class=('new')>web.text('<' .. name .. '>')</a></code>
}
Miscellaneous notes
- Would like a relatively boring solution that matches core competencies
  - Components should have active existing communities
  - Try to stay close to dev team core-competencies
  - Try to make it sensible to host for care & feeding by IT
- Switch from Lua to JavaScript basis
  - JavaScript is more of a webdev core-competency
  - JS is pretty close to Lua
  - node.js has an active community, works out-of-the-box
    - Maybe consider SpiderNode, but only if we get benefits beyond not-invented-here purity
- Or, do something with Lua?
  - Wikipedia seems fond of Lua for their templates
- HTTP proxy service vs embedding JS
  - Embedding JS in a Python app seems experimental, and outside webdev team core-competency
  - HTTP is a webdev core-competency, and has loose-coupling and RESTafarian benefits
  - Taming a code execution environment in its own process seems safer and more easily scaled
- Taming semi- or untrusted JavaScript code
  - Built-in node.js sandboxing features
  - Sandbox to wrap JS execution in node.js?
    - Seems to have some more features than built-in node.js sandboxing
    - timeouts, restricted method access, graceful errors
- Use standalone embedded JS templates instead of literals embedded in script?
  - Employs code-in-markup instead of markup-in-code, and avoids the need for fancy compiler hijinx
  - This is being used in other projects
- Might js-xml-literals come in handy?
  - I failed on my first attempt to install, so haven't seen this in action.
  - This seems experimental; no one else really seems to be using it yet
- Should we get a compiler / language expert in on this for opinions?
  - This smells like dangerous excitement
- Could we port DekiScript directly? (It's open source, right?)
Proposed solution #1
The basic concept:
- Kuma (Django/Python) serves as the wiki/CMS for both general content and template scripts
- Document views are proxied through a node.js-based service that evaluates and executes inline macros.
- Support inline parameterized macros that invoke template scripts.
  - Macros are usable by all content editors
  - Macros are neutered function calls, basically, and not full JS expressions
  - Macros are handled by a custom parser, and definitely not just eval()'ed
- Support template scripts that are evaluated in a node.js sandbox as EJS templates.
  - Template scripts could be made editable only by elevated-privilege editors
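To make the "neutered function call" idea concrete, here is a minimal sketch of a macro expander that never calls eval(). Everything in it is an assumption for illustration: the names expandMacros and parseArgs, the regular expression, and the rule that arguments may only be quoted strings or numbers are not taken from the prototype. It also does not handle commas or closing parentheses inside string arguments, which a real parser would need to.

```javascript
// Hypothetical sketch: expand {{Name("arg", ...)}} macros without eval().
// The macro grammar (a name plus literal arguments) is an assumption.
const MACRO_RE = /\{\{\s*([A-Za-z_][\w:]*)\s*(?:\(([^)]*)\))?\s*\}\}/g;

// Turn an argument list into literal values. Anything that is not a
// quoted string or a plain number is rejected rather than evaluated.
function parseArgs(src) {
  if (src === undefined || src.trim() === "") return [];
  return src.split(",").map((raw) => {
    const arg = raw.trim();
    const str = /^"([^"]*)"$|^'([^']*)'$/.exec(arg);
    if (str) return str[1] !== undefined ? str[1] : str[2];
    if (/^-?\d+(\.\d+)?$/.test(arg)) return Number(arg);
    throw new Error("Unsupported macro argument: " + arg);
  });
}

// Replace each macro with the output of a registered template function.
// Unknown macro names are left untouched in the output.
function expandMacros(source, templates) {
  return source.replace(MACRO_RE, (match, name, argSrc) => {
    const known = Object.prototype.hasOwnProperty.call(templates, name);
    if (!known) return match;
    return templates[name](...parseArgs(argSrc));
  });
}
```

With a templates map such as { HTMLElement: (name) => ... }, page source containing {{HTMLElement("li")}} is rewritten by calling that one function with the string "li"; an argument like a bare identifier or an expression raises an error instead of being executed.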
Considering this from a high-level view of off-the-shelf parts that could be glued together:
- Kuma wiki CMS (Django/Python)
- Sandboxed server-side JavaScript filter proxy (node.js)
- EJS templates
- Convenience methods to perform wiki content queries and utility functions
  - Something like what DekiScript offers for wiki functions and variables
- Plentiful and intelligent HTTP-based caching
Prototype
lorchard has started work on a prototype in node.js: https://github.com/lmorchard/kumascript
Flow sketch
Here is a rough sketch of the process and network flows involved in this proposed solution:
[Figure: Kuma script flow]
More notes:
- If the caching works right, we should almost never need to load & compile templates unless they've been edited. (Cache invalidation? HEAD requests before GET requests?)
- When a Kuma wiki page is edited, the last-cached KumaScript response is discarded.
- To support preview functionality, we can accept HTTP POST of arbitrary page source at the start of the KumaScript flow. Not cached, which is a feature.
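The three caching notes above could be sketched roughly as follows. This is an in-memory stand-in, not the HTTP-based caching the proposal actually calls for, and the RenderCache class and its method names are hypothetical.

```javascript
// Hypothetical in-memory cache of rendered pages, keyed by page path.
// A real deployment would more likely lean on HTTP cache headers.
class RenderCache {
  constructor(render) {
    this.render = render;      // function (source) -> rendered HTML
    this.entries = new Map();  // page path -> rendered HTML
  }

  // Normal view: render on the first request, then serve the cached copy.
  view(path, loadSource) {
    if (!this.entries.has(path)) {
      this.entries.set(path, this.render(loadSource(path)));
    }
    return this.entries.get(path);
  }

  // Called when a wiki page is edited: discard the last-cached response.
  invalidate(path) {
    this.entries.delete(path);
  }

  // Preview: render arbitrary posted source without touching the cache.
  preview(source) {
    return this.render(source);
  }
}
```

The point of keeping preview separate is that a draft never pollutes what readers see, which matches the "not cached, which is a feature" note.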
Code samples (imaginary)
We can limit inline expressions to invoking long-form templates, rather than free-form scripting. Usage still looks familiar with respect to content "in the wild", though:
<li>The <code>value</code> attribute of {{HTMLElement("li")}} now can be negative as specified in HTML5. Previously negative values were converted to 0.</li>
Long-form templates become embedded JS templates, something like this:
<%
/* accepts as input one required parameter: HTML element to create a xref to */
var pageUri = uri.parts(Page.uri);
var lang = string.tolower(pageUri.path[0]);
if (string.contains(lang, "project") || string.contains(lang, "Project")) {
  lang = string.substr(lang, 8);
}
/* fall back to page.language on a user page */
else if (string.StartsWith(lang, "user:")) {
  lang = page.language;
}
var name = arguments[0];
var sectionname = "Element";
if (!string.compare("es", string.tolower(lang))) {
  sectionname = "Elemento";
}
if (args.title) {
  name = args.title;
}
var dest = lang + '/' + 'HTML/' + sectionname + '/' + name;
var destEng = 'en/HTML/Element/' + name;
if (wiki.pageExists(dest)) { /* the page exists */
%> <code><%- web.link(wiki.uri(dest), '<' + name + '>') %></code> <%
} else if (lang == 'zh_tw' && wiki.pageExists(destEng)) {
  /* the MozTW community consider links to English pages better than red ones. I'll write about this to mozilla.dev.mdc later */
%> <code><%- web.link(wiki.uri(destEng), '<' + name + '>') %></code> <%
} else { /* the page doesn't exist */
  var targeturi = "https://developer.mozilla.org/Article_not_found?uri=" + dest;
%> <code><a rel="internal" href="<%= targeturi %>" class="new"><%- web.text('<' + name + '>') %></a></code> <%
}
%>
The above presupposes an API similar to what's exposed to DekiScript, is non-functional, and mainly serves as a thought experiment in comparative syntax.
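For comparison, the branching in that template can also be written as a plain JavaScript function and exercised with a stub. This is only a sketch: htmlElementLink is a made-up name, the wiki object with pageExists and uri is a stand-in for an API that does not exist yet, and the element name is assumed to be trusted (no escaping is done).

```javascript
// Hypothetical helper mirroring the template's three branches. The `wiki`
// argument stands in for the assumed DekiScript-like API.
function htmlElementLink(wiki, lang, name) {
  const section = lang.toLowerCase() === "es" ? "Elemento" : "Element";
  const dest = lang + "/HTML/" + section + "/" + name;
  const destEng = "en/HTML/Element/" + name;
  const label = "&lt;" + name + "&gt;";

  if (wiki.pageExists(dest)) {
    // The localized page exists: link straight to it.
    return '<code><a href="' + wiki.uri(dest) + '">' + label + "</a></code>";
  }
  if (lang === "zh_tw" && wiki.pageExists(destEng)) {
    // zh_tw prefers a link to the English page over a red link.
    return '<code><a href="' + wiki.uri(destEng) + '">' + label + "</a></code>";
  }
  // No page found: point at the "article not found" URL instead.
  const target = "https://developer.mozilla.org/Article_not_found?uri=" + dest;
  return '<code><a rel="internal" href="' + target + '" class="new">' + label + "</a></code>";
}
```

Passing the wiki API in as a parameter keeps the logic testable without a running wiki, which may be worth preserving however the real template API ends up looking.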
Sandboxed JavaScript execution
- Can this be done in a way that restricts file, network, memory, and CPU usage?
  - Anything else dangerous and in need of restriction?
  - How restrictive do we need to be? (ie. is trusting a certain set of authors enough?)
  - How restrictive is DekiScript?
- Options inside node.js
  - node.js has sandboxing out-of-the-box
  - There's also Sandbox
- Options for host running node.js
  - No filesystem access at all (chroot?)
    - and/or LXC
  - Whitelisted network access (eg. firewall rules? limit base URLs of services?)
  - Limited execution time (eg. kill the process after 30000 msec?)
  - Limited memory usage (eg. kill the process after 10MB consumed?)
  - Auto-disable script if abuse detected (eg. penalty box for X minutes?)
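As a rough illustration of the "options inside node.js" item, the built-in vm module can run script text against an explicit set of globals with a time limit. The runSandboxed wrapper below is a hypothetical name, and this is only a sketch: Node's own documentation says vm is not a security mechanism by itself, so it would be one layer alongside the host-level restrictions listed above, and it does nothing about memory usage.

```javascript
const vm = require("vm");

// Run script text with only the globals passed in `api` and a time limit.
// Returns a result object instead of throwing, so one bad script cannot
// take down the caller.
function runSandboxed(code, api, timeoutMs) {
  const sandbox = Object.assign({}, api);
  try {
    const value = vm.runInNewContext(code, sandbox, { timeout: timeoutMs });
    return { ok: true, value: value };
  } catch (err) {
    return { ok: false, error: String(err && err.message) };
  }
}
```

Inside the new context, names such as require and process are simply not defined unless they are placed in api, and a script that loops forever is interrupted once the timeout elapses.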