
Mailing List Archive: Wikipedia: Wikitech

On templates and programming languages

 

 



Simetrical+wikilist at gmail

Jun 30, 2009, 6:33 PM

Post #51 of 116 (825 views)
Re: On templates and programming languages [In reply to]

On Tue, Jun 30, 2009 at 6:08 PM, Robert Rohde<rarohde[at]gmail.com> wrote:
> In addition to resource limits, any scheme better make sure what's
> passed into the programming language and what's passed out makes
> sense.  For example, you shouldn't have it generating raw HTML and
> probably shouldn't let it mess with strip markers.  Some of this may
> be automatic depending how it's integrated into the parser.  One would
> probably also want to limit the size of an allowed output (e.g. don't
> let it send 5 MB to the user).  Depending on the integration there may
> be other control sequences that one needs to catch when it returns as
> well.

I was assuming it would just return wikitext, and that would be
integrated into the page and parsed, following all limits on wikitext
(including size) -- just as with current parser functions.

> On a separate point, one of the limitations of stand-alone type
> sandboxes is that it would make it harder for the code to call other
> template pages.  One of the few virtues of the current template code
> is that it is relatively modular, with more complex templates being
> built out of less complex ones.  If this programming language is meant
> to replace that then it would also need to be able to reference the
> results of other template pages.  One solution is to pre-expand those
> sections (similar to what is done now, I believe), but that can get
> rather delicate once one has programming constructs like variable
> assignments, looping, and recursion since the template parameters
> won't necessarily be fixed at the Preprocessor stage.

I'd assume we'd support some kind of includes. One rudimentary way to
do it would be to run Lua stuff after or during preprocessing, so you
could just include Lua code macro-style using templates. A better way
would probably be to support the include features of the language
itself (I don't know how they work offhand, for Lua).

On Tue, Jun 30, 2009 at 6:12 PM, Jared
Williams<jared.williams1[at]ntlworld.com> wrote:
> Yeah, would also need time & mem use restrictions.

Which is impossible for in-process use. You'd have to shell out if
you do that, which defeats the entire point of using PHP instead of
something else to begin with.

On Tue, Jun 30, 2009 at 7:16 PM, Andrew Garrett<agarrett[at]wikimedia.org> wrote:
> That's just scary. We'd definitely want to do the validation as close
> as possible to the actual eval()ing, to minimise backdoors like
> Special:Import et al.

You'd be saving the code to a file on disk somewhere, probably named
using a hash of the input. The only thing saving the code would be
the code that sanitizes it. There's no way anything could go wrong
unless an attacker gains filesystem write access, in which case you're
hosed anyway. Parsing PHP on every page view when you could cache it
in APC is crazy.
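
A minimal sketch of the hash-named cache idea above, written in Python rather than PHP for brevity. The `.cached` suffix, the function names, and the `sanitize` callback are all assumptions made for illustration; nothing here is existing MediaWiki code.

```python
import hashlib
import os
import tempfile


def cache_path(cache_dir, source):
    """Name the cache file after the SHA-256 of the submitted source."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".cached")


def store_sanitized(cache_dir, source, sanitize):
    """Write the sanitized form of `source` once and return its path.

    `sanitize` is a hypothetical validator that returns safe code or
    raises on rejected input. It is the only code path that writes here.
    """
    path = cache_path(cache_dir, source)
    if not os.path.exists(path):
        clean = sanitize(source)
        # Write to a temporary file first, then rename atomically, so a
        # reader never sees a half-written cache entry.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(clean)
        os.replace(tmp_path, path)
    return path
```

Because the file name depends only on the input, a second request for the same source finds the existing file and skips both sanitizing and writing.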

On Tue, Jun 30, 2009 at 7:24 PM, Hay (Husky)<huskyr[at]gmail.com> wrote:
> That leaves us to Lua and Javascript, which are both small and
> efficient languages meant to solve tasks like this. Remember, i'm
> talking about 'core' Javascript here, not with all DOM methods and
> stuff. If you strip that all out (take a look at the 1.5. core
> reference at Mozilla.com:
> https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference) you
> get a pretty nice and simple language that isn't very large. Both
> would require a new parser and/or installed compilers on the
> server-side. Compared to the disadvantages of other options, that
> seems like a pretty small loss for a great win.

Reasonable enough, yeah. Sandboxing might be easier too. What are some
standalone JavaScript interpreters we could use? Ideally we'd use a
heavily-optimized JIT compiler, like V8 or TraceMonkey, but I don't
know if those work standalone.

On Tue, Jun 30, 2009 at 8:33 PM, Brion Vibber<brion[at]wikimedia.org> wrote:
> That's why we want to fix it! :)
>
> It *should* be fairly trivial to fetch a template/plugin sort of thing
> off of one wiki and put it on another. Consider this as one of our goals
> for next-gen templating.

Eh. Then that really ties our hands. If we have to have support for
shared hosts without exec() support, then I don't see any viable
option except sanitized PHP.

On Tue, Jun 30, 2009 at 8:37 PM, Brion Vibber<brion[at]wikimedia.org> wrote:
> Executing PHP from apache-writable files saved on disk is also a
> security danger.
>
> The original implementation of the MonoBook skin used the TAL templating
> language, which was compiled into executable PHP at runtime and stored
> in /tmp so it could be cached for the next view.
>
> In addition to difficulties with hosts which had misconfigured /tmp
> directories, we found that people sharing their hosts with
> poorly-secured WordPress installations would end up finding their wikis
> hacked -- worms exploiting vulnerabilities in other PHP apps would hop
> around the system modifying any .php files they could write to...
> including the cached PHPTAL templates.

It could be eval()ed by default, but the performance wins from using
APC would surely be huge. If you set it up carefully it should be
safe enough.

On Tue, Jun 30, 2009 at 8:41 PM, Brian<Brian.Mingus[at]colorado.edu> wrote:
> There is nothing in the OP that indicates that we are keeping the
> current template code or even that it would be desirable. Whatever
> facilities the language we choose has for including other files and
> passing arguments to functions is 100% sufficient.

We're talking about changing how templates are written, not how
they're called. Changing the template call syntax is an entirely
different discussion that's orthogonal to this one.

On Tue, Jun 30, 2009 at 9:02 PM, Trevor Parscal<tparscal[at]wikimedia.org> wrote:
> Seems like JSON syntax is pretty simple and could be a big improvement
> to how templates are currently invoked.

I'm not sure where you'd use JSON here?

_______________________________________________
Wikitech-l mailing list
Wikitech-l[at]lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


michael.daly at kayakwiki

Jun 30, 2009, 7:11 PM

Post #52 of 116 (823 views)
Re: On templates and programming languages [In reply to]

Thomas Dalton wrote:
>> Thus the first line of a template would
>> be the example of its use:
>>
>> Template:foobar
>> ----------------------------------------------------------------------
>> {{Foobar|$var1|$var2|$andAnotherVar}}
>> ...(implementation)...
>> ----------------------------------------------------------------------
>
> How does that work with anonymous variables? Do all $[NUMBER] style
> names count as auto-declared?

Template:foobar
----------------------------------------------------------------------
{{Foobar|$1|$2|$3}}
...(implementation)...
----------------------------------------------------------------------

That would make $4 a bit of text. Exactly the same kind of "template
prototype" - to borrow C's terminology. I see no reason to have
multiple different ways of identifying a variable, nor any reason to
have defaults or automatic declarations.

This of course would forbid the use of $n (where n = 1, 2, 3...) as a
synonym for a named variable. That is permitted, isn't it? I can't
remember.
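
A rough sketch of how the prototype line could be read, in Python. The regular expression and both function names are assumptions for illustration only; the proposal above does not specify a grammar.

```python
import re

# Hypothetical grammar: {{Name|$param|$param...}} on the first line.
PROTOTYPE = re.compile(r"^\{\{([^|{}]+)((?:\|\$\w+)*)\}\}$")


def parse_prototype(line):
    """Return (template_name, [declared parameter names])."""
    match = PROTOTYPE.match(line.strip())
    if match is None:
        raise ValueError("not a template prototype: %r" % line)
    name = match.group(1).strip()
    # group(2) looks like "|$1|$2|$3"; drop the empty first piece and the "$".
    params = [piece[1:] for piece in match.group(2).split("|") if piece]
    return name, params


def undeclared(body, params):
    """List $names used in the body that the prototype did not declare."""
    return [name for name in re.findall(r"\$(\w+)", body) if name not in params]
```

Under this reading, a `$4` in the body of a template whose prototype declares only `$1`, `$2` and `$3` is reported as undeclared, so it would be left as plain text.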

Mike




michael.daly at kayakwiki

Jun 30, 2009, 7:27 PM

Post #53 of 116 (823 views)
Re: On templates and programming languages [In reply to]

Chad wrote:

> Unless we plan on trying to mass-convert not only years of old revisions
> but change years-old behavior that millions of users have come to expect?
> I would expect _any_ change to keep {{sometemplate}} always working,
> even if the mechanics behind it change.

Why not switch the template syntax for articles to match the syntax for
tags (which in turn is based on XML or whatever syntax that comes from
ultimately)?

{{sometemplate|var1=foo|var2=bar}}

becomes

<sometemplate>var1=foo; var2=bar;</sometemplate>

or:

<sometemplate var1="foo" var2="bar"/>

That means that the tag namespace and the Template namespace (where
namespace is more generic than just the concept of MW namespaces) will
potentially clash. This could be handled with something like:

<template name="sometemplate">
var1=foo;
var2=bar;
</template>

or:

<template name="sometemplate" var1="foo" var2="bar"/>

which is a tad more verbose but more explicit.

Mike




rarohde at gmail

Jun 30, 2009, 7:45 PM

Post #54 of 116 (824 views)
Re: On templates and programming languages [In reply to]

On Tue, Jun 30, 2009 at 7:27 PM, Michael Daly<michael.daly[at]kayakwiki.org> wrote:
> Chad wrote:
>
>> Unless we plan on trying to mass-convert not only years of old revisions
>> but change years-old behavior that millions of users have come to expect?
>> I would expect _any_ change to keep {{sometemplate}} always working,
>> even if the mechanics behind it change.
>
> Why not switch the template syntax for articles to match the syntax for
> tags (which in turn is based on XML or whatever syntax that comes from
> ultimately)?
>
> {{sometemplate|var1=foo|var2=bar}}
>
> becomes
>
> <sometemplate>var1=foo; var2=bar;</sometemplate>
>
> or:
>
> <sometemplate var1="foo" var2="bar"/>
>
> That means that the tag namespace and the Template namespace (where
> namespace is more generic than just the concept of MW namespaces) will
> potentially clash.  This could be handled with something like:
>
> <template name="sometemplate">
> var1=foo;
> var2=bar;
> </template>
>
> or:
>
> <template name="sometemplate" var1="foo" var2="bar"/>
>
> which is a tad more verbose but more explicit.

Makes it awfully ugly to pass the result of one template to another
template if your syntax is:

<template name="sometemplate" var1="<template name="birthday" val="May
24" />" var2="bar"/>

-Robert Rohde



sergey.chernyshev at gmail

Jun 30, 2009, 7:45 PM

Post #55 of 116 (825 views)
Re: On templates and programming languages [In reply to]

I don't know about scripting languages for the templating, it might be
overkill.

When I was picking lower language for MediaWiki Widgets extension, I looked
at popular PHP templating systems and ended up picking Smarty (
http://smarty.net/) - it can be security locked, it has a few useful
features.

You can see Widget code here:
http://www.mediawikiwidgets.org/w/index.php?title=Widget:Google_Calendar&action=edit
and widget is called using a parser function like this:
{{widget: Name|param=val|param2=val2}}.

Double curlies are far from perfect, but there are not that many good
alternatives - XML is probably the only good alternative because it's
universal and used by many, many tools out there. Can't say that I'm an
expert in templating languages though, especially when we're talking about
power-users and not developers.

Thank you,

Sergey


--
Sergey Chernyshev
http://www.sergeychernyshev.com/


On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber <brion[at]wikimedia.org> wrote:

> As many folks have noted, our current templating system works ok for
> simple things, but doesn't scale well -- even moderately complex
> conditionals or text-munging will quickly turn your template source into
> what appears to be line noise.
>
> And we all thought Perl was bad! ;)
>
> There's been talk of Lua as an embedded templating language for a while,
> and there's even an extension implementation.
>
> One advantage of Lua over other languages is that its implementation is
> optimized for use as an embedded language, and it looks kind of pretty.
>
> An _inherent_ disadvantage is that it's a fairly rarely-used language,
> so still requires special learning on potential template programmers' part.
>
> An _implementation_ disadvantage is that it currently is dependent on an
> external Lua binary installation -- something that probably won't be
> present on third-party installs, meaning Lua templates couldn't be
> easily copied to non-Wikimedia wikis.
>
>
> There are perhaps three primary alternative contenders that don't
> involve making up our own scripting language (something I'd dearly like
> to avoid):
>
> * PHP
>
> Advantage: Lots of webbish people have some experience with PHP or can
> easily find references.
>
> Advantage: we're pretty much guaranteed to have a PHP interpreter
> available. :)
>
> Disadvantage: PHP is difficult to lock down for secure execution.
>
>
> * JavaScript
>
> Advantage: Even more folks have been exposed to JavaScript programming,
> including Wikipedia power-users.
>
> Disadvantage: Server-side interpreter not guaranteed to be present. Like
> Lua, would either restrict our portability or would require an
> interpreter reimplementation. :P
>
>
> * Python
>
> Advantage: A Python interpreter will be present on most web servers,
> though not necessarily all. (Windows-based servers especially.)
>
> Wash: Python is probably better known than Lua, but not as well as PHP
> or JS.
>
> Disadvantage: Like PHP, Python is difficult to lock down securely.
>
>
> Any thoughts? Does anybody happen to have a PHP implementation of a Lua
> or JavaScript interpreter? ;)
>
> -- brion


Simetrical+wikilist at gmail

Jun 30, 2009, 8:46 PM

Post #56 of 116 (824 views)
Re: On templates and programming languages [In reply to]

On Tue, Jun 30, 2009 at 10:45 PM, Sergey
Chernyshev<sergey.chernyshev[at]gmail.com> wrote:
> I don't know about scripting languages for the templating, it might be an
> overkill.

People are using ParserFunctions as a scripting language already.
That's not feasibly going to be removed at this point. So the only
way to go is to replace it with a better scripting language, which is
what we're talking about.



tstarling at wikimedia

Jun 30, 2009, 8:46 PM

Post #57 of 116 (823 views)
Re: On templates and programming languages [In reply to]

Brion Vibber wrote:
> There's been talk of Lua as an embedded templating language for a while,
> and there's even an extension implementation.
>
> One advantage of Lua over other languages is that its implementation is
> optimized for use as an embedded language, and it looks kind of pretty.
>
> An _inherent_ disadvantage is that it's a fairly rarely-used language,
> so still requires special learning on potential template programmers' part.
>
> An _implementation_ disadvantage is that it currently is dependent on an
> external Lua binary installation -- something that probably won't be
> present on third-party installs, meaning Lua templates couldn't be
> easily copied to non-Wikimedia wikis.

There are problems with all the shell-based solutions. MediaWiki
callbacks, like template expansion, {{VARIABLES}} and ifexist, are
commonly used in templates on Wikipedia, and a scripting language
without these would suffer from poor community buy-in. You could
implement them from the shell using IPC, but IPC in PHP is rather
cumbersome. The interface between the parser and the scripting engine
would be performance-sensitive, because users would write templates
that invoked the scripting engine hundreds of times in the course of
rendering an article. So there's a case there for a persistent
scripting engine with a command-based interface over a pipe.
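
As a rough illustration of a persistent engine with a command-based interface over a pipe, here is a sketch in Python. The line-delimited JSON protocol, the `Engine` class, and the stand-in child process (which only upper-cases text) are all assumptions; a real engine would run template scripts and would need a way to call back into MediaWiki.

```python
import json
import subprocess
import sys

# Stand-in "scripting engine": reads one JSON command per line on stdin
# and writes one JSON reply per line on stdout.
ENGINE_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": request["text"].upper()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class Engine:
    """Start the engine once, then reuse it for many calls."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", ENGINE_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self.next_id = 0

    def call(self, text):
        """Send one command and block until its reply arrives."""
        self.next_id += 1
        request = {"id": self.next_id, "text": text}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        reply = json.loads(self.proc.stdout.readline())
        if reply["id"] != self.next_id:
            raise RuntimeError("reply does not match request")
        return reply["result"]

    def close(self):
        """Closing stdin ends the child's read loop, so it exits cleanly."""
        self.proc.stdin.close()
        self.proc.wait()
```

The process start-up cost is paid once per `Engine`, not once per call, which is the point of keeping the engine persistent when a page invokes it hundreds of times.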

The reason I like Lua is because of the potential to embed it in PHP
as an extension, with fast setup and fast callbacks to MediaWiki. It
does all its memory allocation via a callback to the application,
including VM stack space, which means that it's possible to control
the memory usage without killing the process when the limit is
exceeded. But its standard library is unsuitable for running untrusted
scripts, since it contains all the usual process control and file
read/write functions.

The current PECL extension doesn't have any of the features that make
Lua attractive: it does not have support for callbacks to PHP, or for
replacing the standard library with something more sensible, or for
limiting memory without killing the request when the limit is
exceeded. Obviously the distributed standalone does not have these
features either.

I had imagined the task of embedding Lua in MediaWiki as being
primarily a C project, writing the necessary glue code between the
embedded interpreter and PHP. I had hoped that banging the drum for
Lua might encourage someone to look at these issues and start work on
that project.


> * PHP
>
> Advantage: Lots of webbish people have some experience with PHP or can
> easily find references.
>
> Advantage: we're pretty much guaranteed to have a PHP interpreter
> available. :)
>
> Disadvantage: PHP is difficult to lock down for secure execution.

PHP can be secured against arbitrary execution using token_get_all();
there's a proof-of-principle validator of this kind in the master
switch script project. But there are problems with attempting a
single-process PHP-in-PHP sandbox:

* The poor support for signals in PHP makes it difficult to limit the
execution time of a script snippet. Ticks only occur at the end of
each statement, so you can defeat them by making a single statement
that runs forever.

* Apart from blacklisting function definition, there is no way to
protect against infinite recursion, which exhausts the process stack
and causes a segfault.

* Memory limits are implemented on a per-request basis, and there's no
way to recover from exceeding the memory limit; the request is just
killed.
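
To make the whitelist-validator idea concrete, here is an analogous sketch in Python using the standard `ast` module. It is not the PHP validator mentioned above and does not use token_get_all(); the allowed node list and the parameter names `x` and `y` are assumptions chosen for illustration.

```python
import ast

# Only these syntax nodes may appear; anything else (calls, attribute
# access, subscripts, lambdas, comprehensions...) is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
    ast.Name, ast.Load,
)
ALLOWED_NAMES = {"x", "y"}  # hypothetical template parameters


def validate(source):
    """Parse `source` as one expression and reject anything off the whitelist."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            raise ValueError("disallowed name: " + node.id)
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
    return tree


def evaluate(source, **params):
    """Validate, then evaluate with no builtins and only the given parameters."""
    code = compile(validate(source), "<template>", "eval")
    return eval(code, {"__builtins__": {}}, params)
```

A validator like this only controls which constructs can appear. It does nothing about running time, recursion depth, or memory, which are the three problems listed above.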

> * JavaScript
>
> Advantage: Even more folks have been exposed to JavaScript programming,
> including Wikipedia power-users.
>
> Disadvantage: Server-side interpreter not guaranteed to be present. Like
> Lua, would either restrict our portability or would require an
> interpreter reimplementation. :P
>
>
> * Python
>
> Advantage: A Python interpreter will be present on most web servers,
> though not necessarily all. (Windows-based servers especially.)
>
> Wash: Python is probably better known than Lua, but not as well as PHP
> or JS.
>
> Disadvantage: Like PHP, Python is difficult to lock down securely.
>
>
> Any thoughts? Does anybody happen to have a PHP implementation of a Lua
> or JavaScript interpreter? ;)

SpiderMonkey and Python both lack control over memory usage. Python
lacks a sandbox mode, the rexec module has been removed. SpiderMonkey
isn't embedded in any useful kind of standalone, so you'd have to
start with a C development project, like you would for Lua.

I think Rhino would be an easier path to JavaScript execution than
SpiderMonkey. You can pass an -Xmx option to the java VM, and it'll
throw an OutOfMemory exception when it hits that limit, allowing you
to implement per-snippet memory limits without killing the
interpreter. You could do wall-clock time limits using
java.util.Timer, or CPU time limits using a JNI hack to poll clock().
You could turn off LiveConnect by making your own ClassShutter,
leaving what (on initial impressions) is a reasonably secure sandbox.
You'd still need an interface between Java and PHP, but presumably
that's a well-studied problem.

Running scripts in the Java VM has the advantage that you don't have
to rely on the security of the collection of amateurish C code that is
PHP. Remember those PCRE crash bugs that went unfixed for years,
before someone finally demonstrated elevation to arbitrary execution?
At a conference, I overheard Rasmus Lerdorf quip that really PHP is
pretty secure, since most of the demonstrated buffer/integer/heap
overflows needed arbitrary script access to exploit, and if the
attacker has that then you're screwed anyway.

-- Tim Starling




michael.daly at kayakwiki

Jun 30, 2009, 8:53 PM

Post #58 of 116 (824 views)
Re: On templates and programming languages [In reply to]

Robert Rohde wrote:

> Makes it awfully ugly to pass the result of one template to another
> template if your syntax is:
>
> <template name="sometemplate" var1="<template name="birthday" val="May
> 24" />" var2="bar"/>

Eww! - hadn't thought of that one. Back to the other style:

<template name="sometemplate">
var1=<template name="birthday"> val="May 24";</template>;
var2=bar;
</template>

or unnamed:

<template name="sometemplate">
<template name="birthday"> val="May 24";</template>;
bar;
</template>

Recursive template processing should be default. Obviously, this is a
work in progress...

Mike




thomas.dalton at gmail

Jun 30, 2009, 9:18 PM

Post #59 of 116 (825 views)
Re: On templates and programming languages [In reply to]

2009/7/1 Michael Daly <michael.daly[at]kayakwiki.org>:
> Why not switch the template syntax for articles to match the syntax for
> tags (which in turn is based on XML or whatever syntax that comes from
> ultimately)?

What is wrong with the current syntax for calling templates? At least,
what is wrong with it that would be improved by that change?



gmaxwell at gmail

Jun 30, 2009, 9:26 PM

Post #60 of 116 (824 views)
Re: On templates and programming languages [In reply to]

On Tue, Jun 30, 2009 at 12:16 PM, Brion Vibber<brion[at]wikimedia.org> wrote:
> As many folks have noted, our current templating system works ok for
> simple things, but doesn't scale well -- even moderately complex
> conditionals or text-munging will quickly turn your template source into
> what appears to be line noise.
>
> And we all thought Perl was bad! ;)
>
> There's been talk of Lua as an embedded templating language for a while,
> and there's even an extension implementation.
>
> One advantage of Lua over other languages is that its implementation is
> optimized for use as an embedded language, and it looks kind of pretty.
[snip]

So— Any thoughts on how you address the universal problem of the DOS
attack script?

I.e.
myscript:
do {
some_expensive_operation(); /* Presumably there will be hooks to pull
text from other revisions */
} while (1);

and in [[Template:Widely used]]
{{myscript}}


I'm under the impression that simply setting limits on CPU and memory
isn't sufficient to address this, because the reasonable limit will be
high enough to be dangerous when the object is added to 100k pages,
while a limit low enough to be safe everywhere will be far too
constraining and likely to fail at random depending on overall system
load.
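
The scaling concern can be put in rough numbers. In the Python sketch below, only the 100k page count comes from the text above; the per-invocation CPU caps are made-up values for illustration.

```python
def rerender_cost_hours(pages, cpu_seconds_per_page):
    """Total CPU-hours to re-render every page that uses the script once."""
    return pages * cpu_seconds_per_page / 3600.0


pages = 100_000  # "added to 100k pages"
for cap in (0.01, 0.1, 1.0):  # assumed per-invocation CPU caps, in seconds
    print(cap, "s cap:", round(rerender_cost_hours(pages, cap), 2), "CPU-hours")
```

With a cap of one CPU-second per invocation, a script that always runs up to its cap costs roughly 28 CPU-hours for a single full re-render of those pages, and each edit to the widely used template can trigger another one.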


> Disadvantage: Like PHP, Python is difficult to lock down securely.

I don't know that difficult is really the right description here.
People willing to spend far more effort on this than you probably are
have tried to sandbox python and failed. I don't believe there is any
real production grade support for the level of lockdown required for
either PHP or Python. And I'd worry that any PHP implementations of
the sandboxed languages might lose the battle tested sandboxing.

It's acceptable for MediaWiki to fall back to lower performing
alternatives when C modules can't be used, but I doubt it's acceptable
to fall back to less secure ones!

Is execution in environments where C modules are not possible actually
a hard requirement? If it is I think this is a non-starter.



gmaxwell at gmail

Jun 30, 2009, 9:46 PM

Post #61 of 116 (825 views)
Re: On templates and programming languages [In reply to]

On Tue, Jun 30, 2009 at 11:46 PM, Tim Starling<tstarling[at]wikimedia.org> wrote:
[snip]
> SpiderMonkey and Python both lack control over memory usage. Python
> lacks a sandbox mode, the rexec module has been removed. SpiderMonkey
> isn't embedded in any useful kind of standalone, so you'd have to
> start with a C development project, like you would for Lua.

Cpython has about a billion ways to inject machine code, this is one
reason why Rpython failed.
If you were to do python it would probably need to be embedded in java.

For spidermonkey the model I would have envisioned is a separate
script executor daemon which spawns thread-per-script (with limits to
keep the peak thread count reasonable) and arbitrates communication
with mediawiki over sockets. Memory limits then become a simple
exercise in providing an instrumented malloc and setting the thread
stack size appropriately.

This model has the advantage for big installations that script
processing can be compartmentalized and run only on certain systems or
only on certain cores. It would also allow the scripting process to be
more highly compartmentalized than PHP is, since it would only need
to be able to SBRK and read/write some sockets. (e.g.
http://en.wikipedia.org/wiki/Seccomp )

Another reason for using a narrow pipe interface is that it would be
possible to distinguish scripts which are a proper function of their
inputs from ones that aren't, and a narrow pipe interface makes it
easier to enforce those limits:

For example, there could be three script modes:
Function
Function+Date
Not-function

Functions are guaranteed to produce constant output for their input,
and their input can't include anything which is more volatile than
page editing. (i.e. no time/date as an input, no time/pid triggered
rand(), no retrieving data from logs or other pages). The output from
these could be trivially cached based on a hash of the input
arguments.

Function+date is like the above, but they also have access to the
current date (but not time). These could be cached but the cache would
be invalidated every day. This could be generalized further where the
script prototype could specify the available inputs. (i.e. is this a
function on page specific data, or is this just some formatting
template which works universally?)

Not-function means without those limits.
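
A small sketch of how the three modes might map onto caching, in Python. The mode strings, the `ScriptCache` class, and the `func(args, date)` calling convention are assumptions for illustration; the post above only describes the idea.

```python
import datetime
import hashlib
import json

MODES = ("function", "function+date", "not-function")


class ScriptCache:
    """Cache script output according to the declared script mode."""

    def __init__(self):
        self._store = {}

    def _key(self, mode, script_source, args, today):
        parts = [mode, script_source, json.dumps(args, sort_keys=True)]
        if mode == "function+date":
            # Including the date in the key means yesterday's entries are
            # never hit again; a real cache would also evict them.
            parts.append(today.isoformat())
        return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()

    def run(self, mode, script_source, args, func, today=None):
        if mode not in MODES:
            raise ValueError("unknown script mode: " + mode)
        today = today or datetime.date.today()
        if mode == "not-function":
            return func(args, today)  # never cached
        key = self._key(mode, script_source, args, today)
        if key not in self._store:
            # Pure functions are not even handed the date.
            visible_date = today if mode == "function+date" else None
            self._store[key] = func(args, visible_date)
        return self._store[key]
```

Keying on a hash of the script source plus its arguments means an edit to the script naturally misses the old entries, which matches the rule that inputs may be no more volatile than page editing.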


The different types of script could have resource limits, execution
priorities, and site policy controls. For example, wikimedia might
only allow function, function+revision_info for performance reasons.



questpc at rambler

Jun 30, 2009, 10:42 PM

Post #62 of 116 (825 views)
Re: On templates and programming languages [In reply to]

> 1 - XSLT
>
> Since the syntax is XML (like the extensions tags) and XPath (vaguely
> similar to template syntax, although it's XML that calls XPath, the
> opposite of what we have) it would be reasonably consistent with
> current syntax. It should also already be fairly well locked down, and
> the interface seems fairly clear - present template parameters as
> stylesheet parameters, and other magic words as an input document. We
> may just need a few simplifications to make it easier to use.
>
XSLT itself is way too locked down - even simple things like
substring manipulation and loops aren't so easy to perform. Well, maybe
I am too stupid for XSLT, but in my experience bringing tag syntax into a
programming language makes the code poorly readable and bloated. I've
used XSLT for just one of my projects.

> 2- lisp/scheme
>
> Should be easy to write a parser for if needed, since the grammar is so
> simple, and it should be relatively simple to lock down or extend as
> needed.
>
The deeply nested braces of Lisp remind me of the current MediaWiki parser.

> Of course, those are both a bit more esoteric than your recommendations.
> Perl is nice for getting useful results from short code, if we're not
> bothered by one parser with no grammar specification calling another
> one. Tcl may be a reasonable compromise; a less esoteric, imperative
> language which is often used as an extension language.
>
Lua was highly valued here at computer lab, also Ocaml (not sure of
proper spelling).
Dmitriy



gmaxwell at gmail

Jun 30, 2009, 11:17 PM

Post #63 of 116 (826 views)
Re: On templates and programming languages [In reply to]

On Wed, Jul 1, 2009 at 1:42 AM, Dmitriy Sintsov<questpc[at]rambler.ru> wrote:
> XSLT itself is way too locked down - even simple things like
> substring manipulation and loops aren't so easy to perform. Well, maybe
> I am too stupid for XSLT, but in my experience bringing tag syntax into a
> programming language makes the code poorly readable and bloated. I've
> used XSLT for just one of my projects.

Juniper Networks (my day job) uses XSLT as the primary scripting
language on their routing devices, and chose to do so primarily
because of sandboxing and the ease of XML tree manipulation with xpath
(JunOS configuration has a complete and comprehensive XML
representation). To facilitate that usage we defined an alternative
syntax for XSLT called SLAX (http://code.google.com/p/libslax/),
though it hasn't seen widespread adoption outside of Juniper yet.
(Slax can be mechanically converted to XSLT and vice versa)

SLAX pretty much resolves your readability concern, although the
conceptual barriers for people coming from procedural languages to
any strongly functional programming language still remain.

You don't loop in XSLT, you recurse or iterate over a structure (i.e.
map/reduce).

I've grown rather fond of XSLT but wouldn't personally recommend it
for this application. It lacks the high speed bytecoded execution
environments available for other languages, and I don't see many
scripts on the site doing extensive document tree manipulation (it's
hard for me to express how awesome xpath is at that)... and I would
also guess that there are probably more adept mediawiki template
language coders today than there are people who are really fluent in
XSLT.



questpc at rambler

Jun 30, 2009, 11:40 PM

Post #64 of 116 (824 views)
Re: On templates and programming languages [In reply to]

* Gregory Maxwell <gmaxwell[at]gmail.com> [Wed, 1 Jul 2009 02:17:24 -0400]:
>
> Juniper Networks (my day job) uses XSLT as the primary scripting
> language on their routing devices, and chose to do so primarily
> because of sandboxing and the ease of XML tree manipulation with xpath
> (JunOS configuration has a complete and comprehensive XML
> representation). To facilitate that usage we defined an alternative
> syntax for XSLT called SLAX (http://code.google.com/p/libslax/),
> though it hasn't seen widespread adoption outside of Juniper yet.
> (Slax can be mechanically converted to XSLT and vice versa)
>
> SLAX pretty much resolves your readability concern. Although there are
> the conceptual barriers for people coming from procedural languages to
> any strongly functional programming language still remain.
>
Have you tried submitting it as a standard? That would probably make
XSLT more popular.

> You don't loop in XSLT, you recurse or iterate over a structure (i.e.
> map/reduce).
>
Yes, I've realised that. I've done plenty of recursion (you can also
program in a functional style using procedural languages), but the
problem is that XSLT forces recursion where it's not really required.
Anyway, that's off-topic.

> I've grown rather fond of XSLT but wouldn't personally recommend it
> for this application. It lacks the high speed bytecoded execution
> environments available for other languages, and I don't see many
> scripts on the site doing extensive document tree manipulation (it's
> hard for me to express how awesome xpath is at that)... and I would
> also guess that there are probably more adept mediawiki template
> language coders today than there are people who are really fluent in
> XSLT.
>
Ok.
Dmitriy



william.allen.simpson at gmail

Jul 1, 2009, 12:50 AM

Post #65 of 116 (825 views)
Permalink
Re: On templates and programming languages [In reply to]

Haven't read the entire thread yet, so hopefully nobody has said this:

Perl, write-once, poor choice for uncontrolled environment.

Lisp, at least the computer science types will know. Haven't used it
myself since early '80s.

Lua, don't know whether it's improved in the past few years, but freeciv
had serious problems with migrating to 5.1. Personally, I've given up on
it, but my 14 y-o nephew seems to like it for various game modification.

Javascript, OMG don't go there.

Everybody seems to be going the python direction lately, but I've only
minimal experience with it, so cannot make a recommendation.

I'd worry less about providing extensive functionality (we certainly
don't have much now, so anything more would be gravy), and more about
ease of integration, scalability, and security.



gmaxwell at gmail

Jul 1, 2009, 1:35 AM

Post #66 of 116 (823 views)
Permalink
Re: On templates and programming languages [In reply to]

On Wed, Jul 1, 2009 at 3:50 AM, William Allen
Simpson<william.allen.simpson[at]gmail.com> wrote:
> Javascript, OMG don't go there.

Don't be so quick to dismiss Javascript. If we were making a scorecard
it would likely tick most of the checkboxes:

* Availability of reliable, battle-tested sandboxes (and probably the only
option discussed other than x-in-JVM meeting this criterion)
* Availability of fast execution engines
* Widely known by the existing technical userbase (JS beats the
other options hands down here)
* Already used by many Mediawiki developers
* Doesn't inflate the number of languages used in the operation of the site
* Possibility of reuse between server-executed and client-executed
code (only JS of the named options meets this criterion)
* Can easily write clear and readable code
* Modern high level language features (dynamic arrays, hash tables, etc)

There may exist great reasons why another language is a better choice,
but JS is far from the first thing that should be eliminated.

Python is a fine language but it fails all the criteria I listed above
except the last two.



huskyr at gmail

Jul 1, 2009, 1:44 AM

Post #67 of 116 (823 views)
Permalink
Re: On templates and programming languages [In reply to]

Javascript might have gotten a bad name in the past because of 14-year
olds who used it to display 'Welcome to my website!' alerts on their
Geocities homepage, but that's really unfair. Javascript is a very
flexible and dynamic language that can be written very elegantly.

I urge everyone who still thinks Javascript is a toy language to read
Douglas Crockford's excellent article:

http://javascript.crockford.com/javascript.html

-- Hay



jared.williams1 at ntlworld

Jul 1, 2009, 4:17 AM

Post #68 of 116 (812 views)
Permalink
Re: On templates and programming languages [In reply to]

> -----Original Message-----
> From: wikitech-l-bounces[at]lists.wikimedia.org
> [mailto:wikitech-l-bounces[at]lists.wikimedia.org] On Behalf Of
> Brion Vibber
> Sent: 30 June 2009 17:17
> To: Wikimedia developers
> Subject: [Wikitech-l] On templates and programming languages
>
> As many folks have noted, our current templating system works
> ok for simple things, but doesn't scale well -- even
> moderately complex conditionals or text-munging will quickly
> turn your template source into what appears to be line noise.
>
> And we all thought Perl was bad! ;)
>
> There's been talk of Lua as an embedded templating language
> for a while, and there's even an extension implementation.
>
> One advantage of Lua over other languages is that its
> implementation is optimized for use as an embedded language,
> and it looks kind of pretty.
>
> An _inherent_ disadvantage is that it's a fairly rarely-used
> language, so still requires special learning on potential
> template programmers' part.
>
> An _implementation_ disadvantage is that it currently is
> dependent on an external Lua binary installation -- something
> that probably won't be present on third-party installs,
> meaning Lua templates couldn't be easily copied to
> non-Wikimedia wikis.
>
>
> There are perhaps three primary alternative contenders that
> don't involve making up our own scripting language (something
> I'd dearly like to avoid):
>
> * PHP
>
> Advantage: Lots of webbish people have some experience with
> PHP or can easily find references.
>
> Advantage: we're pretty much guaranteed to have a PHP
> interpreter available. :)
>
> Disadvantage: PHP is difficult to lock down for secure execution.
>
>
> * JavaScript
>
> Advantage: Even more folks have been exposed to JavaScript
> programming, including Wikipedia power-users.
>
> Disadvantage: Server-side interpreter not guaranteed to be
> present. Like Lua, would either restrict our portability or
> would require an interpreter reimplementation. :P
>
>
> * Python
>
> Advantage: A Python interpreter will be present on most web
> servers, though not necessarily all. (Windows-based servers
> especially.)
>
> Wash: Python is probably better known than Lua, but not as
> well as PHP or JS.
>
> Disadvantage: Like PHP, Python is difficult to lock down securely.
>
>
> Any thoughts? Does anybody happen to have a PHP
> implementation of a Lua or JavaScript interpreter? ;)
>

Would you want the interpreter to translate the template into a PHP
array of opcodes first, so that it could be dumped into APC/memcached?
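
To make the question concrete, here is a minimal sketch of the idea in
Python (not PHP, and not taken from any existing MediaWiki code). The
opcode names, the `{{{param}}}`-only template grammar, and the plain dict
standing in for APC or memcached are all assumptions for illustration.

```python
import hashlib
import re

# Hypothetical opcodes; a real engine would define its own set.
OP_TEXT = "text"
OP_PARAM = "param"

# Matches {{{name}}} parameter references only (a deliberately tiny grammar).
_PARAM_RE = re.compile(r"\{\{\{(\w+)\}\}\}")

# Plain dict standing in for APC or memcached.
_opcode_cache = {}


def compile_template(source):
    """Translate template source into a flat list of (opcode, argument) pairs."""
    ops = []
    pos = 0
    for match in _PARAM_RE.finditer(source):
        if match.start() > pos:
            ops.append((OP_TEXT, source[pos:match.start()]))
        ops.append((OP_PARAM, match.group(1)))
        pos = match.end()
    if pos < len(source):
        ops.append((OP_TEXT, source[pos:]))
    return ops


def get_opcodes(source):
    """Return cached opcodes, compiling only on a cache miss."""
    key = hashlib.sha1(source.encode("utf-8")).hexdigest()
    if key not in _opcode_cache:
        _opcode_cache[key] = compile_template(source)
    return _opcode_cache[key]


def render(source, params):
    """Run the opcode list against a dict of parameters and return wikitext."""
    out = []
    for op, arg in get_opcodes(source):
        if op == OP_TEXT:
            out.append(arg)
        else:
            out.append(str(params.get(arg, "")))
    return "".join(out)


print(render("Hello, '''{{{name}}}'''!", {"name": "World"}))
```

The parse happens once per distinct template source; every later render
only walks the cached list.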

Jared




tparscal at wikimedia

Jul 1, 2009, 8:03 AM

Post #69 of 116 (807 views)
Permalink
Re: On templates and programming languages [In reply to]

I'm glad to see I'm not alone. JavaScript can indeed evoke bad
memories of fragile scripts running in IE5 which were long and awkward
due to limitations in browser technology at the time. However, anyone
who has used a modern library like jQuery on a supported browser will
tell you it's very powerful and intuitive while being simple,
straightforward and actually fun. Any language capable of supporting
this experience is worth seriously considering as an option for us.

- Trevor



william.allen.simpson at gmail

Jul 1, 2009, 8:21 AM

Post #70 of 116 (795 views)
Permalink
Re: On templates and programming languages [In reply to]

Hay (Husky) wrote:
> Javascript might have gotten a bad name in the past because of 14-year
> olds who used it to display 'Welcome to my website!' alerts on their
> Geocities homepage, but it's really unfair. Javascript is a very
> flexible and dynamic language that can be written very elegantly.
>
> I urge everyone who still think Javascript is a toy language to read
> Douglas Crockford's excellent article:
>
> http://javascript.crockford.com/javascript.html
>
Not very convincing.... "There are already too many versions. This creates
confusion." "Design Errors" "Lousy Implementations" "Substandard Standard"

"But many opinions of the language are based on its immature forms."
Admittedly true for me. Never want to use it in production again.


> On Wed, Jul 1, 2009 at 10:35 AM, Gregory Maxwell<gmaxwell[at]gmail.com> wrote:
>> ...
>> * Doesn't inflate the number of languages used in the operation of the site

This is the important checkbox, as far as integration with the project (my
first criterion), but is the server side code already running JavaScript?
For serving pages?


>> * Possibility of reuse between server-executed and client-executed
>> (Only JS of the named options meets this criteria)

I'd actually put this down as a negative. In my experience, security
requires a clear division between client and server. I've participated in
too many projects that thought sharing code would be cool, and then spent
a good part of my time building firewalls between client and server to
eliminate bad assumptions about the validity of the other side.

My general rule: coming over the network, presume it's bad data.

Double/quadruple/octuple that for any data that is then executed as a
script. In effect, build an interpreter within the interpreter to validate
the code before execution of the code. Never fun....
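
As a minimal sketch of that "validate before you execute" step, assuming
Python 3.8+ and a deliberately tiny language of arithmetic on numeric
literals: the whitelist and the function names below are made up for
illustration, and this does nothing about CPU or memory limits.

```python
import ast

# Node types the hypothetical sandbox accepts: arithmetic on numeric literals only.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)


def validate(source):
    """Parse untrusted source and reject anything outside the whitelist."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed construct: " + type(node).__name__)
        # Strings, bytes, booleans and None are literals too; refuse them.
        if isinstance(node, ast.Constant) and type(node.value) not in (int, float):
            raise ValueError("only numeric literals are allowed")
    return tree


def safe_eval(source):
    """Validate first, then evaluate with no builtins and no names in scope."""
    tree = validate(source)
    return eval(compile(tree, "<untrusted>", "eval"), {"__builtins__": {}}, {})


print(safe_eval("1 + 2 * 3"))  # 7
```

Anything that is not on the list (names, calls, attribute access) is
refused before a single instruction of the submitted code runs.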


>> * Can easily write clear and readable code

Not in my experience. And we have far too many examples of existing JS
already being used in horrid templates, being promulgated in important
areas such as large categories, that don't seem to work consistently, and
don't work at all with JavaScript turned off.

I run Firefox with JS off by default for all wikimedia sites, because of
serious problems in the not so recent past!




william.allen.simpson at gmail

Jul 1, 2009, 8:26 AM

Post #71 of 116 (795 views)
Permalink
Re: On templates and programming languages [In reply to]

William Allen Simpson wrote:
> I run Firefox with JS off by default for all wikimedia sites, because of
> serious problems in the not so recent past!
>
s/recent/distant/



lists at schwen

Jul 1, 2009, 8:36 AM

Post #72 of 116 (797 views)
Permalink
Re: On templates and programming languages [In reply to]

>> I run Firefox with JS off by default for all wikimedia sites, because of
>> serious problems in the not so recent past!
> s/recent/distant/

Hooray JavaScript FUD!



huskyr at gmail

Jul 1, 2009, 8:37 AM

Post #73 of 116 (797 views)
Permalink
Re: On templates and programming languages [In reply to]

On Wed, Jul 1, 2009 at 5:26 PM, William Allen
Simpson<william.allen.simpson[at]gmail.com> wrote:
> William Allen Simpson wrote:
>> I run Firefox with JS off by default for all wikimedia sites, because of
>> serious problems in the not so recent past!
>>
> s/recent/distant/
I'm sorry that you seem to have had such bad experiences with JavaScript.
Still, I don't think your comments are really valid in today's world.
Take a look at 'web 2.0-style' applications, such as Gmail or Google
Maps. Without Javascript, stuff like that would simply be impossible in
a web browser unless you depended on proprietary technology such as
Flash. Recent effort in all modern web browsers (including IE) has gone
mostly into optimizing Javascript engines. Whether you like it or not,
Javascript is here to stay.

Of course, this debate shouldn't really be about what people like or
dislike in a certain programming language. It should be about what the
best option is for Mediawiki template programming. A small scripting
language serves that goal best, so that leaves us with Lua and
Javascript. Lua is pretty cool too, but isn't as well known as
Javascript, and as far as I know they are pretty similar in most
aspects.

-- Hay



mrzmanwiki at gmail

Jul 1, 2009, 10:30 AM

Post #74 of 116 (794 views)
Permalink
Re: On templates and programming languages [In reply to]

Trevor Parscal wrote:
> I'm glad to see I'm not alone. JavaScript can indeed invoke bad
> memories of fragile scripts running in IE5 which are long and awkward
> due to limitations in browser technology at the time. However, anyone
> who has used a modern library like jQuery on a support browser will
> tell you it's very powerful and intuitive while being simple,
> straightforward and actually fun. Any language capable of supporting
> this experience is worth seriously considering as an option for us.
>

Of course, little in the jQuery library would be useful for making
scripts that are executed server-side and output wikitext.

--
Alex (wikipedia:en:User:Mr.Z-man)



gmaxwell at gmail

Jul 1, 2009, 10:57 AM

Post #75 of 116 (794 views)
Permalink
Re: On templates and programming languages [In reply to]

On Wed, Jul 1, 2009 at 11:21 AM, William Allen
Simpson<william.allen.simpson[at]gmail.com> wrote:
>>> * Doesn't inflate the number of languages used in the operation of the site
>
> This is the important checkbox, as far as integration with the project (my
> first criterion), but is the server side code already running JavaScript?
> For serving pages?

No, but MediaWiki and the sites are already chock-full of client-side
code in JS.

You basically can't do advanced development for MediaWiki or the
Wikimedia sites without a degree of familiarity with Javascript due to
client compatibility considerations.

> My general rule: coming over the network, presume it's bad data.

In this case we're not talking about the language MediaWiki is written
in, we're talking about a language used for server-side content
automation (templates). In that case we'd be assuming the inputs are
toxic just like in the client-side case, since everything, including
the code itself, came in over the network.

I'll concede that there likely wouldn't be much code reuse, but I'd
attribute that more to the starkly different purpose and the fact that
the server version would have a different API (no DOM, but instead
functions for pulling data out of mediawiki).
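
To give a feel for what such a server-side API might look like, here is
a purely hypothetical sketch in Python. The `WikiData` class, its method
names, and the sample page are all invented for illustration; nothing
like this exists in MediaWiki as far as this thread is concerned.

```python
class WikiData:
    """Hypothetical read-only API handed to a server-side script instead of a DOM."""

    def __init__(self, pages):
        # title -> dict of properties; a stand-in for the real database.
        self._pages = pages

    def page_exists(self, title):
        return title in self._pages

    def get_property(self, title, name, default=""):
        return self._pages.get(title, {}).get(name, default)


def population_line(wiki, title):
    """Script body: pull data through the API and return wikitext, never HTML."""
    if not wiki.page_exists(title):
        return "[[%s]] (no data)" % title
    pop = wiki.get_property(title, "population", "unknown")
    return "[[%s]] has a population of '''%s'''." % (title, pop)


wiki = WikiData({"Exampleville": {"population": 1234}})
print(population_line(wiki, "Exampleville"))
```

The script never sees a document tree or a request object; it gets a
narrow data interface and hands back a wikitext string for the parser.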


> And we have far too many examples of existing JS
> already being used in horrid templates, being promulgated in important
> areas such as large categories, that don't seem to work consistently, and
> don't work at all with JavaScript turned off.
> I run Firefox with JS off by default for all wikimedia sites, because of
> serious problems in the not so recent past!

Fortunately this is a non-issue here: better server-side scripting
enhances the site's ability to operate without requiring scripting on
the client.
