Bugzilla:Languages

From MozillaWiki

This is a basic overview of the languages that Bugzilla could move to.

Please put discussion on the discussion page. This page is only for technical pros and cons of each language from the perspective of Bugzilla.

Of all of the ones I (mkanat) have used, I'm most in favor of Ruby as a language, because I think it'd be the fastest to write code in and it seems well-designed for writing large applications. That doesn't necessarily mean it has the best frameworks or libraries available.

In addition to looking at the languages themselves, we should also investigate the various toolkits available for each language that would make our development easier. For example, Python has Pylons and lots of other frameworks, and Ruby has Rails and Nitro. If one particular language has a framework that would be perfect for us, that could swing the decision.

The existence of this page does not mean that Bugzilla is abandoning Perl. This is only some research being done, and a move may or may not ever happen (consensus among the developers currently says it won't, but the discussion is probably worth having anyway).

Primary Considerations in Picking a Language

1. Ease of Development: We want to rapidly create new features and not have to worry about little details.

2. Ease of Modification: We want to be able to change the code without too much work. The more "abstraction" opportunities a language or framework offers, the better off we are.

3. Performance: It shouldn't perform significantly worse than current Bugzilla code.

4. Available libraries: We don't want to have to re-write the things that we're using now from CPAN. We also want libraries that are generally available, or built into the language, that we can use in the future.

5. i18n and l10n: We want to be able to easily localize Bugzilla in different languages without people having to re-write the templates like they do now. Anything supporting GNU Gettext in an easy way would be great.

6. Security: Perl has a Taint mode, which encourages security. We want a language that has good security-oriented features, and that has a community with a history of paying attention to security issues.

7. Enforcement of Good Code: One place where Perl falls down is that it doesn't enforce any good coding standards. A language that does would be welcome.
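
As a rough illustration of consideration 5, here is a minimal sketch of what Gettext-style localization looks like in one of the candidate languages, using the gettext module from Python's standard library. The "bugzilla" domain, the "locale" directory, the function name, and the message text are all hypothetical names chosen for this example. Nothing here reflects how Bugzilla actually handles localization.

```python
import gettext


def render_update_notice(bug_id, language):
    """Return a (possibly translated) notice for the given bug.

    The "bugzilla" domain and the "locale" directory are hypothetical.
    """
    # fallback=True yields a NullTranslations object when no compiled
    # catalog (.mo file) is found, so the original English string is used.
    catalog = gettext.translation(
        "bugzilla", localedir="locale", languages=[language], fallback=True
    )
    _ = catalog.gettext
    # The message is written once in the code; only the string catalog
    # differs between languages.
    return _("Bug %(id)d has been updated.") % {"id": bug_id}


print(render_update_notice(42, "de"))
```

With no catalog installed, the call falls back to the untranslated English string. The point of the sketch is that localizers would translate a string catalog rather than re-write templates.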

Frameworks Under Investigation

This lists the particular frameworks that we're actually considering.

  • Perl with Catalyst
    • DBIx::Class for ORM.
  • Perl with CGI::Application
    • Easy short-term goal, stepping stone towards frameworks like Catalyst, Maypole, Jifty, etc.
  • Ruby with Rails
  • Python with Pylons
  • PHP with CakePHP
    • The "serious PHP framework" space is boiling down to one of two these days: Symfony, which Yahoo! has used, or Zend Framework, which has serious backers like IBM (contributing both money and manpower, I believe).

Pros/Cons of Individual Languages

This section goes over the various advantages and disadvantages of the languages themselves, from Bugzilla's perspective.

Perl

Pros

  • Lots of modules available in CPAN. They are very well tested due to distributed test system. Anybody can participate in testing.
  • Relatively fast.
  • mod_perl is very mature and extremely fast.
  • Would make us not have to port, avoiding possible Second System Effect and making any transitions (such as to a web framework) easier.
  • Excellent Unicode handling.
  • Excellent regex support.
  • Core is actively developed but not released as often as Python and Ruby.
  • PAR can create standalone executables.
  • Moose allows complex class definitions, though it isn't part of Perl itself.

Cons

  • See The Problems of Perl.
  • Private methods aren't well supported and require additional scaffolding.
  • Multiple Inheritance can be problematic in extreme edge cases.
  • Certain elements of syntax can be confusing for new users, based on long experience in training new Perl users:
    • The difference between () and [].
    • The fact that a hash (%var) is initialized with () (which is also the array notation), but {} creates a hash reference that is stored in a scalar ($var).
    • The fact that subroutine arguments aren't really subroutine arguments, they're just an array that gets passed to a function. (This also brings up confusion on the difference between using $_[1], my $var = shift, and my ($var) = @_.)
    • The fact that $hash{'key'} and $hash{key} are the same.
    • qq[] is a string (as is qq{}, etc.), q[] is a string, though qw() is an array.
    • &sub() is resolved at runtime but sub() is resolved at compile time, except for methods.
    • The conversions from one type to another can sometimes be horrendous to read. Eg: [keys %{ @{ $var } }].
    • $$foo[1] and $foo->[1] mean the same thing.
    • That numbers are compared with "==" but strings are compared with "eq", even though in other places strings are interpreted as numbers if used numerically.
    • Figuring out what's $1, $2, $3, etc. from a regex result. And the fact that $1 and $2 don't get reset if there's no match.
      • Perl 5.10 will have named captures.
    • That Perl errors are in $@ but system errors are in $!, and when to use which one.
    • That Perl doesn't really have a class system, it just has a package system with @ISA or "use base," "bless," and SUPER::.
    • That "my ($var) = @_" will get you the first item of the array, but "my $var = @_" will get you a number.
    • In a hash created with (), an item that evaluates to an empty list silently shifts every key and value after it, leaving you with an invalid hash. That is, you can't really do %hash = (key1 => $cgi->param('unset_param'), key2 => 'something'), because the empty list simply disappears: key1 will equal "key2", and "something" becomes a key with no real value. In general it's safer to always make hashrefs when in doubt.
    • In array context, $cgi->param('value') returns an empty list if "value" wasn't passed to the CGI. It doesn't return undef. This is why we have "scalar $cgi->param()" all over the code.

Perl6

Pros

  • Implements many features we'd like to have that Perl5 doesn't have.
  • Would be the easiest language to port our current code to, since it's so similar to Perl5.

Cons

  • Essentially vaporware. There is an interpreter written for it, but it's in Haskell and it's not very popular yet.
  • Perl 6 is still very punctuation-heavy and very influenced by Perl 5.

Python

Pros

  • Quite popular.
  • Stable.
  • Actively developed.
  • Quite a lot of modules (but fewer than CPAN)

Cons

  • Not having curly-braces on "if" statements and other blocks makes it hard to figure out where you are in the block structure without a special editor to help (like Komodo).
  • Poor Unicode handling: strings are ASCII by default, and are Unicode only if you prefix them with u, like u"string".
  • No standard way of installing modules like CPAN. (Cheeseshop and easy_install exist, but they're not universally standard.)
  • Python has no equivalent to Perl's "taint" mode.
  • Sometimes has unclear error messages. Basically the only compiler error message is "syntax error".
  • Python lacks variable declarations, which means that invalid variables are caught at runtime instead of compile-time.
  • The "you need to upgrade the whole of python" problem with applications that require or use newer libraries.
  • The Global Interpreter Lock lets only one thread run Python code at a time, so individual Python programs can't take advantage of multi-core or multi-processor systems.
  • Lacks any type declarations (can make code more verbose, as more tests for data validity need to be done in the code.)
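
To make the point about missing variable declarations concrete, here is a small sketch. The function and the misspelled name are invented for illustration. The module loads and the function works for most inputs; the typo is reported only when the faulty branch actually runs.

```python
def summarize(bug_ids):
    """Return a short summary line for a list of bug ids (hypothetical helper)."""
    total = len(bug_ids)
    if total == 0:
        # Typo: 'totl' was never assigned. Python compiles this function
        # without complaint; the error surfaces only if this branch runs.
        return "no bugs (%d)" % totl
    return "%d bugs" % total


# The common path works, so the bug can go unnoticed for a long time.
print(summarize([101, 102]))

# The rarely used path fails at runtime with a NameError.
try:
    summarize([])
except NameError as error:
    print("caught at runtime:", error)
```

Under Perl's "use strict", the equivalent misspelling of a "my" variable is rejected at compile time, which is the difference this con describes.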

Ruby

Pros

  • Extremely modern language, lots of great features built-in to the language.
  • RubyGems, a CPAN-like method of installing modules.
  • Becoming more and more popular.
  • Very easy to learn, development is very natural and very fast.
  • Very actively developed.
  • Has a taint mode, just like Perl.

Cons

  • Not yet as well-known as Perl and other languages (lacks the large userbase of Perl, Python, or PHP)
  • Not installed by default in most distros or in hosting services
  • Good Unicode handling isn't coming out until 1.9, but that's the next release.
  • Performance is apparently considerably slower than Python (something like 2-3x).
  • Interesting Blog Post About Ruby Cons

Java

Pros

  • Excellent "design by contract" features (compared to the other languages in this listing).
  • Very stable.
  • Fast.
  • Popular, although more in Enterprise apps and less in Open Source than scripting languages.
  • Allows for easy web application installation.
  • Adopted by many enterprises.
  • Strongly typed language. This makes it easier to detect errors before runtime.
  • Secure. Policy files can be used to control what the JVM is allowed to do on your system.

Cons

  • Generally slower to write in than scripting languages.
  • Nothing like CPAN's client-side module installer until Java 7 (see JSR 277).
  • The standard Java Classpath is not FOSS, but Sun's OpenJDK is.
  • Code is sometimes more verbose than in other languages. Java 5 improves on this.

PHP 5 or 6

May be worth investigating PHP::Interpreter

Pros

  • Extremely popular
  • Easy to setup / get hosting
  • Basically fast
  • mod_php is essentially as fast as mod_perl
  • Decent object-oriented features
  • There is no "Higher Order PHP" that only experienced hackers can understand.

Cons

  • Lacks strict (variables) mode
    • Note: Enabling PHP E_NOTICE error messages (e.g. in php.ini) does something similar. But PHP security is an issue (many gotchas for the inexperienced) - a framework could help a lot here
  • Lacks a taint mode.
  • Not historically designed for applications in the classic sense, but rather focused on scripting. (This also affects its reputation with application programmers.)
  • Charset encoding is not yet well supported (PHP6 should solve this)
  • PHP is more focused on writing web pages than being a general-purpose language. (Might not be as good for our command-line scripts.)
  • mod_php is stateless and can't store data beyond a single request, but we don't do that now in Bugzilla anyway. It also parses the whole program on every request, unless an accelerator such as APC or eAccelerator is installed.
  • No separate comparison operators for strings and numbers, so to be sure of a comparison you need the strict operator '===' or explicit conversions such as strval and intval.
  • Too many different functions for one purpose
  • No namespace support.

D Programming Language

D

Pros

  • Compiles to native code that is as fast as C++'s
  • Language support for "Design by Contract"
  • GPL implementation (gdc)
  • Native Unicode character support
  • Easy to learn
  • Can do Imperative/OOP/TMP programming paradigms
  • Under active development
  • Enthusiastic user community

Cons

  • Not as well known as other languages
  • Tool support isn't as comprehensive
  • Not as many libraries available

C#

I hear lots of good things about C#, but I've never actually worked with it. Maybe somebody else could fill this in? I'd be slightly wary of it, since the Windows .NET stack will probably always be ahead of Mono (the *nix .NET stack), and thus the language is essentially controlled by Microsoft.