Bugzilla Talk:Languages

From MozillaWiki

On Introduction

Typos: tookits -> toolkits, wory -> worry. -knocte

On the majority of Cons sections

I propose adding "dynamically typed" as a reason to the Cons sections of all dynamically typed languages (PHP, Perl, Ruby, Python). -knocte


On Perl5 cons

  • Odd how Max has singled out Perl Cons, leaving obvious trolling in there while keeping the discussion of other languages pretty fair and removing flamebait anywhere else. - Aaron
  • Apparently Max's contributions are sacrosanct because even if they are debunked, incorrect, entirely subjective or just plain trolling they can't be removed from the wiki page itself on penalty of locking the page and throwing toys out of the pram. - Aaron
  • Perl 5.10 has many new features (like even faster and better regexps, C3 method resolution for multiple inheritance, named captures).
  • Class::C3 supports C3 method resolution for Perl 5.6 (looked into cpantesters results, and it works in 5.6 too), 5.8 and 5.10.
  • Perl is stable. Anything written for Perl 5.8 should work in Perl 5.10. Perl development versions are automatically tested on different operating systems, including by running CPAN modules' test suites against them.
  • You can easily upgrade bundled modules (now in ActiveState Perl too).
  • 5.10 will support assertions in core, and they are available now for older perls via assertions.pm on CPAN.
  • You can use Sub::Assert for complex design-by-contract features
  • Troll/Flamebait removed - Aaron
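
Two of the 5.10 features named in the list above (named captures and C3 method resolution) can be sketched as follows. This is only an illustration, assuming Perl 5.10 or later; the input string and the Base/Left/Right/Child classes are made up for the example and do not come from Bugzilla.

```perl
use strict;
use warnings;
use 5.010;    # named captures and the core 'mro' pragma need Perl 5.10+

# Named captures: matches are read from %+ by name instead of $1, $2.
my $line = 'bug 12345 status RESOLVED';    # hypothetical input
my %bug;
if ($line =~ /bug (?<id>\d+) status (?<status>\w+)/) {
    %bug = (id => $+{id}, status => $+{status});
}
say "id=$bug{id} status=$bug{status}";     # id=12345 status=RESOLVED

# C3 method resolution on a diamond-shaped hierarchy (hypothetical classes).
{ package Base;  sub hello { 'Base' } }
{ package Left;  our @ISA = ('Base'); }
{ package Right; our @ISA = ('Base'); sub hello { 'Right' } }
{ package Child; use mro 'c3'; our @ISA = ('Left', 'Right'); }

# Under C3, Right is searched before the shared ancestor Base.
my @order = @{ mro::get_linearization('Child') };
say "@order";                              # Child Left Right Base

my $winner = Child->hello;
say $winner;                               # Right
```

With Perl's default depth-first resolution the same hierarchy would reach Base through Left before it ever looked at Right, which is the edge case C3 is meant to avoid.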


    • one source of language usage stats: http://www.cs.berkeley.edu/~flab/languages.html - stat of SourceForge
      • Perl has another big home for software, CPAN, so there is no need to host Perl code on SF or elsewhere. If you added CPAN, that stat would look different for Perl.
    • In TIOBE JavaScript is less popular than Delphi. This is caused by overly simple search-engine queries (only "Perl programming"). Look at this page, which uses a more complex query (second graph): http://lui.arbingersys.com/index.html . Now it looks much more realistic for JavaScript. And for Perl too.
    • Also this page about usage stats for Perl: http://www.perlfoundation.org/perl5/index.cgi?usage_statistics
    • Catalyst is a relatively new project (first release at the beginning of 2005)
  • Certain syntax things are confusing for new users
    • Granted. But it seems that there would be a strong expectation that anyone programming in a particular language would be far beyond the "new user" stage before they attempted anything big/maintainable/etc -- regardless of the language being used.
      • Ideally, but it's not always the case. And sometimes people have used Perl for years and still don't understand some of the things I listed on the page. But I do see your point there. -mkanat
        • Then they aren't really learning the language - perhaps you mean some people who rarely do Perl and spend more time writing php or python, find that it's a bit different to what they're used to.
  • Perl doesn't check the type of arguments to subroutines.
    • Given that it's a typeless language, what is there to check?
      • Well, it'd be nice to be able to enforce that the argument was a particular class, or be able to enforce that a reference is an arrayref or hashref without having to do that manually. -mkanat
        • There are signatures for limited checking, but it's best to validate parameters properly (plenty of ways to do this from very simple to very powerful on CPAN or manually) instead of just assuming that because something is 'a string' that it's ok. -ajt
        • See for example the 'autobox' module on CPAN (http://search.cpan.org/dist/autobox) that allows you to call $foo->isa('Class') on any scalar. -phaylon
        • Attribute::Signature on CPAN
    • Some parameter checking is available. Or use Moose, which can check even better than many languages can.
  • $$foo[1] and $foo->[1] mean the same thing.
    • This is analogous to a C/C++ construct where ix->member and (*ix).member mean the same thing.
      • Yeah, which is also confusing in C to many people. :-) -mkanat
        • No it's just you and a couple of others ;) - ajt
  • You can't make subroutines private in a class.
    • Any method that starts with '_' is considered private. Or use Moose.
    • Not true:
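package Foo;

use strict;
use warnings;

# A lexical code reference is only reachable from the enclosing lexical
# scope (this file), so outside code cannot look it up by name.
my $private_method_ref =
    sub {
        print "hello, I'm a private method\n";
        print "There is no way to call me outside of this package\n";
    };

sub public_method
{
    # call private method
    $private_method_ref->();
}

1;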
      • True, and I've seen that example many places. Not exactly intuitive, though. And does that work under mod_perl?
        • There is no reason for it not to work under mod_perl, that question makes me think you don't understand how mod_perl works - ajt
        • Private methods aren't really required in Perl very often at all.
        • It is more intuitive when you call the method as a method, and not as a code reference: sub foo { my ($self, @args) = @_; $self->$private_method_ref(@args) }
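
A minimal sketch of the "call it as a method" variant suggested in this thread. The Counter class and its method names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

{
    package Counter;    # hypothetical class, for illustration only

    # Lexical code reference: visible only inside this block, so code
    # outside the block cannot reach it by name.
    my $increment = sub {
        my ($self, $by) = @_;
        $self->{count} += $by;
        return $self->{count};
    };

    sub new {
        my ($class) = @_;
        return bless { count => 0 }, $class;
    }

    sub bump {
        my ($self) = @_;
        # Method-call syntax on a code ref: $self becomes the first argument.
        return $self->$increment(1);
    }
}

my $counter = Counter->new;
$counter->bump;
print $counter->bump, "\n";    # prints 2

# The private routine never enters the package's symbol table.
print Counter->can('increment') ? "public\n" : "not in symbol table\n";
```

Wrapping the package in a block is a design choice for the sketch: it keeps the lexical out of reach of the rest of the file as well as of other files.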
    • qq[] is a string (as is qq{}, etc.), q[] is a string, though qw() is an array.
      • they each do different things, it's not rocket science - ajt
      • if that confuses you, stick to 'str' and "str" -mst
      • qw() is a shortcut for creating a list, not an array. If these cons are by the same person claiming "long experience" in training Perl users, it starts to smell like either a lie or a vast exaggeration to me. -phaylon
    • &sub() is resolved at runtime but sub() is resolved at compile time, except for methods.
      • &sub() is perl 4 syntax so it behaves weirdly, don't use it in perl 5! -mst
      • sorry, I don't see a problem here - Perl is dynamic - you can create methods and functions post compilation, how do you propose to resolve them at compile time without the Tardis?
    • The conversions from one type to another can sometimes be horrendous to read. Eg: [keys %{ @{ $var } }].
      • ITYM [ keys %{@$var} ], compare that to dereferencing in C or C++.
      • What should this do? If $var contains an array reference containing only a single hash reference, that would give you the keys of that hash in an anonymous array. That's not a type conversion, and I surely hope you don't teach that, since I've never had to use such a construct in Perl. That "type conversion" above is really two dereferences, a function call, and the creation of a new anonymous reference.
    • $$foo[1] and $foo->[1] mean the same thing.
      • so only use the latter, it's clearer
    • That numbers are compared with "==" but strings are compared with "eq", even though in other places strings are interpreted as numbers if used numerically.
      • that's because humans use numbers as numbers sometimes and words at others - Perl reflects actual use, and this has never been a problem for me - it allows you to compare and sort correctly based on your needs.
      • PHP doesn't have separate operators, and it creates problems for inexperienced programmers (they do not use ===, strval, intval).
    • Figuring out what's $1, $2, $3, etc. from a regex result. And the fact that $1 and $2 don't get reset if there's no match.
      • Perl 5.10 has named captures.
      • my ($first, $second) = ($var =~ m/(foo)(bar)/); is preferred syntax unless you're emulating awk. Or stuck in perl 4. -mst
      • figuring out $1, etc is trivial
    • That Perl errors are in $@ but system errors are in $!, and when to use which one.
      • $@ is set by eval, which is called by the developer, and used for capturing or trapping fatal and other serious errors, $! is set by normal code, they do different things that's why they are different.
    • That Perl doesn't really have a class system, it just has a package system with @ISA or "use base," "bless," and SUPER::.
      • FUD. It does have a class system, it's not pure, you don't like it, I do, I also like C++ class system, they are different, and that is good.
      • This also makes it possible to use Moose to get some higher level OO, but still fall back to the lower level when that is needed, cleaner, or more maintainable. That's a pro in my opinion. -phaylon
    • That "my ($var) = @_" will get you the first item of the array, but "my $var = @_" will get you a number.
      • yes. learn about context. it's not rocket science - anybody finding this hard would have a terrible time with a strictly typed language.
    • In array context, $cgi->param('value') returns an empty list if "value" wasn't passed to the CGI. It doesn't return undef. This is why we have "scalar $cgi->param()" all over the code.
      • Sorry, that's poor programming in your code, not Perl.
      • Any Perl framework will have other methods
      • and this is why nobody uses that interface anymore :) - mst
    • Multiple Inheritance can be problematic in extreme edge cases.
      • Easy then: Don't use MI if you're not suited to it or it doesn't fit the project. Use single inheritance, or roles, or mixins, or traits. It's all available.
      • not if you're using Class::C3 (or the c3 mro in core 5.10), which most modern Catalyst code and all DBIx::Class code -does- use -mst
      • Many languages simply do not have MI
    • The fact that %var is a () (which is also the array notation) but {} is $var.
      • () isn't the array notation. It's a list. my @array = (1, 2, 3); stores a list into an array. The only array notation is [], for array references. So my $foo = [1, 2, 3] is just a short version of my @foo = (1, 2, 3); my $foo = \@foo; perlfaq4 elaborates on this in "What is the difference between a list and an array." -phaylon
    • That is, you can't really do %hash = (key1 => $cgi->param('unset_param'), key2 => 'something'), because then you'll actually just have an invalid hash.
      • No problem there. You just need to know what contexts are and read the documentation for the module you're using. Or simply enforce scalar context.
      • But like we said, the $cgi->param interface is considered harmful. In Catalyst you'd do $c->req->params->{unset_params} which would behave itself fine -mst
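
The context points raised in the last few threads (list versus scalar assignment, and an accessor that returns an empty list inside a hash constructor) can be reproduced with a small sketch. The param() sub below is a hypothetical stand-in written for this example; it only imitates the behaviour described above and is not the CGI.pm implementation.

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Hypothetical stand-in for an accessor like $cgi->param('name'):
# empty list in list context (undef in scalar context) when the key is missing.
my %submitted = (product => 'Bugzilla');
sub param {
    my ($name) = @_;
    return exists $submitted{$name} ? $submitted{$name} : ();
}

# Render a hash as sorted key=value pairs, showing undef explicitly.
sub show {
    my (%h) = @_;
    return join ',', map { "$_=" . ($h{$_} // 'undef') } sort keys %h;
}

sub first_vs_count {
    my ($first) = @_;    # list assignment: takes the first element
    my $count   = @_;    # scalar assignment: takes the number of elements
    return ($first, $count);
}
my ($first, $count) = first_vs_count(qw(a b c));
print "first=$first count=$count\n";    # first=a count=3

# List context: the missing value vanishes and the key/value pairs shift.
# 'use warnings' also reports an odd number of elements here.
my %broken = (key1 => param('unset_param'), key2 => 'something');
print show(%broken), "\n";              # key1=key2,something=undef

# Forcing scalar context yields undef and keeps the pairs aligned.
my %fixed = (key1 => scalar param('unset_param'), key2 => 'something');
print show(%fixed), "\n";               # key1=undef,key2=something
```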

On Python cons

  • Not having curly-braces on "if" statements and other blocks makes long blocks hard to read.
    • I think this is mostly a red herring. First, you should not write long blocks to begin with. Second, I've used Python almost exclusively in a large project for over three years and this has rarely been a problem.
      • Okay. But if there was one thing I could change about the basic nature of the language, this would be it. It can be very difficult to remember how many spaces you need to put, if you're adding a line after a complex series of blocks. -mkanat
        • That's a feature. Seriously. You're complaining about Perl being too hard to read, and Python is designed so that totally unreadable code won't compile. Do you really want to review patches where someone has managed to get the braces in the right place but the indentation wrong? I'm actually quite surprised to see someone looking primarily for a maintainable language and not putting this in Python's "pro" column. I often hear it called "executable pseudocode" because it's so readable, and the indentation-based syntax contributes to that. -Slamb
          • Yeah, I know. I suppose different people feel different ways about it. We don't have too many problems with indentation anymore, although we used to in older Bugzilla code. It's just hard when there are lines that are two blocks out after the end of one inner block--you can't really tell what block the line is in. -mkanat
  • Poor Unicode handling--strings are ASCII by default, and are Unicode only if you prepend them with u, like u"string".
    • I think u"" can easily be enforced as a coding policy. Depending on how ambitious your Unicode needs are, Python Unicode may not be enough for you. For Chandler we created PyICU to fix cases where Python's natural Unicode support falls short.
      • I was trying to avoid having to enforce code guidelines, though. That is, that's one of the reasons we want to move away from Perl. I can check out PyICU though. -mkanat
    • Just because unicode string literals need a 'u' prepended doesn't mean that Python has poor unicode handling. Most strings in an application are not created through string literals. Plain-ascii strings, by the way, mix with unicode strings without problems. Anything encoded and non-ascii does not, and you need to convert explicitly. If you are writing a new application, making your application unicode-safe is not much of a hassle. You have to watch out on boundaries of your application with the rest of the world, but this is something you cannot safely ignore in any language while still supporting i18n properly. That's not to say Python's unicode handling is perfect; I'd just say it's better than "poor". - faassen
    • If you test from the beginning with non-ASCII test cases you'll do pretty well with unicode. Those non-ASCII test cases are essential though. That said, while Java has better unicode support, Python's support is really pretty good. It is fairly strict about unicode issues, unless you stick to ASCII (at which point you can unfortunately gloss over a lot of errors in your code, which only emerge when some non-English speaker comes along).
  • No standard way of installing modules like CPAN.
    • There is: Python eggs. These are pretty new, though, so not all projects make eggs or upload them to cheeseshop (equivalent to cpan).
      • Ahhh. Yeah, I've seen cheeseshop, but I haven't seen it integrated into *nix distros yet. -mkanat
      • easy_install is best used by developers. People who are just deploying should use other techniques. CPAN is the same way in my experience. For a specific product you can ship all the libraries easily enough, and get a known and controlled set.
  • Python has no equivalent to Perl's "taint" mode.
    • I know of some attempts at this, and I believe Zope has a sandbox thingy as well, so the situation is probably not as bleak as you think.
      • This is really one of my primary concerns. Bugzilla is already used in a lot of situations that require strict security, such as US Government installations. I didn't find anything that would be adequate, in my brief Google search. -mkanat
    • Zope 3 indeed has zope.security (soon coming as an egg near you :). README.txt, code, svn co svn://svn.zope.org/repos/main/zope.security/trunk zope.security) - faassen
      • zope.security is pretty hard to use. taint, however, does far far less than zope.security. Really, can't you just use reasonable security practice? Using the proper abstraction layers like an ORM and SQL injection attacks are unlikely (just be smart about the basic DB-API layer and they are unlikely). eval() is frowned upon, and if you see eval() anywhere you should panic. Most modern Python templating languages auto-quote HTML. Tainted strings aren't necessary to avoid injection attacks.
  • Doesn't use OS threads
    • Python does use OS threads. The reason its threading model cannot make use of multi-core is the global interpreter lock. The typical suggestion is to use multiple processes if you want to scale over multiple cores. Zope has been doing this for years, with its clusterable object database (ZEO). Doing this will make it easier to scale towards a cluster as well. - faassen
    • Python has similar or better thread support than Perl and Ruby. It's not as good as Java. Like in Perl and Ruby you don't have to use threads, at which point you can utilize multiple CPUs just like any other system.
  • Sometimes has unclear error messages. Basically the only compiler error message is "syntax error"
    • I haven't seen this. Usually syntax errors point pretty close to where the error happens (indentation helps here, as missing braces or "end" typically cause errors only much later in the file).
  • Python lacks variable declarations, which means that invalid variables are caught at runtime instead of compile-time.
    • There are several checkers that can at least catch this specific case (PyChecker, pylint, pyflakes, all catch this).
  • The "you need to upgrade the whole of python" problem with applications that require or use newer libraries.
    • There are a couple options for this. You can use virtualenv or zc.buildout, which install packages in a local isolated environment. You can also implement this pretty easily on your own, just by changing sys.path to use a local set of packages instead of the global set of packages. Though it's unfortunately not common in Linux distribution packages, there's no reason whatsoever for every Python application on a system to use the same set of libraries.

On Java Cons

  • Much slower to write in than scripting languages.
    • This is why most webapps use a scripting language as part of the templating/GUI layer, which is the bit that will require the most customization for most sites
    • Although some tasks can be written very quickly in Perl/Python/Ruby (for example a SOAP client in just a couple of lines of code), I'm not sure that is a fair comparison. The syntax of those scripting languages is inherently more compact, but that compactness (while impressive in showing off the language) makes the code less readable and maintainable, and a good Java IDE or set of Emacs macros will take care of writing/autocompleting most of the extra Java code for you anyway.
  • The Java Classpath is not FOSS
    • This is both wrong and irrelevant :). The Java Classpath project is GPL'd and even if it wasn't, it's a third party project. Last week Sun released *THE* standard JDK as open source code.
    • It should be mentioned that Java's licensing is rapidly changing and I believe Tomcat can actually be run under 100% OSS components
  • Nothing like CPAN's client-side module installer
    • While this is true, it is also completely unfair because Java does not need such a mechanism. There are lots of Perl modules with native code but few third-party Java modules contain any native code, eliminating one of the major third-party module installation headaches. "Installing" a module in a Java webapp is usually as simple as downloading a single JAR file from the module's website and dropping it in your webapp's lib directory. True there is no single repository of Java modules (like CPAN) but finding a module to meet just about any need should not be difficult in this Google age.

Ruby cons

  • Performance is apparently orders of magnitude slower than Python (something like 2-3x).
    • 2-3x is not orders of magnitude, except in bizarro-world where log_10(2) > 1. This is an important distinction: since performance is down at #3 on your list of priorities, I don't think you should be worrying about anything but order of magnitude differences. -Slamb
      • It's an order of magnitude in binary or ternary. :-) The item there originally said 10x, but then was corrected and the order of magnitude comment was not removed. -mkanat
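
The arithmetic behind this exchange, as a quick sketch. The helper name orders_of_magnitude is made up for the example, and the factors are the ones quoted in the thread.

```perl
use strict;
use warnings;

# Base-10 orders of magnitude represented by a slowdown factor.
sub orders_of_magnitude {
    my ($factor) = @_;
    return log($factor) / log(10);    # log() is the natural logarithm
}

# Prints:
#   2x slower = 0.30 orders of magnitude
#   3x slower = 0.48 orders of magnitude
#   10x slower = 1.00 orders of magnitude
printf "%dx slower = %.2f orders of magnitude\n", $_, orders_of_magnitude($_)
    for (2, 3, 10);
```

So a 2-3x difference is well under one decimal order of magnitude, while the originally stated 10x is exactly one.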

PHP Frameworks

You may want to have a look at eZ Components for PHP:

  • Backed by a mid-sized but well-known PHP company: eZ Systems
  • A quality first approach: Discussing Requirements, discussing Design, Writing Tests, Coding+Documenting, Reviewing: dev_process
  • Some big names from the PHP core developers are part of the team (Derick Rethans is the project leader)
  • Sebastian Bergmann, the creator of PHPUnit contributed a workflow component that he developed as part of his diploma thesis. You'll certainly need a workflow for bugzilla
  • The Console Tools component helps you with user interaction for shell scripting
  • The company toolslave proposed to contribute its WebServices libraries to the components. This will give us SOAP and REST

A Java Framework (#1a?)

Here's the rub with Java.. don't do GUIs with it, but by all means, make your business object and services layer with it. If you expose Bugzilla features through well-defined web services (à la WSDL) then anyone can: write GUIs for it (.NET, Python, Java, Ruby, Delphi, ...); script command-line clients; plug it in to their favorite IDEs; etc. Here's one FOSS framework/stack:

  • EJB 3.0 - a specification for Java application servers
  • JBoss Application Server 4.0.5 - an implementation of that specification
    • 4.2 was released yesterday (11 May 2007)
    • Apache Tomcat - a Java-based web server, which also ships with JBoss
      • Apache Axis - web services implementation, talks with EJBs
    • Hibernate - ORM implementation, which also ships with JBoss
      • This lets you work with any database that Hibernate works with.. which is most.

That's it for objects and services. But if you want a kind of "reference" implementation of an interface, and something which works closely (locally) with the server-side objects and services, then you could use:

  • JBoss Seam - works closely with business objects and services to provide web interfaces, including some AJAX

There are good options within each layer: application server, web server, persistence/ORM implementation (hence database), GUI, operating system, security services, etc., and that's just the server side. The whole thing is mind-bogglingly complex - too much to consider all at once, but if you build one tower of tools, have a clear domain model and services, pre-defined roles, and an eye for security, then it should build quickly.

To address each point with this set of Java technologies:

  1. Ease of Development: It's not easy to learn, but you can rapidly create new features without worrying about details, especially with EJB 3.0 annotations.
  2. Ease of Modification: Refactoring Java is a joy, and the tools are excellent (Eclipse). Java has long been exposed to agile programming methods, and benefited greatly from unit testing and other change-embracing practices.
  3. Performance: Should be good in normal deployment, but also very scalable.
  4. Available libraries: Your server-side stuff should be taken care of, but also GUI libraries, like for charting on a web interface.
  5. i18n and l10n: Solved for web interfaces.
  6. Security: Server-side security is well established, uses most common authentication and authorization services, and is built into the language. EJB 3.0 in particular uses role-based security for both declarative and programmatic access control.
  7. Enforcement of Good Code: The tools (Eclipse at least) do a good job.

Proposed DB Section

We should discuss what to choose about persistence:

  • Relational database vs object-oriented database (for example DB4O, which can be used with Java or Mono).
  • If relational database is selected, support only one vendor (MySQL?), or n vendors, or use a higher level tool that supports n vendors?
  • If using a higher level tool to support n database engines, use object-relational mapping tool or code generators or stored procedures?

On Frameworks Under Investigation

It would be interesting to add MonoRail. It's a web development framework based on the MVC concept, following RubyOnRails best practices, but with the advantages of the .NET platform: being able to use the .NET API, the many .NET languages, and the many available .NET libraries (like ActiveRecord or NHibernate). It's also claimed to be a better concept than ASP.NET for web development in the .NET world.

Proposed Mono Section

The Mono Project

Pros

  • Based on the CLR interoperability: every language that compiles to CIL (Common Intermediate Language) can interoperate with other languages. This will prevent the problem we have now with Bugzilla about switching the language (if we use a CLR-based language now, we could do a painless and gradual switch to another language in the future). This could also allow Bugzilla to be written in different languages depending on the specific needs of each part.
  • Can be compiled to native code using AOT
  • MIT/LGPL/GPL implementations
  • Easy to learn
  • Outstanding quantity and quality of documentation, thanks to the fact that all .NET related stuff is reusable.
  • Under active development
  • Enthusiastic and huge user communities (.NET and Mono)
  • The majority of the interesting Java frameworks are already ported to .NET (NHibernate, NUnit, NLog, etc.)
  • There is a huge number of libraries and components to be used with this platform.

Cons

  • Much criticised in the open source community.


Available CLR-based languages

Boo

This is a scripting language based on Python language syntax and designed to be used with Mono.

Pros
  • Statically typed language, while keeping the scripting syntax.
Cons
  • The syntactic cons of Python, I suppose.

Nemerle

An ultra-high level language.

Pros
  •  ?? [ To be filled. ]
Cons
  • Not widely used or known.

C#

A language that could be named as a mix between Java and C++.

Pros
  • It has all the nice features that Java had when it was born, along with others that Sun added to Java later (like Generics and Attributes).
  • Provides other features that Java lacks (preprocessor conditionals, unsafe mode for performance goals, operator overloading, etc.).
Cons
  •  ?? [ To be filled. ]

J#

The Java language, but CLR-based and used with the Mono API.

Pros
  • It uses the commonly-known syntax of pure Java while targeting Mono, so it can be combined with the interoperability that the CLR provides.
  • Statically typed language.
Cons
  • Not widely used.


VB.NET

The old VB language, redesigned for .NET.

Pros
  • Very popular. Would attract many developers to write plugins/addins.
Cons
  • Although it has type semantics, the compiler often assumes type conversions or doesn't warn about mismatches (so it's a hybrid, neither static nor dynamic).
  • Awful syntax.


IronPython

The Python language, redesigned to be CLR-based.

Pros
  • Based on a very popular language in open source development.
Cons
  • All the cons of Python, I suppose.
  • Immature.


Ruby.NET

The Ruby language, redesigned to be CLR-based.

Pros
  • Based on a very popular language in open source development.
Cons
  • All the cons of Ruby, I suppose.
  • Immature.


PHP.NET

The PHP language, redesigned to be CLR-based.

Pros
  • Based on a very popular language in open source development.
Cons
  • All the cons of PHP, I suppose.
  • Immature.