From MozillaWiki


Merging the two codebases is very tricky. We are simply hacking them to run under one domain with Apache and PHP magic. There are several ways to do this, but we must preserve the integrity and performance of all pages.

To justify our Apache+PHP solution, here is some raw data showing why this merge should be successful.


We want to write some magic to load in .org pages as top-level URLs on the merged site. We can do this simply by generating all the top-level folders (about 25) and rewriting each of them to a special PHP page that loads the page from the .org codebase instead.
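The routing idea can be sketched in a few lines. This is only an illustration in Python, not the actual Apache+PHP setup: the folder names in `TOP_LEVEL_FOLDERS` and the loader path `/org-loader.php` are hypothetical placeholders, and the real list would hold the roughly 25 generated folders.

```python
from urllib.parse import urlsplit

# Hypothetical stand-ins for the ~25 generated top-level folders.
TOP_LEVEL_FOLDERS = {"community", "about", "projects"}

# Hypothetical name for the special PHP page that loads the other codebase.
LOADER = "/org-loader.php"


def route(url_path: str) -> str:
    """Return the internal target for a request path.

    Paths whose first segment is a known top-level folder are handed to the
    loader page; every other path is returned unchanged.
    """
    path = urlsplit(url_path).path
    segments = [s for s in path.split("/") if s]
    if segments and segments[0] in TOP_LEVEL_FOLDERS:
        return f"{LOADER}?path={path}"
    return path
```

Because only the first path segment is checked, requests for the existing site's pages never reach the loader.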

However, we also need to merge in the .org site's htaccess file, which contains about 1500 redirects. This is where it gets tricky, because big htaccess files are messy and, more importantly, incur a performance hit on every single request.

Our quest is to figure out how to get all of this working under one domain and maintain integrity and performance on all pages.

How to read these graphs

These graphs were generated using the Apache benchmark tool `ab`. They show the percentage of requests that were returned within a given response time, in milliseconds. This is a decent indication of the performance of a site.
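To make the "percentage of requests within a response time" reading concrete, here is a minimal sketch that builds such a table from a list of response times. It uses a simple nearest-rank rule, which is an assumption for illustration and not necessarily the exact method `ab` uses; the function name `percentile_table` is made up.

```python
def percentile_table(times_ms, percents=(50, 66, 75, 80, 90, 95, 98, 99, 100)):
    """Map each percentage to the time within which that share of requests finished.

    Uses the nearest-rank rule on the sorted response times.
    """
    if not times_ms:
        raise ValueError("need at least one response time")
    ordered = sorted(times_ms)
    n = len(ordered)
    table = {}
    for p in percents:
        # Integer ceiling of p * n / 100, clamped to at least rank 1.
        rank = max(1, (p * n + 99) // 100)
        table[p] = ordered[rank - 1]
    return table
```

A curve that stays flat until the high percentages and then climbs sharply means most requests are fast and a small tail is slow.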

Unless otherwise stated, these were run with a concurrency of 10 for 5000 requests, which is generous enough to get a realistic picture. That means `ab` kept 10 connections open at the same time (opening new ones when old ones finished) until 5000 requests had completed.
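For reference, a run like this maps onto `ab`'s `-n` (total requests) and `-c` (concurrency) options. The sketch below only assembles the command line; the helper name and the URL in the usage comment are placeholders, not the hosts that were actually tested.

```python
def build_ab_command(url, requests=5000, concurrency=10):
    """Build the argument list for an `ab` run with the given settings."""
    if concurrency > requests:
        # ab refuses a concurrency level greater than the request count.
        raise ValueError("concurrency cannot exceed the number of requests")
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


# Usage (placeholder URL; requires ab to be installed):
#   import subprocess
#   subprocess.run(build_ab_command("http://localhost/community/"), check=True)
```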

It's *very* difficult to accurately benchmark a site, but I ran these several times and found my setup to produce reasonably accurate, predictable results. Don't read too much into these; they should just give a general picture.


Terms used in the keys on these graphs:

  • RewriteMap+PHP - RewriteMap redirects and PHP magic to serve as top-level URLs
    • RewriteMap is an Apache directive to load an external rewrite file
    • This is expected to be the fastest (and cleanest) solution
    • Should not affect current site at all
    • Should load in fast (negligible performance hit)
  • Apache Redirect+PHP - Raw .htaccess redirects (copied from the .org site's htaccess) and PHP magic for pages
    • Expected to have a decent performance hit, huge htaccess files are bad
  • PHP - PHP magic for both htaccess redirects and pages
    • Expected to work OK, but implementation must be hacky
  • Pure - Either site as it stands now, without any merge changes


I'll put a spoiler here: RewriteMap+PHP is a clear winner for merging the two sites. It's a little hacky, but it works. In fact, the .org pages get a 50ms speed increase, which might not sound like much, but 1/20th of a second is a lot. The current site's pages will be unaffected by the merge. There's nothing that will incur a performance hit.

See bug 670775 for more description about the solution/hack.

htaccess redirects

The .org site had about 1500 rewrites/redirects in its .htaccess file. We need to port these over somehow, but with that many redirects performance is a major concern: every request must process the entire htaccess file, so every request would incur a decent performance hit if we don't do this right.

There are 4 different ways to port the redirects:

  • RewriteMap(txt) - Use Apache's RewriteMap directive which loads optimized redirects from a text file
  • RewriteMap(dbm) - Use Apache's RewriteMap directive which loads optimized redirects from a database file
  • Apache Redirects - Simply move over the 1500 Redirect/Rewrite lines, use PHP to load in .org pages
  • PHP - Implement the redirects in PHP, where the real .org pages are handled
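As a rough illustration of the RewriteMap(txt) option, the sketch below turns simple `Redirect` lines into the "key value" lines a text map file holds. It is an assumption-laden sketch, not the script used for the merge: it only handles plain `Redirect [status] from to` lines, and it sets aside anything else (such as regex-based `RewriteRule` lines) for manual porting. Note also that `Redirect` matches path prefixes while a map lookup is an exact-key match, so converted entries may need review.

```python
# Status keywords that may appear before the URL path in a Redirect line.
STATUS_WORDS = {"permanent", "temp", "seeother", "gone"}


def redirects_to_map(htaccess_text):
    """Convert simple `Redirect` lines to RewriteMap text entries.

    Returns (entries, skipped): entries are "from to" strings, skipped holds
    the lines that could not be converted mechanically.
    """
    entries = []
    skipped = []
    for raw in htaccess_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        parts = line.split()
        if parts[0].lower() != "redirect":
            skipped.append(line)
            continue
        args = parts[1:]
        # Drop an optional status keyword or numeric status code.
        if args and (args[0].lower() in STATUS_WORDS or args[0].isdigit()):
            args = args[1:]
        if len(args) != 2:
            skipped.append(line)  # e.g. "Redirect gone /path" has no target
            continue
        entries.append(f"{args[0]} {args[1]}")
    return entries, skipped
```

The skipped list is worth keeping: it shows how many of the 1500 lines still need a hand-written rule.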

Here are the results tested with a concurrency of 10 for 15000 requests:


If we zoom in some by only testing 5000 requests and increase concurrency to 30:


Conclusion: Obviously RewriteMap is the way to go. It's worth noting that "Apache Redirect+PHP" is the only option that must be avoided, since every request would suffer from the ballooning htaccess file. The other methods can be applied selectively, only to pages that don't exist.
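The "only for pages that don't exist" ordering can be sketched as follows. This is a simplified Python model with made-up names (`resolve`, `existing_pages`, `redirect_map`), not the Apache configuration itself: a request for a real page is served directly, and the redirect lookup runs only as a fallback.

```python
def resolve(path, existing_pages, redirect_map):
    """Decide how to handle a request path.

    Existing pages are served without consulting the redirects, so they pay
    no lookup cost; only missing pages fall through to the redirect map.
    """
    if path in existing_pages:
        return ("serve", path)
    target = redirect_map.get(path)
    if target is not None:
        return ("redirect", target)
    return ("not_found", path)
```

With the raw Apache redirects, by contrast, every request walks the full rule list before it is served.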

We've actually improved the .org pages' performance a lot by removing all the htaccess redirects, as we will see in another graph.

Lastly, we can compile the RewriteMap text file into a database if needed, as it performs better under high load. However, the gains are insignificant and not worth worrying about right now.

PHP page

Let's benchmark a normal PHP page: /en-US/firefox/new/.


Conclusion: As expected, only the Apache Redirect solution (copying all htaccess redirects) has a performance hit. Otherwise, current pages shouldn't see a performance hit (negligible, if any).

.org PHP page

Now let's benchmark a .org PHP page: /community/.


Conclusion: We've actually made the .org pages 50ms faster! The RewriteMap+PHP solution is an obvious winner. The gain comes from converting the 1500 redirects into a more optimized RewriteMap file.

Notice the spike in the 90% range, however. Honestly, I'm not sure how to explain it, but I think it's an anomaly in my testing environment. There's no obvious reason why about 5% of the requests should take so much longer to respond. If I add all the htaccess rewrites from the .org site in addition to the RewriteMap file, it doesn't spike like that. We can look into the spike later and see if it exists on our staging/live servers. These tests also issue requests at a much higher rate than the live site ever sees.

Thunderbird PHP page

We had to hack in Thunderbird as well (it was just merged into the current site, and I moved the hack up into the merged codebase). So let's see what happens with the Thunderbird URL /thunderbird/.


Conclusion: Thunderbird also benefits from our htaccess optimizations, as seen in the RewriteMap+PHP solution. It gets a nice 25ms shaved off each request!

Clearly, RewriteMap+PHP is again the winner. This is fortunate as it also happens to be the cleanest hack out of all of them.