From MozillaWiki
Jump to: navigation, search

« Back to Archive | Update: Main

Ambox outdated.png THIS PAGE MAY BE OUTDATED
This article is in parts, or in its entirety, outdated. Hence, the information presented on this page may be incorrect, and should be treated with due caution until this flag has been lifted. Help by editing the article, or discuss its contents on the talk page.



This will be the plan for getting to a solid v1.0 that is consumable by the public.

For v1.0 we will be performing a security audit with the results of that being the basis for any changes made to the v1.0 code base. We should not be updating functionality or adding new features in the v1.0 tree; only security and performance fixes.


The goal of our proposed security audit is to verify security and improve the stability of the existing UMO application.

Our hope is to get everyone on the same page by starting much-needed discussion. From discussion we develop goals, which lead to a plan, then action. Our ability to listen to each other and work together will determine the success of UMO, which directly affects the heart of the Mozilla Foundation.

Goals and Priorities


 Be responsible for the security of the Mozilla community.


 Provide reliable tools that scale gracefully.


 Keep moving the Foundation forward.


 Improve communication between systems and developers to spot potential problems before they happen.

Auditing UMO

Reviewing UMO should be a collaborative effort between system administrators and developers. The following overview focuses on:

  • A developer's view on what a good web app is.
  • Ways we plan to test UMO:
    • Test for remote vulnerabilities using automated test suites.
    • During testing slam staging area with a wide array of inputs (PHP, MySQL, special chars).
    • Look for ways to reduce access to sensitive information. For example - a publicly posted DB schema
  • Ways we could work with sysadmins to improve UMO security.
  • The need for better development practices - particularly with benchmarking and staging.
  • Reduction of code and addition of proper comments and documentation to improve extensibility.
  • Use of standards or style guides to ensure code consistency.
  • Structural improvements aimed to improve the overall quality of UMO.


  • Programmer inexperience is a primary cause of security holes.
  • Use test suites exist to test input validation.
  • Isolate and lockdown the production environment.

Consistent with our priorities, security is our primary concern. UMO has the potential to help or harm whomever it touches. The responsibility is on our shoulders to ensure UMO is the good guy and never the bad.

From a developer's standpoint, security in PHP is really a matter of how experienced an author is. An experienced developer understands the difference between local and remote attacks and adequately prepares for both via sound input validation and cooperation with system administrators.

A good security review would involve automating all input methods and testing each input using an exhaustive combination of SQL injections, PHP function calls or variable references, special characters and other unexpected inputs. Some test suites exist for this purpose, such as TestMaker.

With the use of such an application, it would be possible to create macros to mimic user actions over HTTP and efficiently inject a wide array of inputs. This would be a primary focus - most severe exploits are the result of poor input validation and/or testing.

Once we have reached a relatively safe situation with inputs, we must look at dependencies. Outside of our main script or application, what exists on this server that could pose a potential threat? Many people develop simple scripts such as blogs in distributed environments that might host many applications. In such an environment an application is vulnerable to both local and remote attacks by means of the lowest common denominator.

Isolating UMO from the lowest common denominator is of equal importance. This means limiting access, creating a sandbox for UMO to run in, putting up adequate staging areas that would not affect production in the case of a meltdown, and many other things that are beyond a developer's control and should be handled by sysadmins.

Of note - phpMyAdmin exists in the root of UMO. That is a lowest common denominator; unnecessary software that adds additional vulnerabilities to the system. If someone were using that instance of phpMyAdmin and it crashed Apache or MySQL for any number of unforeseen reasons - is that a situation we should prepare for? Having staging affect production?

Additionally, the entire DB structure for UMO is publicly viewable. This makes it very easy for someone who may attain the ability to inject SQL via tainted input to seriously affect the system. For the same reasons MySQL errors should be caught and written to files rather than displayed on screen. Granted, most open source projects publish their database structure, but it should at least be housed somewhere besides the main web server running the production version of the app.

And really, that's just the beginning. Many things are avoidable by using good practices and clear, concise code. Verbose code, poor whitespace, inadequate comments or lack of developer documentation can all lead to obscure sections of code that never pop up until it's too late.

We understand an audit of UMO requires analyzing every nook and cranny of the application. In addition to questioning the general security principles we have discussed, we will analyze the logic behind each line and ensure there are no security loopholes.


  • Load testing.
  • Proper staging.
  • Cooperative planning by sysadmins and developers.
  • Sound infrastructure and an understanding of how it works.

Reliability of UMO is a vital part of the Foundation. A secure UMO also has to adequately handle its user base as it grows or it runs the risk of turning away users. Graceful scaling is often accomplished through adequate staging, requiring planning and cooperation between sysadmins and developers. There are a variety of tools available such as apache benchmark that can easily handle this job.

Factored into scalability is a disaster plan for extremely high loads during normal operation. This would include disabled non-core services to save cycles. An example would be disabling logging, non-vital dynamic content or eliminating all dynamic content altogether. This again requires testing so server admins and developers can map hardware capabilities based on application configuration.

Lastly, a soundly-written and highly efficient web application remains at the mercy of its supporting hardware infrastructure. That said, any UMO developer should have utmost confidence in and a solid understanding of his/her server infrastructure - and if not, that should be addressed before considering any other reliability issues.

With all of these things in mind, we will assess how well UMO stacks up and work with sysadmins to determine ways to improve overall stability and reliability. Overall, the key here is communication and teamwork between experienced sysadmins and developers.


  • Cooperation between developers and sysadmins to determine optimal conditions under relevant load scenarios.
  • Proper data typing.
  • Proper database normalization.
  • Proper (and not excessive) indexing.
  • Understanding your RDBMS and its limitations.
  • Taking advantage of Apache and persistent connections.

Optimization of database resources is vital to the scalability of a dynamic web application such as UMO. It takes just a little cooperation to tweak things until you get the results you are looking for.

For example, during the scaling of the current UMO service, we found that there were several very expensive MySQL queries that were happening and were causing frequent outages. Upon further examination we found that these SQL statements were being done on columns that had no indexing in place. With the addition of 3 indexes for critical columns we were able to scale the UMO service even further with the existing hardware.

Another example comes from looking at another scaling issue. When a user hits the UMO service they use an SQL statement that is done by date. However, the date column is just a VARCHAR and thus not ideal for doing queries against. In addition, it was not indexed properly thus causing further slowdowns. Using the right data typing in this instance would yield large performance improvements.

Our work on Bouncer has shown how proper indexing and a concrete schema combined with persistent database connections can greatly reduce load on any RDBMS. We will check UMO's database for proper normalization and indexing. We will also ensure there is no database bottleneck for dynamic content.


  • Re-use of code and/or utilization of widely accepted libraries.
  • The less code you have to write, the easier it is to manage.
  • Adequate developer documentation.
  • Project standards such as syntax style, tabs vs. spaces, wrapping, etc.
  • Use a popular RDBMS and scripting language.

In many cases the code in UMO should be reduced, particularly when calling redundant procedures that deserve to be housed in a set of wrapper functions. The following example is representative of the UMO source:

$sql = "SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id";
$sql_result = mysql_query($sql, $connection) or trigger_error("MySQL Error ".mysql_errno()." ".mysql_error()."", E_USER_NOTICE);

As an alternative, it would be fairly easy to write a function to accomplish the exact same thing instead of rewriting this block every time a query is performed:

 *	Do a query.
 *	@param resource $dbh MySQL database handle 
 *	@param string $sql MySQL query, unescaped
 *	@return resource $result MySQL result
function db_query($qry=null,$dbh=null)
	static $result = null;
	if(!is_resource($dbh)) $dbh = db_connect();
		if(is_resource($result)) return $result;
		else return false;
		$result = @mysql_query($qry,$dbh) or db_error(mysql_errno());
		return $result;

Here, $dbh is a static handle so if it has already been instantiated, it can be reused for concurrent calls of the query function -- likewise with $result. db_connect() is another wrapper that connects to the database and db_error() is a wrapper to handle database errors gracefully (and silently) - we will not go into that right now, assume it works for the example. This ultimately lets us to reduce the code:

$sql_result = db_query("SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id");

This is just a small example - don't run it under the microscope. The idea we want to portray is that many times throughout the UMO source the reduction of code could make scripts cleaner, shorter, and more efficient. It also makes the code easier to read and update for future developers, which directly affects extensibility.

For redevelopment, we will likely decide on using a database abstraction layer such as PEAR or adodb. Either one would follow these same guidelines, and would possibly be used in conjunction with similar wrappers.

Notice the comment blocks as well. It is very important to adequately comment all actions. By doing uniform comment blocks before functions and scripts it is possible to use a code documentation utility like phpDoc to generate a complete developer's API on-the-fly as we did with Bouncer. This also greatly improves the extensibility and overall quality of the application.

Attempting to improve the extensibility of UMO would be a potential focus, but not a primary one given the possible redevelopment of UMO. On the other hand, if reduction of code would lead to improved performance, the methodologies discussed above would be applied.


  • Centralized includes, libraries, configuration documents and documentation.
  • Files and directories named consistently.

The overall structure of the application can play a role in its performance and security. By centralizing code it would be possible to cache or lockdown certain libraries or configuration files as needed. An added advantage is that it becomes a lot easier to know where certain things are if you have a simpler file structure.

The following is a listing of the current root:


This is a listing of Bouncer's root:


It becomes easier to find things by sorting by:

  1. Language
  2. Roles / access level
  3. Action

UMO does not seem to use a consistent naming convention. So if you read the code, it is hard to go through and know exactly where everything is. The lack of documentation also contributes to the confusion. It would be a good idea to reorganize scripts a little better, for example:


Then, when necessary, create virtual hosts to root in particular directory. For example, http://addons.update.mozilla.org/ could point to the root of /pub. Furthermore, it makes it easier to lockdown remote access to certain directories, like sql, cfg, lib and admin. Other possible benefits are checkouts by directory for a given module. If you wanted to cache all images, you could check out the img directory wherever you wanted, etc.

We intend to look at possible improvements here, but probably would not restructure everything in an audit - it is more a development approach than it is a load or security issue. Structure is important for future discussion regarding development, so we wanted to address the inherent benefits.


Planning and Preparation

  • Eliminate areas that will not ever be implemented (until v2.0)
  • Determine which sections of UMO need the most attention
  • Create list of items we will review
  • Clean code for review (PHP_beautifier) - did not make a difference, moved on.

Code Review

  • Document includes, dependencies
  • Review application based on priorities, listing needed changes
    • Imperative security changes
    • Efficiency issues that really need to be changed
    • Efficiency or style issues that are cosmetic
  • Test behavior in identical staging environment for all public use-cases. (morgamic, chenb)

Code Adjustments

  • Patch branch to include local modifications in prod. (under review)
  • Patching
    • Rewrite VersionCheck.php (morgamic)
      • This is currently being reviewed. (mconnor)
    • Update UMO configuration and structure to make development easier. (morgamic)
      • Stripping out stuff like hard-coded paths
      • Adding needed directories to CVS
      • Documenting requirements and installation procedure in Wiki (need help with this)
    • Get PFS into CVS (morgamic, jst, kveton)
      • uniqueUrl.php needs to be excluded
    • Analyze, optimize PFS scripts. (morgamic, pending import being cleared by jst)
    • Prioritize needed work and create TODO lists. (morgamic, alanjstr, pending patches) -- see hitlist below
    • Deal out important patching (based on audit log) to community based on wiki information. (kveton, morgamic,alanjstr)
    • Unfreeze branch and open it up for patching. (kveton)

Hit list

  1. Developer Sessions, making sure login works correctly, nav options for admins are hidden (morgamic)
  2. Comment spam prevention for user ratings, reenabling comments for extensions and themes (chip)
  3. Set up -dev,-staging to automatically pull from branch, trunk(kveton)
  4. QA for submitted patches (Ctho, alanjstr)
  5. Run through remaining items in audit log that are important mainly (but not limited to):
    1. missing input validation
    2. bad loops
  6. Work through workflows for developers and admins (everyone)
  7. Merge branch with trunk (kveton)
  8. Push trunk into production (justdave,kveton)


  • Benchmark code to assure scalability
  • Check for improvements on performance

Audit CVS Information

Line numbers mentioned for the files below may be relative. In order to prevent confusion, the output of 'cvs status' was captured in case they do not match.

View the CVS information page for the complete 'cvs status' of the audited code.

Audit Log

The following is a list of all .php files in UMO. For each file the following will be determined:

  • Is the file live?
  • Will the file be in UMO v1.0?
  • What other scripts affect this file?
  • What other scripts does this file affect?
  • Needed Changes
    • Urgent security changes - things that just have to happen
    • Semi-urgent performance changes - things that could make a big difference in performance
    • Best-case scenario changes - things that probably _should_ be tended to but could wait

General Notes

  • use single quotes for associative array key strings, e.g. $_GET['id'] instead of $_GET[id]
  • use require_once and include_once when appropriate
  • consider using braces when including variables in double quote strings, e.g. "this is a {$thing[$i][$j]}."
  • use foreach to iterate over arrays instead of for loops (for ($i = 0; $i < count($array); $i++))
  • do not query the database inside loops unless there's no choice whatsoever
  • Security Bugs
  • usage of uriparams() is done even if it doesn't make any sense




Make update's search feature not suck




  • use output caching




  • [126] [159] & [319] consider using foreach



  • [44] use require_once; path should be just config.php
  • [48] $_GET[id] => $_GET['id']; consider checking $_GET['id'] and $_GET['commentid'] with PHP before hitting the db with a query
  • [48] [71] consider combining these queries
  • [106] $_GET[pageid] => $_GET['pageid']


  • [44] [64-66] use include_once
  • why is there a function defined in this file? it's a "config" file
  • This config file should use define() for configuration constants, or at least use a $config[] array so that any configuration variables are clearly marked when used during runtime in a script.
  • That said, this file should only define constants and config variables. It should not be performing any tasks. The act of including things like browserdetect should be done by an init script. This allows us to include site configuration information without having to run anything (something we could use in the RDF generators for PFS, VersionCheck).


  • [48] $db (a boolean) is assigned to the returned value of mysql_select_db, but never used anywhere
  • This file is pointless. DB information should just be defined in config.php. (alanjstr says: the problem would be the includes at the bottom of core/config.php. VersionCheck doesn't need all of that)
  • This file should not create the connection. Connections should only be created when they will be used. As it stands, this forces every page on UMO to create a DB connection whether or not it will be used for anything.
  • Someone could use a script, point at this file and create however many unused DB connections they wanted by simply opening it.
  • The proper way to handle db connections would be to explicitly connect to the db in the script that is doing a query. It reduces confusion and also allows for scripts that don't connect just because they include the config/init scripts.


  • [82] use include_once


  • Use Templates
  • ending body and html tags should be in footer document instead of having them randomly scattered throughout the application.


  • [74] [79] can stripslashes() be used in place of str_replace()? (alanjstr says: we need to come up with a list of when to strip and what to strip. This is different for php variables and database entry)
  • strip_tags is not 100% safe; it can be fooled. Perhaps consider kses
  • [86] strtolower() is unnecessary
  • [112] [123] include_once
  • [128] writeFormKey() always produce the same output, so why incur unnecessary db queries for scripts that call it multiple times? Maybe a static variable should be used.
  • [136] unnecessary db queries, similar to above



alanjstr suggests:


  • Remove file location from URI
  • [40] require_once('config.php');
  • [47] need a more graceful way to handle the error. Can we use page_error() from inc_global.php?
  • [57] [95] use strtotime() in place of mktime()
  • [112] does this query have to be executed on every request?


  • [40] use required_once('config.php');
  • [44] $_POST[id] => $_POST['id']
  • [73] maybe a more user friendly error message ("is null" is not for regular folks. :) )
  • [53] [54] [56] strip_tags... not totally safe
  • [55] [58] [61] these conditional can be use in ternaries when assigning values to the variables
  • [84] what if [83] failed and $sql_result is not a valid result resource?
  • [100] [118] use strtotime() in place of mktime()
  • [122] $row[ID] => $row['ID']


  • [40] require_once('config.php');
  • [44] $_GET[id] => $_GET['id']
  • [44-51] this block of code is also in commenthelpful.php [48-55]
  • [85] [91] [94] why the double quote?




alanjstr suggests:Auto detect whether addon is a theme or extension

  • [2-3] [11-13] [485] [495] [675] [775] consider using include_once and require_once
  • [18] what if $_GET['type'] is some random nonempty string?
  • [20] we can fix the above on this line with something like this: $typename = (isset($typearray[$_GET['type']])) ? $typearray[$_GET['type']] : 'Extension';
  • [48] use switch statement or array instead of nested if elses? (just a readability thing)
  • [56] should check $status before calling move_uploaded_file()
  • [57] assignment unnecessary if we use $destination for chmod() on [58]
  • [58] $chmod_result is not used anywhere
  • [98] we could check $buf before passing it to a function
  • [102] [158-159] [199] use page_error()?
  • [108] function definition in the middle of some code
  • [138] $manifestdata[id] => $manifestdata['id']
  • [153] since $item_id can be an empty string from [149], then why are we building this query and executing it regardless?
  • [172] should consider taking the sql out of the loop, and construct just one query
  • [201] blank content?
  • [250] should have a default value in case $typearray[$type] is no good
  • [264 - 269] instead of using $i and concatenating inside the loop, consider putting all the $row['UserEmail'] into an array, then use implode()
  • [266] $email copies $row['UserEmail'] and then that's concatenated into $authors; $email is unnecessary
  • [279] $catid is unnecessary
  • [381] why is this performed again? we have the results from [367] already.
  • [463] I didn't see any input validation for this (regex, strip_tags, etc.), and this value will be simply echoed in moreinfo.php in a <a> tag. What if I enter something like 'javascript:...' or '</a><script>....'?
  • [483] [493] I don't see $failure being used anywhere
  • [525-528] [558-563] instead of doing multiple queries in a loop, we can build just one insert query
  • [666] $sql_result is not used anywhere
  • [691-694] can build just one insert query; $sql_result is not used anywhere


  • [2] [3] use require_once?
  • [11] [12] [18] [150] [230] include_once?


  • [2] [3] use require_once?
  • [11] [13] [22] [34] [318] use include_once?
  • [126-219] there are quite a few queries in this while loop. If possible, move them out of the loop and query for large datasets, and then perform data look up in PHP. For example, we can use 'TAX.ID IN (SET_OF_IDS)' on [132] where the set of ids can be obtained from the results from [124]


  • [2] [3] use require_once
  • [10] should we not include the footer before terminating the script?


  • [3] $_POST[cattype] => $_POST['cattype']
  • [37] $result is never used
  • [131] [141] $sql_result is assigned but never used




  • FAQ Manager Strips out html
  • [78] no checking for $id, but since it doesn't break anything and it's an admin page, probably ok
  • [91] [103] [175] should check for affected rows



  • [2] when $skipqueue is set to "true", we also skip the comment count



  • Item overview lists categories repeatedly
  • [38] what's $v? I don't see it being used anywhere
  • [57-63] we can replace the content of this loop with just $catnames[] = $row['CatName']; then after the loop, $categories = implode(', ', $catnames);
  • [104-106] unnecessary variable assignments if we are just going to use them for echoing on [109]
  • [156] should check for affected rows


  • List Manager for Themes links to Add Item for Extensions
  • [27] what if $_GET['type'] is some random nonempty string?
  • [30] [55] [58] [99] [153] [410] [422] [423] [430] [556] [557] Use single quotes for the associative array keys
  • [99] none of the inputs are validated (this is an admin page though, so it's probably all right).
  • [167] [409] unnecessary variable assignment
  • [175-184] nested loop with queries that can be rewritten to reduce the number of queries to two
  • [523] this variable is not used anywhere after the increment

alanjstr suggests:







  • [42] this variable is not used anywhere



  • [33] should check for affected rows


  • Bug 275157
  • [46] this variable is not used anywhere
  • [197-202] consider replace these nested if elses with a switch statement; the code that follows doesn't concern with the status, but it should.


  • [26] what if $_GET['type'] is some random nonempty string?
  • [106] Use single quotes for the associative array keys
  • [108] check for affected rows


alanjstr suggests:


  • These files are duplicated in /themes. They should really be abstracted and moved up one directory. Otherwise, any changes happening in /themes will have to be re-implemented in /extensions and vice-versa.
  • These files should all be migrated to reduce code:
    • /extensions/authorprofiles.php -> ../authorprofiles.php
    • /extensions/inc_sidebar.php -> ../core/[something]
    • /extensions/index.php -> ../extensions.php
    • /extensions/moreinfo.php -> ../moreinfo.php
    • /extensions/showlist.php -> ../showlist.php


  • [98] should htmlentities($userwebsite) for the href property
  • [123] query inside a loop, move it out
  • [129] what is this mysterious variable?


  • [43] this line is unnecessary if the next is $typename = 'Extensions';
  • should consider output caching




  • [60] this line is unnecessary if the next is $typename = 'Extensions';
  • [161] [316] Use single quotes for the associative array keys
  • [219-229] consider output caching for this


  • In general, it's not in CVS which is bad.
    • We know what part can't be in CVS (uniqueUrl.php), so it will be committed soon.
  • Why isn't this information in a database?
    • If there was a mime-type to file data map along with some mappings for compatibility, this could be a much simpler script.
    • On that note, why can't the existing UMO database be used with the addition of a mime-type table and a mime-type to version map?
    • If a query with LIMIT 1 could happen, the overhead for this script would be reduced because there wouldn't be so many preg_match calls and string comparisons, which gets pretty expensive when you've got them in a series of if-else's.
  • How do people update their plug-ins? Can they? (doesn't look like it)
  • Logging is going to


  • Would be nice to move this information into a database.


  • [61-78] PHP is not for serving files. Is there a better way to get the user to their file?


  • [57][92] Functions that are declared but never used.
  • [50] The infamous bail() function has no purpose.
  • [42-45] DB information is stored in this file. Why???
  • [47] This should be a require_once().
  • [394] This could be done more thoroughly with a built-in PHP function instead of using preg_replace.


  • Can't touch this. Dun dun dun dun dun dun dun dun.





see comments on /extensions. The code are basically identical.








  • Versions should not be mathematically illogical. i.e. 2.10 should not be a higher version than 2.2! ... :| (alanjstr says: it is in the Gnu and Mozilla World. Take a look at the bugzilla product)
  • [42] DB connection should not happen until after all inputs have been determined and validated
  • [49] Header sent before contents are determined
  • [52] This is a pointless wrapper.
  • [63,113,124,138,144,265] bail() is questionable; shouldn't we let things die quietly? (alanjstr says: can we make bail() output valid XML, maybe outputting the correct format of input?)
  • [59+] Functions defined during script - this should be separated; the logic behind the script is hard to determine because function definitions interrupt it
  • [127-132] This seemingly arbitrary variable renaming makes it hard to follow where data originates.
  • [179-208]
    • use $_SERVER['HTTP_USER_AGENT'] instead of calling getenv multiple times
    • Check to see if HTTP_USER_AGENT is empty or not before doing any checks on it
    • > 0 is not sufficient; it should be !== FALSE since strpos can return 0 as a valid return
    • Code comments indicate 1-6, checks provide IDs 0,2,3,4,5,6 (alanjstr says: replace this with names, not ints)
    • Use logic to create short-circuit; right now it's all if's so the checks will happen even after a match is found and $osid is set -- logically, it's possible for $osid to be set twice
  • [234] Query in a loop. Remove this. This could be replaced with an IN( implode(','$reqItemGuid) ) instead of $reqItemGuid[$i] in the query (assuming some checks on $reqItemGuid before the implode()), reducing the number of queries from count($reqItemGuid) to 1 since $osid and $reqTargetAppGuid are the same. (alanjstr says: good catch. This is new code, so it only handles 1 at a time right now)
  • [253] Is there any way to avoid this OR ?
  • The mime-type "text/rdf" should be "text/xml".