ExtractDBscript

The extract and load scripts described here work as matched pairs: extractMiniDB.py with loadMiniDBonDev.py, and the older extractPartialDB.py with loadExtractDB.py. The extract script pulls the specified number of weeks of data out of a copy of the production database. The load script then loads the resulting TGZ archive, which presumably has been copied to a second server.

Both of the scripts below require the following:

  • psql, pg_dump, pg_dumpall and pg_restore must be in the user's $PATH.
  • The user must be able to connect as the "postgres" superuser.
  • psycopg2 must be installed and working.
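
The first and third requirements can be checked before starting a long run. The following is only a minimal sketch and is not part of either script; the function name missing_prerequisites is hypothetical, and it does not test the "postgres" superuser connection, which needs a live server.

```python
import importlib.util
import shutil

# Command-line tools named in the requirements list above.
REQUIRED_TOOLS = ("psql", "pg_dump", "pg_dumpall", "pg_restore")


def missing_prerequisites(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return a list of problems found; an empty list means the checks passed.

    The lookup functions are parameters so the check can be exercised
    without touching the real environment.
    """
    problems = []
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found in $PATH")
    if find_spec("psycopg2") is None:
        problems.append("psycopg2 is not importable")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Run it as the same shell user that will run the extract or load script, since $PATH can differ between users.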

extractMiniDB.py

Usage: extractMiniDB.py [ noWeeks ]

Example: ./extractMiniDB.py 3

If noWeeks is not supplied, the extract script defaults to 2 weeks. Note that the number of weeks will round to the next higher complete week according to the database partitioning, so you are likely to get an additional partial week.
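
To illustrate the rounding, here is a small sketch of how a cutoff date could be computed. It assumes weekly partitions that begin on Mondays, which this page does not state, so the real partition boundaries may differ. The function name extract_cutoff is hypothetical and is not taken from the script.

```python
from datetime import date, timedelta


def extract_cutoff(today, no_weeks=2):
    """Return the earliest date covered by the extract.

    ASSUMPTION: partitions are weekly and start on Mondays.
    """
    raw_cutoff = today - timedelta(weeks=no_weeks)
    # Step back to the Monday that starts the partition containing raw_cutoff,
    # so the whole partition (and thus a partial extra week) is included.
    return raw_cutoff - timedelta(days=raw_cutoff.weekday())


if __name__ == "__main__":
    # Thursday 2012-03-15 with 2 weeks reaches back to Monday 2012-02-27.
    print(extract_cutoff(date(2012, 3, 15), 2))
```

Under that assumption, a two-week extract run on a Thursday covers 17 days rather than 14.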

This creates a file called extractDB.tgz. While it is working, it may require up to 10GB of free disk space, or more if you are extracting many weeks of data, so run it on a filesystem with plenty of free disk space.
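
A quick way to confirm the free space before starting is sketched below. The 10 GiB threshold mirrors the figure above, but the function name has_enough_space and its parameters are hypothetical, and the script itself may not perform any such check.

```python
import shutil

GIB = 1024 ** 3


def has_enough_space(path=".", required_gib=10, disk_usage=shutil.disk_usage):
    """Return True if the filesystem holding `path` has the required free space.

    `disk_usage` is a parameter so the check can be tested with fake numbers.
    """
    free_bytes = disk_usage(path).free
    return free_bytes >= required_gib * GIB


if __name__ == "__main__":
    if not has_enough_space():
        print("Less than 10 GiB free in the current directory")
```

Raise required_gib when extracting more than the default number of weeks.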

This script needs to be run on the server from which you are extracting data, usually Master02. It needs to be run as the database superuser in the shell (usually "postgres").

loadMiniDBonDev.py

Usage: loadMiniDBonDev.py [ filename ] [ databasename ]

Example: ./loadMiniDBonDev.py extractDB.tgz breakpad

Filename defaults to "extractDB.tgz". Databasename defaults to "breakpad".
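
The optional positional arguments and their defaults can be modelled with argparse as follows. This is an illustration of the documented behaviour, not the script's actual argument handling, and parse_load_args is a hypothetical name.

```python
import argparse


def parse_load_args(argv=None):
    """Parse [ filename ] [ databasename ] with the defaults documented above."""
    parser = argparse.ArgumentParser(prog="loadMiniDBonDev.py")
    # nargs="?" makes each positional argument optional.
    parser.add_argument("filename", nargs="?", default="extractDB.tgz")
    parser.add_argument("databasename", nargs="?", default="breakpad")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_load_args()
    print(args.filename, args.databasename)
```

Because both arguments are positional, a database name can only be given if the filename is given as well.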

This script loads a miniDB database archive created with extractMiniDB.py. WARNING: any database of "databasename" will be dropped and recreated at the beginning of the script. It generally takes two to five hours to run. It will give you lots of output in the shell while running.
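
Because the target database is dropped without further warning, it can be worth checking whether a database of that name already exists before running the load. The sketch below queries the pg_database system catalog through any DB-API cursor, such as one from psycopg2. The function name database_exists is hypothetical, and the connection settings in the comment are assumptions about a typical setup.

```python
def database_exists(cursor, databasename):
    """Return True if a database with this name exists on the server."""
    cursor.execute(
        "SELECT 1 FROM pg_database WHERE datname = %s",
        (databasename,),
    )
    return cursor.fetchone() is not None


# Example use with psycopg2 (assumes a local server and the "postgres" user):
#
#   import psycopg2
#   conn = psycopg2.connect(dbname="postgres", user="postgres")
#   if database_exists(conn.cursor(), "breakpad"):
#       print("breakpad exists and would be dropped by the load script")
```

If the check reports an existing database whose contents matter, load into a different databasename instead.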

This script must be run on the server you are loading, as the database superuser in the shell (usually "postgres"). It currently must be run on DevDB, because it depends on an experimental version of PostgreSQL (9.1 with patches) that is not present on other servers.

Deprecated Scripts

The scripts below have been superseded by later versions.

extractPartialDB.py

Usage: extractPartialDB.py [ noWeeks ]

Example: ./extractPartialDB.py 4

If noWeeks is not supplied, the extract script defaults to 2 weeks. Note that the number of weeks will round to the next higher complete week according to the database partitioning, so you are likely to get an additional partial week.

This creates a file called extractDB.tgz. While it is working, it may require up to 10GB of free disk space, or more if you are extracting many weeks of data, so run it on a filesystem with plenty of free disk space.

This script needs to be run on the server from which you are extracting data.

loadExtractDB.py

Usage: loadExtractDB.py filename databasename

Example: loadExtractDB.py extractDB.tgz breakpad2

Both parameters are optional, and default to "extractdb.tgz" and "breakpad" respectively.

This script needs to be run on the server where you are loading the extracted database. It will take one to four hours to run, depending on the extract size and the target hardware. Like extractPartialDB, it will require several GB of extra disk space in the current directory while running.

Important: loadExtractDB drops and recreates the target database. It will delete any data in a database of the given name.