Build:Release Automation: Difference between revisions
Revision as of 14:15, 29 August 2007
Intro
Firefox and Thunderbird releases are currently done using the Bootstrap automation scripts, which call into Tinderbox client to do the actual build.
Currently, a human operator must:
- log into the appropriate machine
- check out bootstrap
- run the appropriate bootstrap command
This must be done ~7 times (once per machine), in the right order, to produce a successful release.
Work is ongoing to let Buildbot drive the release automation, enabling an "end-to-end" run in which no human is needed to move the process from step to step and machine to machine.
Bootstrap
Bootstrap is a simple Perl framework intended to take the formerly manual release process and automate it, with as little change to the process as possible.
Bootstrap is invoked using the "release" command, and supports a set of high-level "steps":
Tag - tag, branch, apply version bumps to all relevant files.
TinderConfig - generate tinderbox config files (mozconfig/tinder-config.pl)
Build - invoke Tinderbox client to create an en-US build and publish to FTP
Source - create a source tarball and push it to FTP
Repack - invoke Tinderbox client to create localized versions of en-US build and publish to FTP
PatcherConfig - create a Patcher config file for generating updates
Updates - invoke Patcher to create partial updates and AUS configuration
Stage - create a staging area and rename files for release
Sign - not implemented
Bootstrap Steps
A Bootstrap "step" must implement 2 required methods:
Execute - carry out the actual function of the step, e.g. Build
Verify - run an automated test
Additionally, there are 2 optional methods:
Push - upload the appropriate changes for testing, e.g. upload build to FTP
Announce - send an email announcing that the step has finished.
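The step contract above can be modeled in a few lines. Bootstrap itself is written in Perl, so the Python below is only an illustrative sketch: the class names `Step` and `BuildStep` and the lowercase method names are assumptions made for this example, not Bootstrap's actual API.

```python
class Step:
    """Illustrative model of a Bootstrap step (hypothetical, not Bootstrap's code)."""

    def execute(self):
        # Required: carry out the actual function of the step.
        raise NotImplementedError("Execute is required")

    def verify(self):
        # Required: run an automated test of what execute() produced.
        raise NotImplementedError("Verify is required")

    def push(self):
        # Optional: upload results for testing; the default does nothing.
        return None

    def announce(self):
        # Optional: send a notification; the default does nothing.
        return None


class BuildStep(Step):
    """Hypothetical step that only records which required methods ran."""

    def __init__(self):
        self.log = []

    def execute(self):
        self.log.append("execute")

    def verify(self):
        self.log.append("verify")
```

A subclass that forgets one of the two required methods fails loudly the first time that method is called, while Push and Announce can be left out without consequence.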
Using Bootstrap
If the "release" command is invoked with no parameters, it will attempt to start at the first step and call the methods in this order:
- Execute
- Verify
- Push
- Announce
As each step completes successfully, the next will be invoked.
There are several command-line options, shown by calling "release -h":
Usage: release [-l] [-s Step] [-o Step] [-e | -v | -p | -a] [-h]
-l list all Steps
-s start at Step
-o only run one Step
-e only run Execute
-v only run Verify
-p only run Push
-a only run Announce
-h this usage message
For example, to only run the Push method on the Build step:
./release -o Build -p
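The way the options select work can be sketched as a small planning function. This is a Python model, not Bootstrap's Perl implementation. It assumes that the steps run in the order they are listed in the Bootstrap section (including Sign, which is listed there as not implemented) and that "-o" takes precedence if both "-s" and "-o" are given. The function name `plan` and its parameters are hypothetical.

```python
STEPS = ["Tag", "TinderConfig", "Build", "Source", "Repack",
         "PatcherConfig", "Updates", "Stage", "Sign"]
METHODS = ["Execute", "Verify", "Push", "Announce"]


def plan(start=None, only=None, method=None):
    """Return the (step, method) calls a run would make, in order.

    start  -- models "-s Step": begin at this step and continue to the end
    only   -- models "-o Step": run just this step (assumed to win over start)
    method -- models -e/-v/-p/-a: restrict every selected step to one method
    """
    if only is not None:
        steps = [STEPS[STEPS.index(only)]]      # ValueError for unknown steps
    elif start is not None:
        steps = STEPS[STEPS.index(start):]
    else:
        steps = list(STEPS)
    if method is not None and method not in METHODS:
        raise ValueError("unknown method: %s" % method)
    methods = [method] if method is not None else list(METHODS)
    return [(step, name) for step in steps for name in methods]


# Models "./release -o Build -p"
print(plan(only="Build", method="Push"))
```

With no arguments the plan starts at Tag and calls Execute, Verify, Push and Announce for each step in turn.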
Roles and resource requirements
- buildbot master
- keeps logs, manages overall process
- ftp/stage.m.o
- fileserver, both public and private areas
- FTP candidates - 20GB storage
- e.g. stage:/home/ftp/pub/firefox/nightly/2.0.0.4-candidates/
- FTP private staging - 20GB storage
- e.g. stage:firefox-2.0.0.4/
- FTP release - 6GB storage
- e.g. stage:/home/ftp/pub/firefox/releases/2.0.0.4/
- "tagging" builder
- checks out source and applies tag
- 2GB storage
- e.g. karma:/builds/tags/FIREFOX_2_0_0_4_RELEASE/
- "source archive" builder
- builds source archive and pushes for QA
- "linux/mac/win32 firefox builders"
- builds firefox and pushes for QA
- needs 2GB memory, 6GB storage (each)
- e.g. prometheus-vm:/builds/tinderbox/Fx-Mozilla1.8-Release/
- "updates builder"
- downloads and inventories a set of complete firefox updates, generates partial updates, creates AUS configuration ("snippets")
- updates - 1GB memory, 5GB storage
- e.g. prometheus-vm:/builds/updates/firefox-2.0.0.4/
- "stage builder"
- creates private staging area on FTP, renames files for release
- see "fileserver" requirements, above
- Automatic Update Server (AUS), aus2.m.o
- 10GB for config files, backups and staging area
- e.g. /opt/aus2/incoming/3/Firefox/2.0.0.4/, /opt/aus2/snippets/staging/20070523-Fx-2.0.0.4/, /opt/aus2/snippets/backup/20070611-1-pre-20070611-Fx-2.0.0.4.tar.bz2
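As a rough capacity check, the storage figures above can be tallied. The sketch below uses only the numbers listed in this section. The "source archive" builder has no figure given and the "stage builder" reuses the fileserver areas, so neither adds to the total. The dictionary keys are labels chosen for this example.

```python
# Storage requirements in GB, taken from the list above.
STORAGE_GB = {
    "FTP candidates": 20,
    "FTP private staging": 20,
    "FTP release": 6,
    "tagging builder": 2,
    "linux firefox builder": 6,
    "mac firefox builder": 6,
    "win32 firefox builder": 6,
    "updates builder": 5,
    "AUS": 10,
}


def total_gb(requirements, prefix=""):
    """Sum the requirements whose label starts with prefix (all if empty)."""
    return sum(size for name, size in requirements.items()
               if name.startswith(prefix))


print(total_gb(STORAGE_GB, "FTP"))  # fileserver areas only: 46
print(total_gb(STORAGE_GB))         # everything with a stated figure: 81
```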
Buildbot
We have a vendor branch in mozilla/tools/buildbot, based on Buildbot's 0.7.5 release.
Mozilla-specific Buildbot install instructions
Notes on staging setup
Buildbot master basedir is ~buildmaster/TestBot
The bootstrap.cfg is pulled from the master dir.
Slave basedirs are in cltbld's home directory on the appropriate machine, e.g. ~cltbld/linux-slave1
Changes can be inserted with "buildbot sendchange" on the master e.g.:
buildbot sendchange --master=localhost:9989 -u rhelmer -m"latest bootstrap from CVS" test
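If the change needs to be sent from a script rather than typed by hand, the same command line can be assembled programmatically. The following is a minimal sketch that uses only the options shown in the example above. The helper name `sendchange_argv` is hypothetical, and the code only builds and prints the command, it does not run it.

```python
import shlex


def sendchange_argv(master, user, comments, files):
    """Build the argument list for "buildbot sendchange"."""
    return ["buildbot", "sendchange", "--master=" + master,
            "-u", user, "-m", comments] + list(files)


argv = sendchange_argv("localhost:9989", "rhelmer",
                       "latest bootstrap from CVS", ["test"])
# Show the command as it could be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the list to `subprocess.run(argv, check=True)` on the master would actually submit the change. Keeping the arguments as a list avoids shell quoting problems with the comment text.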
Bootstrap uses a local CVS mirror, and the "tag", "source", "updates", and "stage" builders are run by a local buildslave.
The bootstrap Makefile has the following targets:
- stage/clean_stage
- create/remove basic fileserver/tag/source/updates/stage environment
- cvsmirror/clean_cvsmirror
- create/remove cvsmirror in /builds/cvsmirror
These targets are hard-coded to prepare for a 2.0.0.4 release.
There must be "cltbld" and "symbols" accounts on the staging FTP server that the build machines' cltbld accounts can connect to via SSH without a password.
- must accept staging-build-console's hostkey via this SSH tunnel:
- set up staging FTP server
mkdir /home/ftp /builds /data/cltbld
chown cltbld /home/ftp /builds/ /data/cltbld
cvs co /mofo/release/stage/ to /data/cltbld/bin
groupadd firefox
- set up staging AUS server
# TODO - auto-update
mkdir -p /opt/aus2/snippets/staging/backup /opt/aus2/incoming /opt/aus2/app
# check out aus2
cd /opt/aus2/
cvs -d /builds/cvsmirror/cvsroot/ co -d app/ -r AUS2_PRODUCTION mozilla/webtools/aus/xml
cd app && ln -s ../incoming ./data
# install apache
yum install httpd
Production setup HOWTO for linux/mac/win32
- build-console setup
- check out /mofo/release/stage to /data/cltbld/bin
- NOTE - this is for the firefox-src-tarball-nobuild script, which checks out a tag from CVS and creates a source archive. This should be reimplemented in the bootstrap Source step
- (Win32/Mac only) install Config::General
cd /tools/dist
wget http://search.cpan.org/CPAN/authors/id/T/TL/TLINDEN/Config-General-2.33.tar.gz
tar xfvz Config-General-2.33.tar.gz
cd Config-General-2.33
perl Makefile.PL
It's OK to ignore this warning from "perl Makefile.PL": Warning: the following files are missing in your kit: t/test.rc.out
sudo make install
- (Linux only) prepend custom GCC to the path in ~/.bash_profile
export PATH="/usr/gcc-3.3.2rh/bin:/opt/local/bin:/tools/buildbot/bin:/tools/twisted/bin:/tools/twisted-core/bin:$PYTHONHOME/bin:$PATH"
- create logs dir
$ mkdir -p /tools/dist/logs
$ mkdir -p /builds/logs
- (Mac only) Install 7z, either by downloading it or by copying it from bm-xserve01, which is what we did here. Placing the file in /usr/bin puts it on the PATH set by cltbld's .profile automatically.
$ cd /usr/bin
$ sudo rsync -av cltbld@bm-xserve01.build.mozilla.org:/usr/local/bin/7z .
- look for Tinderbox directory
#linux: if tinderbox name is not "Fx-Mozilla1.8-Release" exactly, symlink it
ln -s /builds/tinderbox/Fx-Mozilla1.8-release /builds/tinderbox/Fx-Mozilla1.8-Release
Check out tinderbox configs:
# win32
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/win32
# linux
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/linux
# macosx
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/macosx
- set up Tinderbox l10n build directory
# linux
cd /builds/tinderbox/
# win32
cd /cygdrive/c/builds/tinderbox/
mkdir Fx-Mozilla-1.8-l10n-Release
cd Fx-Mozilla-1.8-l10n-Release
../mozilla/tools/tinderbox/install-links
rm build-seamonkey.pl
ln -s ../mozilla/tools/tinderbox/build-firefox.pl .
ln -s build-firefox.pl build-seamonkey.pl
rm post-mozilla.pl
ln -s post-mozilla-release.pl post-mozilla.pl
Check out tinderbox configs:
# win32
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/win32
# linux
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/linux
# macosx
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/macosx
ln -s tinderbox-configs/mozconfig .
ln -s tinderbox-configs/tinder-config.pl .
- Install buildbot
- running as "cltbld", install slave
#linux
$ cd ~
$ buildbot create-slave linux-slave1 build-console.build.mozilla.org:9989 linux-slave1 password
#win32
c:\\buildtools\\python24\\scripts\\buildbot create-slave c:\\win32-slave1 build-console.build.mozilla.org:9989 win32-slave1 password
- edit the admin and host pages in ~/linux-slave1/info/
- start slave
#linux
buildbot start /home/cltbld/linux-slave1
# win32
c:\\buildtools\\python24\\scripts\\buildbot start c:\\win32-slave1
Just for testing
- build-console
- use "stage" target in bootstrap's Makefile
- Move prod ssh keys out of the way, and copy in "staging" keys:
cd ~
mv ~/.ssh ~/ssh.prod
# recreate ~/.ssh so that scp has a destination directory
mkdir -m 700 ~/.ssh
scp cltbld@staging-prometheus-vm:~/.ssh/id_rsa .ssh/
- Move prod tinderbox-configs and put staging-build-console in Root:
# win32
cd /cygdrive/c/builds/tinderbox/Fx-Mozilla-1.8-Release
# linux
cd /builds/tinderbox/Fx-Mozilla-1.8-Release
cp -rp tinderbox-configs tinderbox-configs.prod
# change root to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
vi tinderbox-configs/CVS/Root
Same for l10n tinderbox build directories:
# win32
cd /cygdrive/c/builds/tinderbox/Fx-Mozilla-1.8-l10n-Release
# linux
cd /builds/tinderbox/Fx-Mozilla-1.8-l10n-Release
cp -rp tinderbox-configs tinderbox-configs.prod
# change root to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
vi tinderbox-configs/CVS/Root
- /data/cltbld/bin/firefox-src-tarball-nobuild has a hardcoded CVSROOT; change it to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
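A CVS working copy keeps a CVS/Root file in every checked-out directory, so a checkout with subdirectories has more than one file to edit. The sketch below, offered as one possible way to script the "change root" step, rewrites every CVS/Root file under a directory. The function name `rewrite_cvs_roots` is hypothetical, and the sketch assumes the tinderbox-configs.prod backup copy has already been made as shown above.

```python
import os


def rewrite_cvs_roots(checkout_dir, new_root):
    """Point every CVS/Root file under checkout_dir at new_root.

    Returns the sorted list of files that were rewritten.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(checkout_dir):
        # Only touch files named "Root" that live in a CVS admin directory.
        if os.path.basename(dirpath) == "CVS" and "Root" in filenames:
            root_file = os.path.join(dirpath, "Root")
            with open(root_file, "w") as handle:
                handle.write(new_root + "\n")
            changed.append(root_file)
    return sorted(changed)
```

For the staging setup described here the call would be `rewrite_cvs_roots("tinderbox-configs", "cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot")`, run from the tinderbox build directory.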
Production changes
Changing roles
- move to dedicated machines, e.g. production-prometheus-vm
- CVS tag on linux slave or on build-console? build-console
- l10nverify on mac slave ok, need to fix "unpack all xpis bug"
- Mac - is identical hardware req'd? What happens if prod hardware dies? fireball still worked on, scarce PPC hardware options.
Available PPCs:
- 01 - head node
- 02 - production
- 03 - given to community
- 04 - 1.8.0
- 05 - dead
- 06 - given to community
- fireball - unknown
discussed: planned switch to Intel.
Later, more PPC hardware brought online, so decided to not switch to Intel as part of the automation rollout.
Staging/Production Buildbot master differences
- Signing - prod waits for signed bits, stage fakes w/ symlink ok
- Bootstrap - prod pulls tag e.g. RELEASE_AUTOMATION_M5, staging pulls tip ok
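The Bootstrap difference between the two masters comes down to whether the checkout passes a tag. A minimal sketch of that choice follows. The function name, the environment labels, and the module path used in the usage line are assumptions made for illustration. The tag name and CVSROOT are the ones mentioned on this page.

```python
# Tag to check out per environment; None means "no -r option", i.e. the tip.
BOOTSTRAP_TAG = {
    "production": "RELEASE_AUTOMATION_M5",
    "staging": None,
}


def bootstrap_checkout_argv(environment, cvsroot, module):
    """Build the cvs checkout command for the given environment."""
    argv = ["cvs", "-d", cvsroot, "co"]
    tag = BOOTSTRAP_TAG[environment]   # KeyError for unknown environments
    if tag is not None:
        argv += ["-r", tag]
    argv.append(module)
    return argv


# "mozilla/tools/release" is an assumed module path for Bootstrap.
print(bootstrap_checkout_argv("production", "/builds/cvsmirror/cvsroot",
                              "mozilla/tools/release"))
```

Keeping the tag in one table means a new automation milestone only requires changing a single value on the production master.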
Outstanding issues
- How to handle bootstrap logs: remove them between runs? We don't want them accumulating on slaves. Resolution: remove at start
- How to do a mock release: fake version (e.g. 1.2.3.4)? Early 2.0.0.7, that we know we won't release? Resolution: 2007 rc1
- "Source" and "Staging" steps: install a buildslave on stage, or stage everything on build-console? Resolution: use build-console
- Make sure QA checks e.g. top 5 extensions after Mac Intel switch
Caveats
Manual steps
NOTE - manual steps should be done in this order
- bootstrap configuration
- kicking off buildbot ("buildbot sendchange ...")
- update verification config (working on this in bug 373995). For now, need to modify and check in the appropriate update configs, after all en-US builds but before updates
- win32 signing, after win32 l10n repack but before updates
- final installer signing
Bugs
- Need to use cvs (in master.cfg) from ShellCommand to make sure that we always use the proper bootstrap tag
Bootstrap
- (needs bug filed) FTP area keeps getting set read-only; could be a bug in the rsync from the build machines, or maybe in the initial staging FTP area setup?
- (build) permissions on all ${os}_info.txt files were read-only user
- (updates) permissions on AUS config (snippets) and partial MARs were wrong
- (stage) permissions problems for non-stage-merged dirs; did 775 for dirs and 664 for files. group perms seem ok, and batch-skel/stage-merged ok as well.
- (needs bug filed) bootstrap needs to automatically sync with stage
- build, repack, sign, updates, stage
- (bug 394034) in private repos, in /mofo/release/stage/firefox-src-tarball-nobuild, there is a script that is called in the "source" step. Has hardcoded CVSROOT which needs to be updated to use ":ext:cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot"
- might be better to just delete this script and create a makefile target to tar up files. This is the only file we use from the private repo (we think), so if this file is deleted, we can stop using the private repo.
- also this is currently a problem because build-console cannot access anonymous CVS
- (bug 373995) l10n needs the URL it downloads builds from to be configurable as well
- (bug 373995) update verification needs config file; needs to be rewritten to use patcher.cfg
- (enhancement)(needs bug filed) should set buildbot up to send mail on any failure; currently we depend on bootstrap alone for this
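The manual permissions workaround noted above for the staging area (775 for directories, 664 for files) can be scripted until the underlying bug is found. The following is a minimal sketch and not part of Bootstrap. The function name `fix_permissions` is hypothetical, and symlinks are skipped on the assumption that link targets should be left alone.

```python
import os

DIR_MODE = 0o775   # rwxrwxr-x
FILE_MODE = 0o664  # rw-rw-r--


def fix_permissions(top):
    """Apply 775 to directories and 664 to regular files under top."""
    os.chmod(top, DIR_MODE)
    # Top-down walk: each directory is made readable before it is entered.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, DIR_MODE)
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so the link target is not changed by accident.
            if not os.path.islink(path):
                os.chmod(path, FILE_MODE)
```

The script has to run as the owner of the files (or as root), since only the owner may change a file's mode.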
Buildbot bugs
- buildbot default timeout too short. 5 sec isn't always enough, and you can get a "timed out" message in the slave logs, even though the slave started "normally". buildbot bug#68.
- sometimes buildmaster sees buildslave correctly, confirms ping ok, but never assigns pending work to the slave. Doing "buildmaster refresh" is not enough, you need to do "buildmaster stop/start". Restarting the slave does not help. buildbot bug#85
- on win32, console output is not logged (goes to the DOS console running buildbot :( )
- file buildbot bug to handle kill on win32. Add details linking to bsmedberg fix. buildbot bug#77
- link to history for old builds at bottom of page (ala tinderbox server). buildbot bug#67
- meta-refresh tag for waterfall page buildbot bug#69
- buildbot UI to contain way to force build dependent steps instead of just doing current step. buildbot bug#78
- When using the CVS Source step on a Mac OSX slave, if a CVS directory is found on the path, buildbot will attempt to use it as if it were a CVS binary.
- steps which start within a few seconds of each other show as same start time on waterfall page buildbot bug#88
Tinderbox/Build bugs
- (needs bug filed) "scp -r" does not work on pacifica-vm; needed for l10n
- (needs bug filed) tinderbox symbol server should be configurable
- temp workaround: tinderbox Makefile.in needs to be hacked:
FC_TUNNEL = ssh -$(FC_SSH_VERSION) -f -L 8080:hal:80 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20
SYM_TUNNEL = ssh -$(SYM_SSH_VERSION) -f -L 2222:localhost:22 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20