ReleaseEngineering/How To/Set Up a Freshly Imaged Slave: Difference between revisions

# reboot it.


= Deprecated =
== Windows 2008 64-bit (MDT & unmanaged)==
These machines are almost fully set up by Group Policy; only the following needs to be done after re-imaging:
* follow the [[ReferencePlatforms/Win64#Post-reimaging_steps|post reimaging]] steps.
== Windows 2003 (soon to be obsoleted) ==
=== Activation ===
No action is required other than keeping track of it.
Windows 2003 comes pre-activated. You can check with:
 oobe/msoobe /a
=== Hostname ===
* change the hostname by following the steps in [[ReleaseEngineering/How_To/Set_Up_a_Freshly_Imaged_Slave#How_to_fix_the_hostname_for_Windows|How to fix the hostname for Windows]].
=== OPSI ===
* No action required
The OPSI entry for your slave has been created from a template that includes a package called "passwordupdate". That package is set to run "always", so even if the snapshot has an older password, it is updated immediately to the current one.
== Windows XP (OPSI partially)==
=== tasklist ===
Make sure that you can run the <tt>tasklist</tt> command.
If you can't, ask IT to re-image the machine.
This issue is documented in their imaging instructions:
https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=28575847
=== Hostname ===
* change the hostname by following the steps in [[ReleaseEngineering/How_To/Set_Up_a_Freshly_Imaged_Slave#How_to_fix_the_hostname_for_Windows|How to fix the hostname for Windows]].
If you don't add the DNS change for Windows slaves using OPSI, you will most likely get a [[ReleaseEngineering/OPSI#Mit_Netzlaufwerken_verbinden.2C_bitte_noch_etwas_warten|Mit_Netzlaufwerken_verbinden]] error before the machine logs in.
=== OPSI ===
* No action required
The OPSI entry for your slave has been created from a template that includes a package called "passwordupdate". That package is set to run "always", so even if the snapshot has an older password, it is updated immediately to the current one.
== Windows 7 (unmanaged) ==
The [[ReferencePlatforms/Test/Win7|test reference platform]] is fairly complete.
=== Activation ===
Win7 needs to be activated.  IT should have done this, but check by going to Control Panel -> System -> Activate Windows; a failure to activate will burn builds later.
If it is not activated, ask IT to do so.
=== Hostname ===
* change the hostname by following the steps in [[ReleaseEngineering/How_To/Set_Up_a_Freshly_Imaged_Slave#How_to_fix_the_hostname_for_Windows|How to fix the hostname for Windows]].
== Linux/Mac (old puppet) ==
=== hostname verification ===
==== Linux ====
Verify the hostname, checking that it ends in 'build.(datacenter).mozilla.com':
 hostname --fqdn
To fix it:
* "su -" to become root.
* edit the file /etc/sysconfig/network
** changing the hostname to the host's *long* (with datacenter) fully qualified domain name.
* reboot before running puppet
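The Linux check above can be scripted. The following is a minimal sketch, assuming a POSIX shell; <tt>is_build_fqdn</tt> is a hypothetical helper name, not an existing tool, and it only tests that a name ends in 'build.(datacenter).mozilla.com'.

```shell
#!/bin/sh
# Sketch only: is_build_fqdn is a hypothetical helper, not part of any tool.
# Succeeds when the given name ends in build.<datacenter>.mozilla.com,
# i.e. it has a host part, ".build.", a non-empty datacenter, ".mozilla.com".
is_build_fqdn() {
    case "$1" in
        ?*.build.?*.mozilla.com) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a slave it could be used as <tt>is_build_fqdn "$(hostname --fqdn)" || echo "hostname needs fixing"</tt>.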
==== Mac ====
Verify the hostname, checking that it ends in 'build.(datacenter).mozilla.com':
 hostname
To fix it ('''su -''' to become root):
* run the following: '''scutil --set HostName XXX'''
'''TODO (armenzg): Does anyone know why this step is needed?'''
'''In my experience, even though the hostname looks like talos-r3-leopard-ref (XX), 1) Web Sharing, 2) Remote Login and 3) Remote Management seem to have the correct HostName after running scutil and rebooting.'''
* open '''System Preferences -> Sharing''' and change the host name there
From Armen's experience, this does not require intervention:
* '''Note''': the cltbld user is listed for auto-login in the '''System Preferences -> Accounts -> Login Options''' dialog
Aki couldn't get CotVNC to work:
* '''cmd-K vnc://...''' on mac finder
=== puppet ===
Note that the initial setup of puppet on slaves is very different from that on buildbot masters. On slaves, the daemon is not run; instead, updates are polled for at times when they won't impact jobs. Do not enable the standard puppet service daemon on slaves.
To find the correct master to use with your slave(s), consult the [[ReleaseEngineering/Puppet/Usage#PuppetServers|puppet server]] list.  If your slave isn't using a PuppetAgain master, you'll have to adjust /etc/sysconfig/puppet manually to reflect the correct master value for PUPPET_SERVER.  (Search for your slave's hostname in the http://hg.mozilla.org/build/puppet-manifests *production.pp files at the root of the repo)
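The search described above can be done with <tt>grep</tt>. This is a minimal sketch, assuming you have a local checkout of the puppet-manifests repository; <tt>find_slave_manifests</tt> is a hypothetical helper name, and how slave names appear inside the manifests is not specified here, so it simply looks for the literal string you give it.

```shell
#!/bin/sh
# Sketch only: find_slave_manifests is a hypothetical helper.
# $1 = slave name to look for (short name or FQDN)
# $2 = path to a local checkout of http://hg.mozilla.org/build/puppet-manifests
# Prints the *production.pp files at the root of the checkout that mention it.
find_slave_manifests() {
    slave=$1
    repo=$2
    # -l prints only the names of matching files; -F matches the name literally.
    grep -lF -- "$slave" "$repo"/*production.pp
}
```

The name of the matching file should indicate which master's configuration covers the slave.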
'''Darwin Note''': remember to kill all instances of the run-puppet-and-buildbot.sh script, as it will be running with the refimage config and will keep overwriting your attempts to fix the puppet certs until you do.
talos-r3-fed example (to help with the process):
<pre>
uname -a # check that the hostname is correct and is the FQDN
su - # switch to root
# on linux slave:
rm -rf /var/lib/puppet/ssl/certs/*
# on mac slave:
rm -rf /etc/puppet/ssl/certs/*
</pre>
<pre>
# on master
# you have to figure out the master depending on the datacenter the slave belongs to
puppetca --clean talos-r3-fed64-007.build.scl1.mozilla.com
</pre>
<pre>
# on slave
puppetd --test --server scl-production-puppet.build.scl1.mozilla.com
</pre>
<pre>
# on puppet master
puppetca --sign talos-r3-fed64-007.build.scl1.mozilla.com
</pre>
<pre>
# on slave
puppetd --test --server scl-production-puppet.build.scl1.mozilla.com
# wait a few seconds and it should reboot
</pre>
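Because the certificate directory differs between Linux and Mac slaves, the cleanup step in the example above can be wrapped in a small helper. This is a sketch under the assumption that the two paths shown above are the right ones for your slave; <tt>puppet_cert_dir</tt> is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: puppet_cert_dir is a hypothetical helper.
# $1 = output of `uname -s` on the slave.
# Prints the certificate directory used in the example above, or fails
# for an OS it does not know about.
puppet_cert_dir() {
    case "$1" in
        Linux)  echo /var/lib/puppet/ssl/certs ;;
        Darwin) echo /etc/puppet/ssl/certs ;;
        *)      echo "unknown OS: $1" >&2; return 1 ;;
    esac
}
```

As root, use it as <tt>dir=$(puppet_cert_dir "$(uname -s)") && rm -rf "$dir"/*</tt>. The <tt>&&</tt> matters: it prevents <tt>rm</tt> from running with an empty path if the helper fails.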
* get the slave talking to puppet.  This will require a lot of repetitive work:
** adjust the master it talks to so that it is appropriate to the slave's location
*** NOTE: syncing against the correct master adjusts these values (I believe).
*** Linux builder: /etc/sysconfig/puppet
*** Linux tester: /home/cltbld/.config/autostart/gnome-terminal.desktop
*** Mac: /Library/LaunchDaemons/com.reductiv*.plist
** run puppetd --test --noop --server $server_you_chose
*** If you get an error about directory not existing on linux, run without "<tt>--noop</tt>" once, so the directories can be created.
*** If you get an error that the slave can't access /N to fetch certain packages, update the fileserver.conf on the slave to ensure that the subnet the slave resides on is included in the list of subnets that can access /N.  Otherwise the slave won't be able to access the resources it needs from the /N directory served by Apache.
*** note that the scl server has a funny name!
*** if you see errors about certificates, remove the certificate files (''/var/lib/puppet/ssl/certs/*'', ''/var/puppet/ssl/certs/*'', or ''/etc/puppet/ssl/certs/*'', depending on the slave)
*** run ''puppetca --sign $slave_fqdn'' repeatedly on the appropriate puppet master. Cron runs it every 60 seconds, but waiting for the cron task just slows you down.
*** if told to, run ''puppetca --clean $slave_fqdn'' on the master; this is needed when the master has an old key for this slave
*** Note that if you see a successful run but nothing happens, you're probably talking to a master which has no configuration for this slave - check that you're talking to the right master, and that the master's site.pp file contains the slave's name, and try again.
** once puppet hits the right master, it will both blow away the certificates (even though they were correct) and reboot.  So you'll need to wait for a restart, log in, and go through the above process again.  Hopefully you'll only need to do this once.
* once puppet is done eviscerating itself, have a look at the slave's twistd.log.  If it's getting an UnauthorizedLogin for connection to the staging master, fix the password or add the slave to the master's config.  Otherwise, watch the staging master until the slave finishes a job.
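The twistd.log check in the last step can be done with <tt>grep</tt>. A minimal sketch, assuming you pass the path to the slave's twistd.log yourself (its location is not given here); <tt>has_unauthorized_login</tt> is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: has_unauthorized_login is a hypothetical helper.
# $1 = path to the slave's twistd.log
# Succeeds when the log contains at least one UnauthorizedLogin error.
has_unauthorized_login() {
    grep -q 'UnauthorizedLogin' "$1"
}
```

If it succeeds, fix the password or add the slave to the master's config, as described above.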
== How to fix the hostname for Windows ==
Rather than replicating the information in each section, here are the instructions for all of our Windows platforms.
* right-click on 'My Computer', go to 'Properties', 'Computer Name'
* change the hostname
* the domain name should be ''build.mozilla.org''
** Otherwise, click 'Change', type the computer name, click 'More', type the domain, and click OK until it restarts.
Windows slaves will come back from a re-image with "talos-r3-xp-ref" or "talos-r3-w7-ref" as the hostname.
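When checking a list of slave names from another machine, the two reference-image names above can be detected with a small helper. This is only a sketch, assuming a POSIX shell is available where you run it; <tt>is_refimage_hostname</tt> is a hypothetical name, and it knows only the two hostnames quoted above.

```shell
#!/bin/sh
# Sketch only: is_refimage_hostname is a hypothetical helper.
# Succeeds when the given name is one of the two Windows reference-image
# hostnames mentioned above, meaning the slave still needs renaming.
is_refimage_hostname() {
    case "$1" in
        talos-r3-xp-ref|talos-r3-w7-ref) return 0 ;;
        *) return 1 ;;
    esac
}
```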