Adding a new builder to our Koji

Create a new AMI from the snapshot. It runs RHEL 6.2. Make
sure the security group is "launch-wizard-1", the zone is us-east-1d, and it
has at least one local storage volume (a second one can be used for swap).
Note that the initial boot takes some time (10 minutes or more) because the
snapshot was not clean and fsck has to run. I had to set the AKI to
aki-1eceaf77, but this should be optional, I think.

Edit /etc/hosts and update the entry for koji.katello.org: it must resolve to the internal IP address of the master instance.
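A quick way to confirm the entry is to read it back from the hosts file. The sketch below is only an illustration: hosts_ip is a hypothetical helper name, and 10.0.0.5 is a placeholder, not the real internal address of the master.

```shell
# Print the first IP address mapped to a host name in a hosts-format file.
# Usage: hosts_ip HOSTNAME [FILE]   (FILE defaults to /etc/hosts)
hosts_ip() {
    awk -v host="$1" '
        {
            sub(/#.*/, "")                   # strip comments
            for (i = 2; i <= NF; i++)        # field 1 is the IP, the rest are names
                if ($i == host) { print $1; exit }
        }
    ' "${2:-/etc/hosts}"
}

# Example /etc/hosts line (placeholder address):
#   10.0.0.5   koji.katello.org
```

Running hosts_ip koji.katello.org on the builder should print the internal IP of the master; getent hosts koji.katello.org shows what the resolver actually returns.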

Then stop the kojid service, run mkfs.ext4 on local disk 1 and mount it:

/dev/xvdf1 on /mnt/tmp type ext4

Additionally, set up swap on local disk 2 (preferred) and enable it.
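The disk preparation could be scripted roughly as follows. This is a sketch under assumptions: /dev/xvdf1 and /mnt/tmp come from the mount line above, but the swap device name (/dev/xvdg1 in the usage note) is a guess, and run and prepare_local_disks are hypothetical helper names. By default the commands are only printed; nothing is executed unless APPLY=1 is set.

```shell
# Print the commands by default; set APPLY=1 to actually run them (as root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Usage: prepare_local_disks DATA_DEV MOUNTPOINT [SWAP_DEV]
prepare_local_disks() {
    run service kojid stop           # the builder must not run during the change
    run mkfs.ext4 "$1"               # destroys any data on the device
    run mkdir -p "$2"
    run mount "$1" "$2"
    if [ -n "${3:-}" ]; then         # optional second local disk used for swap
        run mkswap "$3"
        run swapon "$3"
    fi
}
```

For example, prepare_local_disks /dev/xvdf1 /mnt/tmp /dev/xvdg1 prints the six commands for review; check the device names with fdisk -l before running anything with APPLY=1.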

Create the directory structure on /mnt/tmp and the symlinks:

mkdir -p /mnt/tmp/var/{lib,tmp,cache} /mnt/tmp/var/lib/mock
chmod 777 /mnt/tmp/var/{lib,tmp,cache} /mnt/tmp/var/lib/mock
mkdir -p /mnt/tmp/external-repos
chmod g+ws /mnt/tmp/var/lib/mock
ln -s /mnt/tmp/var/tmp /var/tmp
ln -s /mnt/tmp/var/lib/mock /var/lib/mock
ln -sf /mnt/tmp/var/cache/yum /var/cache/yum

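Make sure the directories have the correct permissions. Note that ln -s creates the link inside the destination when that path already exists as a directory, so check that /var/tmp, /var/lib/mock and /var/cache/yum really are symlinks.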
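The permissions can be checked with a small script. The expected modes below are derived from the chmod commands above (777 everywhere, plus setgid on the mock directory, which stat reports as 2777); check_modes is a hypothetical helper name and the sketch assumes GNU stat.

```shell
# Print every directory whose mode differs from the expected one.
# Usage: check_modes ROOT   (ROOT is /mnt/tmp on the builder)
check_modes() {
    cm_rc=0
    for cm_entry in var/lib:777 var/tmp:777 var/cache:777 var/lib/mock:2777; do
        cm_path=$1/${cm_entry%%:*}       # part before the colon: relative path
        cm_want=${cm_entry##*:}          # part after the colon: octal mode
        cm_have=$(stat -c '%a' "$cm_path" 2>/dev/null)
        if [ "$cm_have" != "$cm_want" ]; then
            echo "$cm_path: ${cm_have:-missing} (expected $cm_want)"
            cm_rc=1
        fi
    done
    return "$cm_rc"
}
```

check_modes /mnt/tmp prints nothing and returns 0 when everything matches. The symlinks themselves can be inspected with readlink /var/tmp /var/lib/mock /var/cache/yum.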

Add the new builder via the koji-admin tool and set its capacity (4.00 for
m1.large).
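With the standard koji command-line client (assuming that is what the admin tool refers to, and that you have admin credentials), registration is typically two calls: add-host, then edit-host with --capacity. The sketch below only prints the commands; koji_register_cmds is a hypothetical helper, and the hostname and architecture are placeholders.

```shell
# Print the koji commands for registering a builder (nothing is executed).
# Usage: koji_register_cmds HOSTNAME CAPACITY ARCH...
koji_register_cmds() {
    kr_host=$1
    kr_capacity=$2
    shift 2                              # remaining arguments are architectures
    echo "koji add-host $kr_host $*"
    echo "koji edit-host --capacity=$kr_capacity $kr_host"
}

# Placeholder host name and architecture:
koji_register_cmds builder2.example.org 4.0 x86_64
```

Review the printed commands and run them by hand from a machine where the koji client is configured for our hub.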

Delete the RHUI repository files from /etc/yum.repos.d and
subscribe to updates via RHN CDN. Apply all security updates and reboot.
Take care: EPEL contains newer koji packages. DO NOT update koji from EPEL
(disable the repository instead).
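One way to disable EPEL is to flip enabled=1 to enabled=0 in its repo file. This is a sketch, assuming GNU sed and the usual file name /etc/yum.repos.d/epel.repo (not confirmed for this image); disable_repo_file is a hypothetical helper name.

```shell
# Set enabled=0 in every section of a yum .repo file.
# Usage: disable_repo_file /etc/yum.repos.d/epel.repo
disable_repo_file() {
    # Only lines that are exactly "enabled=1" (optional spaces) are rewritten.
    sed -i 's/^enabled[ ]*=[ ]*1[ ]*$/enabled=0/' "$1"
}
```

An alternative, if EPEL has to stay enabled for other packages, is to add an exclude=koji* line to the EPEL section so that yum never takes koji packages from it.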

You should now be ready to start kojid. Before that, make sure that all NFS
volumes are mounted (you will need to create the mountpoints):

koji.katello.org:/koji on /mnt/koji type nfs
koji.katello.org:/exports/koji/packages on /mnt/koji/packages type nfs
koji.katello.org:/repos on /mnt/koji/repos type nfs
koji.katello.org:/external-repos on /mnt/tmp/external-repos type nfs
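Before starting kojid, the four mounts listed above can be verified against /proc/mounts. A minimal sketch; missing_mounts is a hypothetical helper name.

```shell
# Print every expected NFS mount point that is absent from a mounts table.
# Usage: missing_mounts [MOUNTS_FILE]   (defaults to /proc/mounts)
missing_mounts() {
    for mm_mp in /mnt/koji /mnt/koji/packages /mnt/koji/repos /mnt/tmp/external-repos; do
        # Field 2 of /proc/mounts is the mount point, field 3 the filesystem type.
        awk -v mp="$mm_mp" '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
            "${1:-/proc/mounts}" || echo "$mm_mp"
    done
}
```

Empty output from missing_mounts means all four volumes are mounted as NFS; any path it prints still needs a mkdir -p and a mount.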

Start kojid and watch /var/log/kojid.log.

TODO: Use FS-Cache/NFS cache to speed up NFS access: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/fscachenfs.html

Troubleshooting

Tasks are not being picked up

The Koji builder (kojid) monitors system load, and if the load exceeds the capacity set on the koji master (you set it in postgres, defaults to 4), it does not start any tasks. The trick is to set the capacity to a high number (e.g. 999) and set maxjobs in kojid.conf to the number of CPU cores + 1. Restart kojid and it will start picking up tasks.
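The maxjobs rule of thumb is simple arithmetic; a sketch follows. suggest_maxjobs is a hypothetical helper name, and nproc (from coreutils) is assumed to be available on the builder.

```shell
# Suggest maxjobs as CPU cores + 1, following the rule of thumb above.
# Usage: suggest_maxjobs [CORES]   (defaults to the output of nproc)
suggest_maxjobs() {
    sm_cores=${1:-$(nproc)}
    echo $((sm_cores + 1))
}

# Usage on the builder, printing the line to put in kojid.conf:
#   echo "maxjobs=$(suggest_maxjobs)"
```

On a two-core m1.large this gives maxjobs=3. The capacity can probably also be raised without touching postgres directly, using koji edit-host --capacity=999 followed by the builder's hostname.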

Tasks are stuck in queue

We use NFS for several directories, and NFS can easily get stuck when mounted in "hard" mode, effectively blocking processes forever. Check the NFS mounts.
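Two checks usually narrow this down: which NFS mounts use the hard option, and whether any process is stuck in uninterruptible sleep (state D). A sketch; hard_nfs_mounts and blocked_processes are hypothetical helper names.

```shell
# List NFS mount points that use the "hard" option.
# Usage: hard_nfs_mounts [MOUNTS_FILE]   (defaults to /proc/mounts)
hard_nfs_mounts() {
    # Field 3 is the filesystem type, field 4 the comma-separated options.
    awk '$3 ~ /^nfs/ && ("," $4 ",") ~ /,hard,/ { print $2 }' "${1:-/proc/mounts}"
}

# List processes in uninterruptible sleep (state D), the usual sign of a
# process blocked on an unresponsive NFS server.
blocked_processes() {
    ps -e -o stat= -o pid= -o comm= | awk '$1 ~ /^D/'
}
```

If blocked_processes keeps listing kojid or mock processes, try a plain ls on each mount printed by hard_nfs_mounts to find the one that hangs.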

Beware: SELinux is enforcing, so check for denials.
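Denials are logged as AVC records in the audit log. The sketch below counts them, assuming the default auditd log location /var/log/audit/audit.log; count_denials is a hypothetical helper name.

```shell
# Count AVC denial records in an audit log.
# Usage: count_denials [LOGFILE]   (defaults to /var/log/audit/audit.log)
count_denials() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c 'avc: *denied' "${1:-/var/log/audit/audit.log}" || true
}
```

A non-zero count is worth a closer look with ausearch -m avc -ts recent, which shows the denied operation and the process involved.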