Part 2 of the series Yum Repositories for Corporate Deployments

Part 1 of this series is: Yum Repositories: Background, Terminology and basics.

The most basic setup of a corporate yum repository would be to build a “base” repository for building centralized installs of systems off of the netinstall CD/USB stick provided by your Linux distro. The next basic repository to set up would be an “updates” repository, synced back to the distro, for bandwidth management for scheduled updates. We’ll walk through these 2 setups in order.

Setting up the repository host server

The first thing to determine is *how* you’re going to host your repository.  HTTP is probably the most common, but FTP and NFS repositories can be built as well.  The additional work for an NFS repository is minimal, so we’ll cover that as well as HTTP, using apache 2.2 as the HTTP server.

First, install the NFS and HTTP server components along with yum-utils, which provides the reposync utility, and set the services to start at boot. (The hosting setup is exactly the same for an NFS or HTTP apt repository for Ubuntu, but that's a post for another time.)
sudo /usr/bin/yum install nfs-utils httpd yum-utils
sudo /sbin/chkconfig httpd on
sudo /sbin/chkconfig nfs on

Next set up the apache server to share the repository. This requires deciding *where* to host the repository, so we’ll use /srv/repo as our base. Create a file /etc/httpd/conf.d/repo.conf with these contents:

Alias /centos /srv/repo

<Directory /srv/repo>
    Options +Indexes
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

And now share out that directory via NFS and restart the servers:

echo '/srv/repo *(ro,async)' | sudo tee -a /etc/exports
sudo service nfs restart
sudo service httpd restart

The last suggestion I’d make is to create a subdirectory for each OS major version in the root of your repository hierarchy, which is simple with bash expansion:

mkdir -p /srv/repo/{5,6}

Setting up a “base” repository

This is as simple as copying the contents of the install DVDs or CDs into the repository. We’ll use the CentOS materials for these examples, but Red Hat Enterprise Linux and Scientific Linux work the same.
First, we need to create our repository structure. This walkthrough only demonstrates RHEL 6 derivatives, but 5.x works the same way. This is also the time to decide which repositories to build. I’ll create 3 in total: “base”, “updates” and “internal”. “base” and “updates” are the distribution-provided repositories, and “internal” is for our company’s internal software. The “base” repository actually lives in a folder called “os”, and each repository has a subfolder for each architecture it supports. Therefore, we need to create 3 directories in /srv/repo/6 (“os”, “updates”, and “internal”), each with 2 subdirectories for our supported architectures: “i386” and “x86_64”. This is also simple with bash expansion:

mkdir -p /srv/repo/6/{os,updates,internal}/{i386,x86_64}
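
Brace expansion is a bash feature, so it won't work in a script that runs under plain /bin/sh. As a rough sketch, the same tree can be built with nested loops. The function name make_repo_tree is my own, not something from a package:

```shell
#!/bin/sh
# make_repo_tree ROOT VERSION
# Creates ROOT/VERSION/<repo>/<arch> for the three repositories and two
# architectures used in this walkthrough, without bash brace expansion.
# (make_repo_tree is a hypothetical helper name.)
make_repo_tree() {
    root=$1
    vers=$2
    for repo in os updates internal; do
        for arch in i386 x86_64; do
            mkdir -p "$root/$vers/$repo/$arch" || return 1
        done
    done
}

# Example: make_repo_tree /srv/repo 6
```

Calling it once per major version gives the same layout as the mkdir one-liner.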

Now copy the files:

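sudo mkdir -p /mnt/cd1
sudo mount -t auto -o loop,ro CentOS-6.3-x86_64-bin-DVD1.iso /mnt/cd1/
# Copy "/mnt/cd1/." rather than "/mnt/cd1/*" so hidden files such as .treeinfo are included
cp -R /mnt/cd1/. /srv/repo/6/os/x86_64/
sudo umount /mnt/cd1
sudo mount -t auto -o loop,ro CentOS-6.3-x86_64-bin-DVD2.iso /mnt/cd1/
cp -R /mnt/cd1/Packages/* /srv/repo/6/os/x86_64/Packages/
sudo umount /mnt/cd1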

If you have SELinux enabled, and you should, you’ll need to ensure that Apache’s httpd daemon can read these files with the following:

chcon -Rv --type=httpd_sys_content_t /srv/repo

This will set the SELinux context of the directories and all subfolders to “httpd_sys_content_t”, which is the context httpd can read.
At this point, if your server is named “build” you can do an install from the netinstall CD, and point it to “http://build/centos/6/os/$basearch” as the install point, and all packages will be installed from that location. If you do this “base” install with the 6.4 sources, when 6.5 comes out, read the Changing “base” Warning below.
A point of note, for people who come here confused – I haven’t mentioned “reposync” or “createrepo” tools yet – those will be used much later, or not at all. For the “base” repository, all of the repository metadata is included in the DVD copy.
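
Since everything depends on the metadata having come across with the DVD copy, it's worth a quick sanity check before pointing an installer at the tree. Here's a minimal sketch, assuming the usual layout of a repodata/repomd.xml file next to a Packages directory. check_repo_tree is a name I made up:

```shell
#!/bin/sh
# check_repo_tree DIR
# Succeeds only if DIR contains repodata/repomd.xml and a Packages directory,
# which is what a tree copied from the install DVDs should have.
# (check_repo_tree is a hypothetical helper name.)
check_repo_tree() {
    dir=$1
    if [ ! -f "$dir/repodata/repomd.xml" ]; then
        echo "missing $dir/repodata/repomd.xml" >&2
        return 1
    fi
    if [ ! -d "$dir/Packages" ]; then
        echo "missing $dir/Packages" >&2
        return 1
    fi
    echo "$dir looks like a complete repository"
}

# Example: check_repo_tree /srv/repo/6/os/x86_64
```

This only checks that the files exist. It doesn't validate the metadata against the packages.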

Setting up the “updates” repository

The updates repository needs to be updated constantly, or it becomes meaningless. This is the default channel by which the distribution provides security and bug fixes for the software packages they ship. There are 2 means for synchronizing a remote repository: reposync and rsync. Since rsync is more widely used and easier for many people to understand, and because the benefits of reposync aren’t easy for me to find, we’ll use rsync.

The “updates” repository is large – the 64-bit x86_64 “updates” repository in our lab is currently 14GB, and the “os” repository is 5.6GB. Each additional architecture and version supported will have similar size requirements.

If you’ve already created the directory structure while setting up the “base” repository, skip this step. Otherwise, create the “os”, “updates”, and “internal” directories under /srv/repo/6, each with “i386” and “x86_64” subdirectories, as described in that section:

mkdir -p /srv/repo/6/{os,updates,internal}/{i386,x86_64}

Now to pull down the repository updates from a mirror. First choose a mirror from the list for your distribution: CentOS, Scientific Linux, or RHEL (ask your support team).
To make sure your “updates” repository doesn’t go out of sync, schedule a cron job by creating a file called “reposync” in /etc/cron.daily/ or /etc/cron.hourly/ (read the warning about hourly syncing under “Staging internally” before choosing hourly) and pasting the rsync commands into it. The file must be executable, or cron will skip it. Here’s ours:

#!/bin/sh
# Sync the "os" and "updates" trees from a public mirror.

mirror="mirror.steadfast.net/centos"
# Our lab's repository root; use /srv/repo if you followed the layout above.
repobase="/repo/yum"

for vers in 5 6; do
    # -a (archive) already implies -r and -t
    if ! rsync -a "rsync://${mirror}/${vers}/os/x86_64" "${repobase}/${vers}/os/"; then
        echo "ERROR getting os files from ${mirror} for ${vers} x64, quitting."
        exit 1
    fi
    if ! rsync -a "rsync://${mirror}/${vers}/updates/x86_64" "${repobase}/${vers}/updates/"; then
        echo "ERROR getting updates files from ${mirror} for ${vers} x64, quitting."
        exit 1
    fi

    # We only mirror the 32-bit trees for version 5.
    if [ "$vers" -eq 5 ]; then
        if ! rsync -a "rsync://${mirror}/${vers}/os/i386" "${repobase}/${vers}/os/"; then
            echo "ERROR getting os files from ${mirror} for ${vers} i386, quitting."
            exit 1
        fi
        if ! rsync -a "rsync://${mirror}/${vers}/updates/i386" "${repobase}/${vers}/updates/"; then
            echo "ERROR getting updates files from ${mirror} for ${vers} i386, quitting."
            exit 1
        fi
    fi
done

The first rsync took about 2 hours, but subsequent syncs are much faster, since rsync can skip existing files. You’ll notice 2 things: 1) that we have disabled syncing the 32-bit CentOS 6 updates; 2) that we also rsync the “os” repo – for more on this, read the Changing “base” Warning below. Our lab doesn’t have or test 32-bit CentOS 6, and we always test “latest” releases. Also, I set the particular mirror to be a variable, so that it would be easy to change if required.
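
The four rsync blocks in that script differ only in version, repository and architecture, so they could be folded into one function. This is just a sketch of how I might restructure it, not what runs in our lab. The names sync_tree and sync_all and the RSYNC override variable are my own inventions; setting RSYNC=echo prints the commands instead of running them, which is handy for a dry run:

```shell
#!/bin/sh
# Hypothetical refactoring of the sync script: one function per rsync call.
mirror="mirror.steadfast.net/centos"
repobase="/repo/yum"
# RSYNC is an assumed override hook; set RSYNC=echo to preview the commands.
RSYNC=${RSYNC:-rsync}

# sync_tree VERSION REPO ARCH
# Pulls one architecture tree of one repository from the mirror.
sync_tree() {
    vers=$1
    repo=$2
    arch=$3
    if ! $RSYNC -a "rsync://${mirror}/${vers}/${repo}/${arch}" "${repobase}/${vers}/${repo}/"; then
        echo "ERROR getting ${repo} files from ${mirror} for ${vers} ${arch}, quitting." >&2
        return 1
    fi
}

# sync_all
# Same matrix as the original script: x86_64 for 5 and 6, i386 for 5 only.
sync_all() {
    for v in 5 6; do
        sync_tree "$v" os x86_64 || return 1
        sync_tree "$v" updates x86_64 || return 1
        if [ "$v" -eq 5 ]; then
            sync_tree "$v" os i386 || return 1
            sync_tree "$v" updates i386 || return 1
        fi
    done
}

# In the cron file you would end with:  sync_all || exit 1
```

Adding another architecture or version then means touching one line in sync_all rather than copying another block.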

Setting clients to use your repositories

The easiest way to set a single client to use your newly-setup repo is to simply edit the /etc/yum.repos.d/*-Base.repo file by hand, but that’s slow. I put a fixed repo file in the root of the webserver (in /repo/yum/), named CentOS-Base.repo. In it, I’ve simply commented out the mirror line, and replaced it with a local baseurl:

[base]
name=CentOS-$releasever - Base
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
baseurl=http://build/centos/$releasever/os/$basearch
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-$releasever

#released updates
[updates]
name=CentOS-$releasever - Updates
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
baseurl=http://build/centos/$releasever/updates/$basearch
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-$releasever
enabled=1
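
If you maintain repo files for more than one build server, it may be easier to generate the stanzas than to edit them by hand. Here's a minimal sketch under my own assumptions: repo_stanza is a made-up function, the host name is passed in, and the /centos path matches the Apache Alias from earlier. Note the escaped \$ so that $releasever and $basearch reach the file literally for yum to expand:

```shell
#!/bin/sh
# repo_stanza ID TITLE HOST
# Prints one yum repository definition pointing at an internal server.
# (repo_stanza is a hypothetical helper; the /centos URL prefix is assumed
# to match the Alias configured on the repository host.)
repo_stanza() {
    id=$1
    title=$2
    host=$3
    # The "base" repository lives in the "os" directory on the server.
    case $id in
        base) dir=os ;;
        *)    dir=$id ;;
    esac
    cat <<EOF
[$id]
name=CentOS-\$releasever - $title
baseurl=http://$host/centos/\$releasever/$dir/\$basearch
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-\$releasever
enabled=1
EOF
}

# Example:
#   { repo_stanza base Base build; echo; repo_stanza updates Updates build; } > CentOS-Base.repo
```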


This baseurl doesn’t have to be HTTP. Since the repository is on a local server and you set up NFS, you can instead use autofs and a file:// URI:

baseurl=file:///net/build/srv/repo/$releasever/os/$basearch

I haven’t found information that says conclusively if $releasever is properly expanded in that line, but testing shows it works fine in our lab.

Now to test it out on your CentOS 6 client:

wget -O /etc/yum.repos.d/CentOS-Base.repo http://build/CentOS-Base.repo.6
yum clean all
yum upgrade

Now all your yum installs and searches will come from the local repo.

Changing the “base” Source


This is an issue we ran into during testing of this post – you’ll notice all of the rsync scripts pull from “centos/6” not “centos/6.4”, and that we rsync both “updates” and “os” repos nightly.

Which “base” repo you sync against matters. In testing, we initially had the “updates” repo syncing nightly, but not the “os” repo (which was originally staged from a 6.3 DVD). When CentOS 6.4 was released, our “updates” repo shrank considerably, and we started receiving “Error: Protected multilib versions:” on all updates. This was because “updates” holds updates applied against “base”, so the two must be in sync. Therefore, either set up all your systems to sync a specific point release such as “6.4” and have no problems, or sync both “base”/“os” and “updates” together, at the cost of more bandwidth and storage usage.

If you do keep “base” tracking the latest release, your previously-installed systems will continue to upgrade properly – they just will become CentOS 6.4 if they were previously CentOS 6.3. BUT, PXE booting will break. From my colleague:

If you use PXE boot installs, you need to pay attention to when your “base” repo updates so you can copy the appropriate vmlinuz and initrd images into the pxe directories.
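
We don't PXE boot in our lab, so treat the following as an untested sketch of that copy step. It assumes the installer tree carries its boot images under images/pxeboot (vmlinuz and initrd.img), as the CentOS install media does. The function name and the destination directory in the example are hypothetical:

```shell
#!/bin/sh
# refresh_pxe_images TREE DEST
# Copies the installer kernel and initrd from an "os" tree into a PXE directory.
# (refresh_pxe_images is a hypothetical helper name.)
refresh_pxe_images() {
    tree=$1
    dest=$2
    for f in vmlinuz initrd.img; do
        if [ ! -f "$tree/images/pxeboot/$f" ]; then
            echo "missing $tree/images/pxeboot/$f" >&2
            return 1
        fi
    done
    mkdir -p "$dest" || return 1
    # -p keeps timestamps, so it is easy to see which release the images came from
    cp -p "$tree/images/pxeboot/vmlinuz" "$tree/images/pxeboot/initrd.img" "$dest/"
}

# Example (the TFTP path is an assumption about your PXE setup):
#   refresh_pxe_images /srv/repo/6/os/x86_64 /var/lib/tftpboot/centos6-x86_64
```

Running something like this right after the “os” sync would keep the boot images and the packages on the same point release.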

Staging internally

And a warning about hourly syncing.
Most public mirrors will probably blacklist you if you sync hourly. Don’t do that. However, you can set up multiple internal servers, where SERVERA pulls from a public mirror daily, and SERVERB pulls from SERVERA, or even from an intermediary. For example, SERVERA may be the update server your dev lab uses. When patches are verified in dev, you can push updates from SERVERA to a staging location that SERVERB (and SERVERC, etc.) would rsync from, so that production can be “released” to install those patches.

Again: Don’t do hourly syncs to public mirrors.
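
The “push to staging” step on SERVERA could be as small as one rsync between two local directories. This is a sketch only: promote_updates, the RSYNC override variable and the example paths are all my own assumptions. I've used --delete so the staging copy mirrors the verified tree exactly, which you may or may not want:

```shell
#!/bin/sh
# RSYNC is an assumed override hook; set RSYNC=echo to preview the command.
RSYNC=${RSYNC:-rsync}

# promote_updates SRC DST
# Mirrors a verified repository tree into the staging location that the
# production servers sync from. Trailing slashes are normalized so rsync
# copies the contents of SRC into DST rather than nesting SRC inside DST.
# (promote_updates is a hypothetical helper name.)
promote_updates() {
    src=$1
    dst=$2
    $RSYNC -a --delete "${src%/}/" "${dst%/}/"
}

# Example (hypothetical paths):
#   promote_updates /srv/repo/6/updates /srv/staging/6/updates
```

SERVERB and friends can then sync from the staging path as often as they like, since that traffic never leaves your network.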

Series Links

Return to the series header Yum Repositories for Corporate Deployments.

Continue to Part 3 (TBD)

Part 1 of the series Yum Repositories for Corporate Deployments.

Before we talk about how to set up a new yum repository, we need to discuss why, so that we can set up the right kind of repository. Before we can discuss why to set up a new repository, we need to know what exactly can be set up.

Terminology

Repository

A yum repository is a collection of rpm packages described by a set of metadata databases. These databases are indexed by an XML file called “repomd.xml”, and all of this metadata is created automatically by a tool appropriately named “createrepo.”

yum

yum is a client (meaning it reaches out to another server) tool to discover software availability from repositories and, if instructed, install that software onto the rpm-compatible GNU/Linux computer. yum uses the rpm database and format to determine dependencies, then handles those dependencies automatically.  For example, if a piece of software requires “python” and python is not installed, yum will find the python package and install it at its latest version, from any of the repositories it has been told of from its repo files.

repo

A repo file is a description to the yum client about where to find a repository, and what to do with the data found therein.  A repo file can describe multiple repositories, that are generally related.  Each repository definition starts with the repository name, and will include a path to the repository (the exported directory on the server which contains the “repodata” metadata directory that describes the repository), the user-friendly name of the repository, and some information about security (gpg signing and protection status).
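
Because each definition starts with the repository name in square brackets, it's easy to see which repositories a given repo file describes. A small sketch (list_repo_ids is a name I made up):

```shell
#!/bin/sh
# list_repo_ids FILE
# Prints the name of every repository defined in a repo file, one per line,
# by picking out the [name] header lines.
# (list_repo_ids is a hypothetical helper name.)
list_repo_ids() {
    sed -n 's/^\[\(.*\)\][[:space:]]*$/\1/p' "$1"
}

# Example: list_repo_ids /etc/yum.repos.d/CentOS-Base.repo
```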

Background – Base Repositories

A default Red Hat Enterprise Linux (RHEL, even though Red Hat doesn’t like it) or CentOS or Scientific Linux installation ships with one or several repo files pointing to several different repositories.  Understanding these base repositories helps us decide what we want to set up inside our corporate network.  There are at least 2 default repositories in that configuration: “base” and “updates”.  CentOS and Scientific Linux also ship “extras” and “plus”, and CentOS ships “contrib”.

base

The “base” repository is the installation copy.  If you’re building a “base” repository in your corporate environment, you simply copy the entire DVD image to the repository location and leave it.  “base” is used to build an initial system to a known state.

updates

“updates” is the repository where updated rpm packages for security and bug fixes are published.  If you’re building an “updates” repository in your corporate network, you need to sync it from a mirror that is itself synced to the original {RHEL,CentOS,Scientific} repository.  If you use the correct rsync command, you *only* need to rsync the repository (more in part 2).

extras, plus, and contrib

The best explanations of these repositories for CentOS come from the CentOS team themselves here: http://wiki.centos.org/AdditionalResources/Repositories

Repository basics

At this point, the repository names are just names.  Only “base” and “updates” have any real meaning – “base” is for the CD install, and “updates” is for security/other updates to “base”.  Any other repository is named so that the client system administrator can understand what might be installed from that repository.

The CentOS Team has some great information, which I’m lifting from http://wiki.centos.org/AdditionalResources/Repositories :

  • Use of hard-coded version and architecture: ‘baseurl=http: //ftp.belnet.be/packages/dries.ulyssis.org/redhat/el4/en/i386/dries/RPMS’ This hard codes both for ‘$releasever’ and ‘$basearch’. Compare this, to the more proper: ‘baseurl=http: //apt.sw.be/redhat/el$releasever/en/$basearch/dag’. The ‘hard coded’ approach limits it to only be ‘correct’ for CentOS 4 on an i386 platform.
  • Mixing Fedora repositories with CentOS oriented repositories: Look for ‘name=Fedora’, vs. ‘name=CentOS.(whatever)’. Fedora repositories are not likely to be compatible with CentOS. Repositories for other Enterprise Linux distros derived from the same upstream sources are more likely to be compatible, but should still be used with care.
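
To make the first point concrete, here is a rough imitation of the substitution yum performs on a baseurl. yum does this itself from the installed release and the machine architecture; the function below (expand_baseurl, my own name) only mimics it with sed so you can see what a client would request:

```shell
#!/bin/sh
# expand_baseurl URL RELEASEVER BASEARCH
# Replaces the literal strings $releasever and $basearch in URL with the
# given values. This only imitates yum's substitution for illustration.
# (expand_baseurl is a hypothetical helper name.)
expand_baseurl() {
    printf '%s\n' "$1" |
        sed -e 's|[$]releasever|'"$2"'|g' -e 's|[$]basearch|'"$3"'|g'
}

# One definition serves every client:
#   expand_baseurl 'http://build/centos/$releasever/os/$basearch' 6 x86_64
#   expand_baseurl 'http://build/centos/$releasever/os/$basearch' 5 i386
```

A hard-coded URL passes through unchanged, so it stays correct for exactly one release and one architecture.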

The same rules will apply to building a corporate yum repository.  For building a corporate repository, I would also add: It’s easier to create more repositories, rather than trying to merge existing ones.  If you are syncing parts of repositories, they should be separate repositories, synced as a single unit.  Yum’s configuration is able to handle many repo files with many repository definitions, as long as those repositories don’t install the same files with conflicting versions.

Series

Continue to Part 2: Setting up a corporate yum repository mirror for bandwidth and staged update management

Return to the series header: Yum Repositories for Corporate Deployments

I was tasked with a documentation project regarding use of yum repositories inside a customer’s environment and realized that I was having a hard time finding exactly the information I needed to build the full test environment required to test the documentation.  What will follow is a series of posts covering information I had to research as background.

Topics

  1. Background, Terminology and repository basics
  2. Setting up a corporate yum repository mirror for bandwidth and staged update management
  3. Setting up a private repository for additional software

Notes

As a quick note up front, based on what I’ve learned, I can’t find a good reason to include the software from a private repository in the update management repository – just build a new repository.

PXE Booting – there are instructions here which will break a PXE Boot environment. Our lab doesn’t PXE or Net boot, so we haven’t tested against them, but a colleague assures me it won’t work.

Bibliography

The following posts were used to build this series. None of them was 100% what I needed, and no source had everything I needed in a single location, but they were all useful, and in no order at all:
http://www.howtoforge.com/creating_a_local_yum_repository_centos
http://wiki.centos.org/HowTos/CreateLocalMirror
http://kenfallon.com/how-to-mirror-rhn-behind-your-firewall/
https://docs.fedoraproject.org/en-US/Fedora/14/html/Software_Management_Guide/ch08s04.html
http://www.outsidaz.org/blog/2012/02/19/using-reposync-to-provide-local-yumrpm-mirrors/