Part 1 of the series Yum Repositories for Corporate Deployments.

Before we talk about how to set up a new yum repository, we need to discuss why, so that we can set up the right kind of repository. Before we can discuss why to set up a new repository, we need to know what exactly can be set up.



A yum repository is a collection of rpm packages described by a set of metadata databases. These databases are indexed in an XML file called “repomd.xml”, and all of this metadata is generated automatically by the appropriately named “createrepo” package.
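To make the role of “repomd.xml” concrete, here is a minimal Python sketch that lists the metadata databases an index points to. The sample document is trimmed and illustrative (real files also carry checksums, timestamps, and sizes), and the file names in it are assumptions rather than what createrepo would actually generate.

```python
import xml.etree.ElementTree as ET

# Illustrative, trimmed repomd.xml. Real files also include checksums,
# timestamps and sizes, and the hrefs are usually prefixed with a hash.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/filelists.xml.gz"/>
  </data>
  <data type="other">
    <location href="repodata/other.xml.gz"/>
  </data>
</repomd>
"""

# XML namespace used by the repomd.xml format.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def list_metadata(repomd_text):
    """Return {metadata type: relative path} for each <data> entry."""
    root = ET.fromstring(repomd_text)
    result = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        result[data.get("type")] = location.get("href")
    return result


if __name__ == "__main__":
    for kind, path in list_metadata(SAMPLE_REPOMD).items():
        print(kind, "->", path)
```

Each path is relative to the top of the repository, which is why a client only needs to know one location to find everything else.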


yum is a client tool (meaning it reaches out to a server) that discovers what software is available from repositories and, if instructed, installs that software onto an rpm-compatible GNU/Linux computer. yum uses the rpm database and package format to determine dependencies, then handles those dependencies automatically.  For example, if a piece of software requires “python” and python is not installed, yum will find the python package and install its latest version from any of the repositories listed in its repo files.
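The python example can be sketched as a toy resolver in Python. This is not how yum is implemented: real dependency handling works on rpm's provides/requires data and rpm's own version comparison, while this sketch assumes versions are plain integer tuples. The repository contents and the “mytool” package are hypothetical.

```python
# Hypothetical repositories: repo id -> package name -> [(version, requires)].
REPOS = {
    "base": {
        "python": [((2, 4, 3), [])],
        "mytool": [((1, 0), ["python"])],
    },
    "updates": {
        "python": [((2, 4, 5), [])],
    },
}


def latest(name, repos):
    """Pick the highest version of `name` across all repositories."""
    candidates = [
        (version, repo_id, requires)
        for repo_id, packages in repos.items()
        for version, requires in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"no repository provides {name}")
    # Compare on the version tuple only (a simplification of rpm's rules).
    return max(candidates, key=lambda candidate: candidate[0])


def resolve(name, repos, installed):
    """Return the install order for `name` plus any missing dependencies."""
    plan = []
    seen = set(installed)  # packages that need no further work

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        version, repo_id, requires = latest(pkg, repos)
        for dep in requires:  # dependencies are installed first
            visit(dep)
        plan.append((pkg, version, repo_id))

    visit(name)
    return plan


if __name__ == "__main__":
    for pkg, version, repo_id in resolve("mytool", REPOS, installed=set()):
        print(pkg, ".".join(map(str, version)), "from", repo_id)
```

With nothing installed, the plan pulls python from “updates” (the newer version) before installing mytool from “base”. If python is already installed, only mytool is planned.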


A repo file tells the yum client where to find a repository and what to do with the data found there.  A repo file can describe multiple repositories, which are generally related.  Each repository definition starts with the repository name and includes a path to the repository (the exported directory on the server that contains the “repodata” metadata directory describing the repository), the user-friendly name of the repository, and some security information (gpg signing and protection status).
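A repo file is an INI-style text file, so Python's configparser can read one. The sketch below embeds a sample repo file with two related repositories and pulls out the fields described above. The repository ids, host name, and key path are made up for illustration. “baseurl”, “gpgcheck”, and “gpgkey” are standard yum options.

```python
import configparser

# Hypothetical repo file: the ids, host and key path are placeholders.
SAMPLE_REPO_FILE = """
[corp-base]
name=Corporate mirror - Base
baseurl=http://yum.example.com/centos/$releasever/os/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example

[corp-updates]
name=Corporate mirror - Updates
baseurl=http://yum.example.com/centos/$releasever/updates/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example
"""


def read_repos(text):
    """Return {repo id: settings} for every repository in a repo file."""
    # interpolation=None keeps configparser from treating '%' specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        repo_id: {
            "name": parser.get(repo_id, "name"),
            "baseurl": parser.get(repo_id, "baseurl"),
            "gpgcheck": parser.getboolean(repo_id, "gpgcheck"),
        }
        for repo_id in parser.sections()
    }


if __name__ == "__main__":
    for repo_id, settings in read_repos(SAMPLE_REPO_FILE).items():
        print(repo_id, settings)
```

The name in square brackets is what yum uses to refer to the repository. The “name” line is only the user-friendly label.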

Background – Base Repositories

A default Red Hat Enterprise Linux (RHEL, even though Red Hat doesn’t like the abbreviation), CentOS, or Scientific Linux installation ships with one or more repo files pointing to several different repositories.  Understanding these base repositories helps us decide what we want to set up inside our corporate network.  That configuration contains at least two default repositories: “base” and “updates”.  CentOS and Scientific Linux also ship “extras” and “plus”, and CentOS ships “contrib”.


The “base” repository is the installation copy.  If you’re building a “base” repository in your corporate environment, you simply copy the entire DVD image to the repository location and leave it.  “base” is used to build an initial system to a known state.


“updates” is the repository where updated rpm packages (security fixes and so on) are published.  If you’re building an “updates” repository in your corporate network, you need to sync it from another repository that is itself synced to the original {RHEL,CentOS,Scientific} repository.  If you use the correct rsync command, you *only* need to rsync the repository (more in part 2).

extras, plus, and contrib

The best explanations of these repositories for CentOS come from the CentOS team themselves here:

Repository basics

At this point, the repository names are just names.  Only “base” and “updates” have any real meaning – “base” is for the CD install, and “updates” is for security/other updates to “base”.  Any other repository is named so that the client system administrator can understand what might be installed from that repository.

The CentOS Team has some great information, which I’m lifting from :

  • Use of hard-coded version and architecture: ‘baseurl=http: //’ This hard codes both for ‘$releasever’ and ‘$basearch’. Compare this, to the more proper: ‘baseurl=http: //$releasever/en/$basearch/dag’. The ‘hard coded’ approach limits it to only be ‘correct’ for CentOS 4 on an i386 platform.
  • Mixing Fedora repositories with CentOS oriented repositories: Look for ‘name=Fedora’, vs. ‘name=CentOS.(whatever)’. Fedora repositories are not likely to be compatible with CentOS. Repositories for other Enterprise Linux distros derived from the same upstream sources are more likely to be compatible, but should still be used with care.
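The first point is easier to see with the substitution written out. The Python sketch below imitates how a “$releasever” and “$basearch” template expands differently per client, using string.Template as a stand-in for yum's own variable handling. The host “repo.example.com” is a placeholder, not the URL from the CentOS text.

```python
from string import Template

# Placeholder URL; only the $releasever/$basearch layout follows the quoted text.
TEMPLATE_URL = "http://repo.example.com/$releasever/en/$basearch/dag"


def expand_baseurl(baseurl, releasever, basearch):
    """Fill in $releasever and $basearch, leaving any other text untouched."""
    return Template(baseurl).safe_substitute(
        releasever=releasever, basearch=basearch
    )


if __name__ == "__main__":
    # One templated line serves every release and architecture.
    print(expand_baseurl(TEMPLATE_URL, "4", "i386"))
    print(expand_baseurl(TEMPLATE_URL, "5", "x86_64"))
```

A hard-coded URL passes through unchanged, so every client is sent to the same release and architecture regardless of what it actually runs.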

The same rules apply to building a corporate yum repository.  For a corporate repository, I would also add: it’s easier to create more repositories than to merge existing ones.  If you are syncing parts of repositories, each part should be its own repository, synced as a single unit.  Yum’s configuration can handle many repo files with many repository definitions, as long as those repositories don’t install the same files with conflicting versions.
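One rough way to audit that last condition before enabling two independently maintained repositories together is to look for packages that both offer at different versions. The Python sketch below assumes you already have a name-to-version listing for each repository. The repository ids and package data are hypothetical, and it compares package names only, not individual files. It is not meant for “base” versus “updates”, where a newer version in “updates” is the whole point.

```python
def find_overlaps(repo_contents):
    """Return packages offered by several repositories at differing versions.

    repo_contents maps repo id -> {package name: version string}.
    """
    offered = {}
    for repo_id, packages in repo_contents.items():
        for name, version in packages.items():
            offered.setdefault(name, {})[repo_id] = version
    return {
        name: versions
        for name, versions in offered.items()
        if len(set(versions.values())) > 1
    }


# Hypothetical listings for two separately maintained corporate repositories.
EXAMPLE_CONTENTS = {
    "corp-tools": {"libfoo": "1.2-1", "deploy-agent": "3.0-2"},
    "corp-thirdparty": {"libfoo": "1.4-1", "monitoring": "0.9-5"},
}

if __name__ == "__main__":
    for name, versions in find_overlaps(EXAMPLE_CONTENTS).items():
        print(name, versions)
```

Anything this reports is a candidate for moving into a single repository, so that one team owns which version clients receive.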


Continue to Part 2: Setting up a corporate yum repository mirror for bandwidth and staged update management

Return to the series header: Yum Repositories for Corporate Deployments