Docker Registry Mirrors

As part of SUSE Hackweek 17 I decided to work on a fully fledged docker registry mirror.

You might wonder why this is needed; after all, it’s already possible to run a docker distribution (aka registry) instance as a pull-through cache. While that’s true, this solution doesn’t address the needs of more “sophisticated” users.

The problem

Based on the feedback we got from a lot of SUSE customers, it’s clear that a simple registry configured to act as a pull-through cache isn’t enough.

Let’s go step by step to understand the requirements we have.

On-premise cache of container images

First of all it should be possible to have a mirror of certain container images locally. This is useful to save time and bandwidth. For example there’s no reason to download the same image over and over on each node of a Kubernetes cluster.

A docker registry configured to act as a pull-through cache can help with that. There’s still a need to warm the cache; this can be left to the organic pull of images done by the cluster or could be done artificially by some script run by an operator.

Unfortunately a pull-through cache is not going to solve this problem for nodes running inside of an air-gapped environment. Nodes operated in such an environment are located in a completely segregated network, which makes it impossible for the pull-through registry to reach the external registry.

Retain control over the contents of the mirror

Cluster operators want to have control of the images available inside of the local mirror.

For example, assuming we are mirroring the Docker Hub, an operator might be fine with having the library/mariadb image but not the library/redis one.

When operating a registry configured as a pull-through cache, all the images of the upstream registry are within reach of all the users of the cluster. It’s just a matter of doing a simple docker pull to get the image cached into the local pull-through cache and sneak it into all the nodes.

Moreover some operators want to grant the privilege of adding images to the local mirror only to trusted users.

There’s life outside of the Docker Hub

The Docker Hub is certainly the best-known container registry. However there are also other registries being used: SUSE operates its own registry, there’s Quay.io, Google Container Registry (aka gcr) and there are even user operated ones.

A docker registry configured to act as a pull-through cache can mirror only one registry. This means that, if you are interested in mirroring both the Docker Hub and Quay.io, you will have to run two instances of docker registry pull-through caches: one for the Docker Hub, the other for Quay.io.

This is just overhead for the final operator.

A better solution

During the last week I worked to build a PoC to demonstrate we can create a docker registry mirror solution that can satisfy all the requirements above.

I wanted to have a single box running the entire solution and I wanted all the different pieces of it to be containerized. I hence resorted to using a node powered by openSUSE Kubic.

I didn’t need all the different pieces of Kubernetes, I just needed kubelet so that I could run it in disconnected mode. Disconnected means the kubelet process is not connected to a Kubernetes API server; instead it reads POD manifest files straight from a local directory.

The all-in-one box

I created an openSUSE Kubic node and then I started by deploying a standard docker registry. This instance is not configured to act as a pull-through cache. However it is configured to use an external authorization service. This is needed to allow the operator to have full control of who can push/pull/delete images.

I configured the registry POD to store the registry data to a directory on the machine by using a Kubernetes hostPath volume.
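A minimal sketch of what a manifest for such a POD could look like, written as a small Python script that emits JSON (kubelet reads JSON manifests as well as YAML ones). The POD name, the `registry:2` image tag, the host directory and the output location are assumptions made for illustration; they are not the values used in the setup described here, and the authorization settings are left out.

```python
import json
import tempfile
from pathlib import Path


def registry_pod_manifest(host_dir: str) -> dict:
    """Build a POD manifest for a registry that stores its data on the host."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "registry"},  # hypothetical POD name
        "spec": {
            "containers": [
                {
                    "name": "registry",
                    "image": "registry:2",  # assumed image tag
                    "volumeMounts": [
                        # /var/lib/registry is the default storage path of that image
                        {"name": "registry-data", "mountPath": "/var/lib/registry"}
                    ],
                }
            ],
            "volumes": [
                # hostPath keeps the registry data on the node's filesystem
                {"name": "registry-data", "hostPath": {"path": host_dir}}
            ],
        },
    }


def write_manifest(manifest_dir: Path, manifest: dict) -> Path:
    """Write the manifest into the directory watched by kubelet."""
    target = manifest_dir / f"{manifest['metadata']['name']}.json"
    target.write_text(json.dumps(manifest, indent=2))
    return target


if __name__ == "__main__":
    # A temporary directory stands in for kubelet's manifest directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_manifest(Path(tmp), registry_pod_manifest("/srv/registry-data"))
        print(path.read_text())
```

On a real node the file would go into whatever directory kubelet has been told to watch for manifests.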

On the very same node I deployed the authorization service needed by the docker registry. I chose Portus, an open source solution created at SUSE a long time ago.

Portus needs a database, hence I deployed a containerized instance of MariaDB on the same node. Again I used a Kubernetes hostPath to ensure the persistence of the database contents. I placed both Portus and its MariaDB instance into the same POD. I configured MariaDB to listen only to localhost, making it reachable only by the Portus instance (that’s because they are in the same Kubernetes POD).

I configured both the registry and Portus to bind to a local unix socket, then I deployed a container running HAProxy to expose both of them to the world.

The HAProxy is the only container that uses the host network, meaning it’s actually listening on port 80 and port 443 of the openSUSE Kubic node.

I went ahead and created two new DNS entries inside of my local network:

  • registry.kube.lan: this is the FQDN of the registry
  • portus.kube.lan: this is the FQDN of portus

I configured both names to resolve to the IP address of my container host.

I then used cfssl to generate a CA and then a pair of certificates and keys for registry.kube.lan and portus.kube.lan.

Finally I configured HAProxy to:

  • Listen on port 80 and 443.
  • Automatically redirect traffic from port 80 to port 443.
  • Perform TLS termination for registry and Portus.
  • Load balance requests against the right unix socket using the Server Name Indication (SNI).

By having dedicated FQDNs for the registry and Portus and by using HAProxy’s SNI based load balancing, we can leave the registry listening on a standard port (443) instead of using a different one (eg: 5000). In my opinion that’s a big win; based on my personal experience, having the registry listen on a non standard port makes things more confusing both for the operators and the end users.
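The routing decision itself is simple, and the sketch below models it in Python only to make the idea concrete; it is not HAProxy configuration. The socket paths are hypothetical, since the actual locations of the sockets are not given here.

```python
from typing import Optional

# Hypothetical socket paths, chosen only for this illustration.
BACKENDS = {
    "registry.kube.lan": "/run/registry/registry.sock",
    "portus.kube.lan": "/run/portus/portus.sock",
}


def backend_for(sni: Optional[str]) -> Optional[str]:
    """Return the unix socket serving the host name sent via SNI, if any."""
    if not sni:
        return None  # the client sent no SNI, so there is nothing to route on
    # Host names are case-insensitive and may carry a trailing dot.
    return BACKENDS.get(sni.lower().rstrip("."))


if __name__ == "__main__":
    for name in ("registry.kube.lan", "portus.kube.lan", "example.com"):
        print(name, "->", backend_for(name))
```

Both services share port 443; only the requested host name decides which socket receives the traffic.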

Once I was done with these steps I was able to log into https://portus.kube.lan and perform the usual setup wizard of Portus.

Mirroring images

We now have to mirror images from multiple registries into the local one, but how can we do that?

Some time ago I stumbled over this tool, which can be used to copy images from multiple registries into a single one. While doing that it can change the namespace of the image to put all the images coming from a certain registry into a specific namespace.

I wanted to use this tool, but I realized it relies on the docker open-source engine to perform the pull and push operations. That’s a blocking issue for me because I wanted to run the mirroring tool in a container without doing nasty tricks like mounting the docker socket of the host into a container.

Basically I wanted the mirroring tool to not rely on the docker open source engine.

At SUSE we are already using and contributing to skopeo, an amazing tool that allows interactions with container images and container registries without requiring any docker daemon.

The solution was clear: extend skopeo to provide mirroring capabilities.

I drafted a design proposal with my colleague Marco Vedovati, started coding and then ended up with this pull request.

While working on that I also uncovered a small glitch inside of the containers/image library used by skopeo.

Using a patched skopeo binary (which includes both the patches above) I then mirrored a bunch of images into my local registry:

The first command mirrored only the busybox:musl container image from the Docker Hub to my local registry, while the second command mirrored all the coreos/etcd images from the quay.io registry to my local registry.

Since the local registry is protected by Portus I had to specify my credentials while performing the sync operation.

Running multiple sync commands is not really practical, that’s why we added a source-file flag. That allows an operator to write a configuration file indicating the images to mirror. More on that in a dedicated blog post.

At this point my local registry had the following images:

  • docker.io/busybox:musl
  • quay.io/coreos/etcd:v3.1
  • quay.io/coreos/etcd:latest
  • quay.io/coreos/etcd:v3.3
  • … more quay.io/coreos/etcd images …

As you can see the namespace of the mirrored images is changed to include the FQDN of the registry from which they have been downloaded. This avoids clashes between the images and makes it easier to track their origin.
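The naming convention can be sketched in a few lines of Python. This is only an illustration of the mapping, not code taken from skopeo; the function names are made up, and the rule used to spot a registry host in a reference (first component contains a dot or a colon, or is `localhost`) mirrors the one the docker engine applies.

```python
from typing import Tuple


def split_reference(reference: str, default_registry: str = "docker.io") -> Tuple[str, str]:
    """Split an image reference into (registry, rest of the reference)."""
    first, sep, rest = reference.partition("/")
    # The first component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No registry given: the image comes from the default registry.
    return default_registry, reference


def mirrored_reference(reference: str, mirror: str = "registry.kube.lan") -> str:
    """Return the name the image gets inside the local mirror."""
    registry, remainder = split_reference(reference)
    return f"{mirror}/{registry}/{remainder}"


if __name__ == "__main__":
    for ref in ("busybox:musl", "quay.io/coreos/etcd:v3.1"):
        print(ref, "->", mirrored_reference(ref))
```

Because the upstream FQDN becomes the first path component, two images with the same name coming from different registries can never collide.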

Mirroring on air-gapped environments

As I mentioned above I wanted to provide a solution that could be used also to run mirrors inside of air-gapped environments.

The only tricky part for such a scenario is how to get the images from the upstream registries into the local one.

This can be done in two steps by using the skopeo sync command.

We start by downloading the images on a machine that is connected to the internet. But instead of storing the images into a local registry we put them in a local directory:

This is going to copy all the versions of the quay.io/coreos/etcd image into a local directory /media/usb-disk/mirrored-images.

Let’s assume /media/usb-disk is the mount point of an external USB drive. We can then unmount the USB drive, scan its contents with some tool, and plug it into a computer on the air-gapped network. From this computer we can populate the local registry mirror by using the following command:

This will automatically import all the images that have been previously downloaded to the external USB drive.
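As a rough sketch of the two steps, the Python helpers below only assemble the command lines; they do not run anything. The flag names (`--src`, `--dest`, `--dest-creds`) follow the `skopeo sync` command as it was later released upstream, so the patched binary used for this PoC may have accepted a different syntax. The credentials are a placeholder.

```python
import shlex
from typing import List


def sync_to_directory(repository: str, directory: str) -> List[str]:
    """Step 1, connected machine: copy all tags of a repository to a directory."""
    return ["skopeo", "sync", "--src", "docker", "--dest", "dir",
            repository, directory]


def sync_to_registry(directory: str, registry: str, creds: str) -> List[str]:
    """Step 2, air-gapped machine: push the directory contents to the mirror."""
    return ["skopeo", "sync", "--src", "dir", "--dest", "docker",
            "--dest-creds", creds, directory, registry]


if __name__ == "__main__":
    usb = "/media/usb-disk/mirrored-images"
    steps = (
        sync_to_directory("quay.io/coreos/etcd", usb),
        # "user:password" is a placeholder for the Portus credentials
        sync_to_registry(usb, "registry.kube.lan", "user:password"),
    )
    for argv in steps:
        print(shlex.join(argv))
```

Each list can be handed to `subprocess.run` on the appropriate machine once the flags have been checked against the installed skopeo version.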

Pulling the images

Now that we have all our images mirrored it’s time to start consuming them.

It might be tempting to just update all our Dockerfile(s), Kubernetes manifests, Helm charts, automation scripts, … to reference the images from registry.kube.lan/<upstream registry FQDN>/<image>:<tag>. This however would be tedious and impractical.

As you might know the docker open source engine has a --registry-mirror option. Unfortunately the docker open source engine can only be configured to mirror the Docker Hub; other external registries are not handled.

This annoying limitation led me and Valentin Rothberg to create this pull request against the Moby project.

Valentin is also porting the patch to libpod, which will make the same feature available inside of CRI-O and podman.

During my experiments I figured some little bits were missing from the original PR.

I built a docker engine with the full patch applied and I created this /etc/docker/daemon.json configuration file:

Then, on this node, I was able to issue commands like:

That resulted in the image being downloaded from registry.kube.lan/quay.io/coreos/etcd:v3.1; no communication was done against quay.io. Success!

What about unpatched docker engines/other container engines?

Everything is working fine on nodes that are running this not-yet merged patch, but what about vanilla versions of docker or other container engines?

I think I have a solution for them as well, I’m going to experiment a bit with that during the next week and then provide an update.

Show me the code!

This is a really long blog post. I’ll create a new one with all the configuration files and instructions of the steps I performed. Stay tuned!

In the meantime I would like to thank Marco Vedovati and Valentin Rothberg for their help with skopeo and the docker mirroring patch, plus Miquel Sabaté Solà for his help with Portus.