Mirroring the Docker Hub

Some recent work on maintaining the official Perl image for Docker led me to pull different copies and tags of buildpack-deps more than once, on different machines; as a result, I used a lot of network bandwidth on image pulls. Aside from this, I also run several VMs on libvirt/KVM, using Docker Machine to start up VMs for Docker Swarm and Minikube.

I wanted to cut down on my bandwidth usage, and it turns out there is a way to do this by building on some Docker features...

Using "remote" Docker daemon environment for the Docker client

One approach is to not use your local (bare-metal) Docker context at all, and use a Docker Machine's environment instead:


$ docker-machine create my-remote-machine
$ eval $(docker-machine env my-remote-machine)
$ docker info

Minikube (which builds on docker-machine) also encourages this approach:


$ minikube start
$ eval $(minikube docker-env)

Either setup will change your shell's environment to use a "remote" Docker server, which means docker pull will store the images in the remote's own storage. This is good in cases where you retain a single machine or minikube instance, but should you delete the instance, you'd have to pull everything again.
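
Since both commands work by exporting variables into the current shell, getting back to the local daemon means clearing those variables again. A minimal sketch, assuming the variable names below are the ones the two tools export (use_local_docker is a made-up name, not part of either tool):

```shell
# Hypothetical helper: point the docker client back at the local daemon
# by clearing the variables that the two "env" commands are assumed to export.
use_local_docker() {
    # Connection settings read by the docker client
    unset DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH
    # Bookkeeping variables set by docker-machine and minikube respectively
    unset DOCKER_MACHINE_NAME MINIKUBE_ACTIVE_DOCKERD
}
```

Both tools can also print the matching unset commands themselves, via docker-machine env -u and minikube docker-env -u, which is probably the safer route if your versions export anything else.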

Pipe images from local to remote via SSH

Another approach, especially for Minikube, is to use an SSH pipe to send images from local to remote, like this:


$ docker pull myimage:4.2
$ docker save myimage:4.2 | minikube ssh docker load

Again, this borrows from docker-machine:


$ docker save myimage:4.2 | docker-machine ssh my-remote-machine docker load

It is good enough for testing local private images in a swarm, but can get tedious real fast.
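
Some of the tedium can be taken out by wrapping the pipe in a shell function. This is only a sketch: send_images is a made-up name, and it assumes every image listed already exists in the local Docker storage.

```shell
# Hypothetical helper: copy local images to a docker-machine VM.
# Usage: send_images MACHINE IMAGE [IMAGE...]
send_images() {
    machine=$1
    shift
    for image in "$@"; do
        # Same save/load pipe as above, run once per image.
        # The pipeline's status is that of the remote "docker load",
        # so the loop stops at the first failed load.
        docker save "$image" | docker-machine ssh "$machine" docker load || return 1
    done
}
```

Calling send_images my-remote-machine myimage:4.2 otherimage:1.0 would then send both images in turn; the same loop should work for Minikube by swapping in minikube ssh.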

Running a pull-through Docker Registry

Fortunately, Docker's basic architecture already provides a way to reduce the tedium by means of the Docker Registry. One of the Registry's features is that it can run as a proxy (pull-through cache) of the Docker Index. One can thus start a persistent container like this:


$ docker run -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io -e REGISTRY_STORAGE_DELETE_ENABLED=true -d -p 50000:5000 --restart=always --name registry registry:2

This lets me run a Registry proxy on localhost:50000, which I can then point Docker daemons to via the registry-mirrors configuration, e.g. in /etc/docker/daemon.json:


{
  "registry-mirrors": ["http://127.0.0.1:50000"]
}
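
Because a single misplaced quote makes this file invalid JSON, it can help to generate it instead of typing it. A small sketch, where mirror_config is a hypothetical helper and the file is assumed to hold no other settings:

```shell
# Hypothetical helper: print a daemon.json body for a single mirror URL.
mirror_config() {
    printf '{\n  "registry-mirrors": ["%s"]\n}\n' "$1"
}

# Possible usage (overwrites any existing daemon.json, so merge by hand
# if the file already has other settings):
#   mirror_config http://127.0.0.1:50000 | sudo tee /etc/docker/daemon.json
# The daemon has to be restarted to pick up the change, for example with
# "sudo systemctl restart docker" on a systemd host.
```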

There are a few caveats with it though:

  • The proxy Registry treats an unqualified pull such as docker pull myimage as a request for docker pull myimage:latest, and will always fetch it; thus, it is preferable to use named tags on images so that the proxy Registry only updates its local image copy when there are updates from the remote Docker Index. For unqualified pulls, the SSH pipe trick would be preferable.
  • The proxy Registry only works with the Docker Index/Central Hub and not with private registries.
  • You can't push local images into the proxy Registry; instead you can push to Docker Hub or to another (private) registry, or just run another local Registry that accepts pushes, but for now I'm fine with this.
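
For the last caveat, a second, push-capable Registry could look roughly like the sketch below. The port 5001, the container name local-registry and both function names are arbitrary choices for illustration, not something I run myself.

```shell
# Assumed address of a second registry that accepts pushes (not the proxy).
LOCAL_REGISTRY=127.0.0.1:5001

# Start the push-capable registry once; it runs without the proxy setting.
start_local_registry() {
    docker run -d -p 5001:5000 --restart=always --name local-registry registry:2
}

# Tag a local image with the registry prefix, then push it there.
# Usage: push_local IMAGE:TAG
push_local() {
    docker tag "$1" "$LOCAL_REGISTRY/$1" && docker push "$LOCAL_REGISTRY/$1"
}
```

After start_local_registry, running push_local myimage:4.2 would make the image pullable as 127.0.0.1:5001/myimage:4.2 on the same host. VMs would need to reach the registry at an address they can see, and would likely have to be told to trust it as an insecure registry.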

Putting it all together

Making this work on docker-machine and minikube is a matter of passing in certain flags. In my case, the registry container also listens on my libvirt/KVM default network bridge (192.168.122.1), so I have to run


$ minikube start --registry-mirror=http://192.168.122.1:50000

for minikube. For docker-machine, it looks like this:


$ docker-machine create --engine-registry-mirror=http://192.168.122.1:50000 my-docker-machine