Published on 4/25/21 at 7:50 PM.
Or how to bypass Docker Hub pull rate limits. And let's not forget about Docker-in-Docker !
Photo by Ludovic Charlet on Unsplash.
Like in my first post, I'm dealing with a new restriction: this time, the Docker Hub pull rate limit:
I too thought it would be enough until my pipeline stopped during a production deployment.
You won't encounter this issue if you use GitLab's shared runners, which I don't, as I use my own runner.
The goal is to use my own Docker registry. For those who might not know what I'm talking about, a registry is where images are stored. There is Docker Hub, there is your local registry, and you can run your own if you want: there is an official registry image!
But I won't simply create my own registry, I want it to mirror Docker Hub by acting as a "pull through cache":
For simplicity, the registry will be installed on the same server as the runner. You can of course separate them.
To be able to react to future limits and easily change my configuration, I created a GitLab repository with a pipeline dedicated to the GitLab Runner (which is completely optional). The repository can also handle GitLab Runner updates: you can version the updated config.toml there.
Also, since the pipeline may restart the runner, it should not run on that runner but on the shared runners.
For that, go to Settings > CI/CD > Runners. In the Shared runners column, check "Enable shared runners for this project" and click "Disable group runners".
Note: performance-wise, the pipeline lasts about 30 seconds on average, nothing to worry about.
For the pipeline to be able to log into the server, we need to configure SSH keys.
To use the private key, we save it as a variable in Settings > CI/CD > Variables. I'll name it SSH_PRIVATE_KEY.
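If you don't have a key pair yet, you can generate one dedicated to the pipeline. A minimal sketch (the file name and comment are my own examples, not from the original setup):

```shell
# Generate a dedicated ed25519 key pair with no passphrase.
# The file names here are examples; pick whatever suits you.
ssh-keygen -t ed25519 -N "" -C "gitlab-deploy" -f ./gitlab_deploy_key

# The private key (./gitlab_deploy_key) goes into the SSH_PRIVATE_KEY variable;
# the public key (./gitlab_deploy_key.pub) goes into ~/.ssh/authorized_keys
# on the runner's host, e.g. with:
# ssh-copy-id -i ./gitlab_deploy_key.pub user@123.1.2.3
```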
Then we can start building the pipeline. Nothing special is needed, so I'm using an Alpine Linux image. Since it doesn't provide any SSH client, we have to install one:
image: alpine:3

variables:
  RUNNER_IP: "123.1.2.3"

stages:
  - deploy

deploy-runner:
  stage: deploy
  only:
    - master
  before_script:
    - apk add --no-cache openssh-client
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
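One gotcha: on a fresh container, ssh will prompt to verify the server's host key and the job will hang or fail. A hedged fix, assuming you store the host key (obtained once with `ssh-keyscan 123.1.2.3`) in a CI variable I'll call SSH_KNOWN_HOSTS (a name of my choosing, not from the original pipeline):

```
  before_script:
    # ... previous commands ...
    - echo "$SSH_KNOWN_HOSTS" >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
```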
We have a pipeline with a single job. Note that it only executes on the master branch and that the runner's address is saved in a variable for later.
The first step of the deploy-runner job is to upload the runner's configuration file. If the file has changed, we install it and restart the runner:
deploy-runner:
  ...
  script:
    - cat ./config.toml | ssh user@$RUNNER_IP "cat > ./config.toml"
    - ssh user@$RUNNER_IP "cmp -s ./config.toml /etc/gitlab-runner/config.toml || { mv ./config.toml /etc/gitlab-runner/config.toml && gitlab-runner restart; }"
Note: this step is optional for the registry setup; you may not need to change your runner configuration.
The various pieces of documentation I read mention that Docker Hub credentials are useful to download private images. However, I failed to make the registry work as expected without them, so I advise you to create some.
First, create an account on Docker Hub or just log in.
Once logged in, go into the security section of the account settings: Account Settings > Security.
You will be able to create an access token by clicking "New Access Token". Write a short description and keep the token for later.
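You can sanity-check the token from any machine with Docker installed before wiring it into the registry (USERNAME stands for your own Docker Hub username; this is just a suggested check, not part of the original setup):

```shell
# Feed the access token on stdin; a valid token prints "Login Succeeded".
echo "DOCKER_HUB_ACCESS_TOKEN" | docker login --username USERNAME --password-stdin
```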
The registry runs as a container. To keep its management simple, I chose to use Docker Compose, which I installed on the host.
And here's the docker-compose.yml:
version: "3.7"

services:
  registry:
    image: registry:2.7
    restart: always
    ports:
      - 5000:5000
    volumes:
      - "./registry-config.yml:/etc/docker/registry/config.yml"
We expose port 5000 and mount a volume for the registry configuration file.
The interesting part of the file is the proxy key, which we add after the default configuration (found in the base image):
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000
  headers:
    X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
proxy:
  remoteurl: https://registry-1.docker.io
  username: USERNAME
  password: DOCKER_HUB_ACCESS_TOKEN
The proxy.remoteurl key instructs the registry to act as a "pull through cache" mirroring Docker Hub. Replace the proxy.username value with your Docker Hub username, and proxy.password with the access token you just created.
In our GitLab CI job, we add a few commands to start the registry:
...

deploy-runner:
  ...
  script:
    ...
    - cat ./docker-compose.yml | ssh user@$RUNNER_IP "cat > ./docker-compose.yml"
    - cat ./registry-config.yml | ssh user@$RUNNER_IP "cat > ./registry-config.yml"
    - ssh user@$RUNNER_IP "docker-compose up -d"
The docker-compose up command starts or updates services, which is perfect for our needs.
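Once the service is up, you can check that the registry answers. A quick probe (assuming the port is reachable from where you run it):

```shell
# The Registry HTTP API v2 root returns an empty JSON object when healthy;
# -f makes curl fail on an HTTP error status.
curl -fsS http://123.1.2.3:5000/v2/
```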
We now need to instruct the Docker daemon to use our registry.
In a basic configuration using the shell executor, or the Docker executor with Docker socket binding, your target is the host daemon.
It is configurable with a JSON file in which we'll set the registry mirror to use. We'll then need to reload the Docker daemon, so it is recommended to enable live restore to avoid stopping running containers:
{
  "registry-mirrors": ["http://123.1.2.3:5000"],
  "live-restore": true
}
Note: Use the HTTP protocol if you haven't configured HTTPS, and don't forget the port number.
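After the daemon has picked up the new configuration, you can confirm the mirror is registered (a verification step I'm adding, not from the original post):

```shell
# Print the mirrors the daemon knows about; the registry URL should appear.
docker info --format '{{.RegistryConfig.Mirrors}}'
```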
In the GitLab CI job, we'll reload the Docker daemon if its configuration file has changed:
...

deploy-runner:
  ...
  script:
    ...
    - cat ./docker-daemon.json | ssh user@$RUNNER_IP "cat > ./docker-daemon.json"
    - ssh user@$RUNNER_IP "cmp -s ./docker-daemon.json /etc/docker/daemon.json || { mv ./docker-daemon.json /etc/docker/daemon.json && systemctl reload docker; }"
In my case, I use DinD to be able to run docker commands inside jobs. Since DinD is configured as a pipeline service, we just need to add an option for it to use the registry. It can be done in the runner's configuration file, or in the project's .gitlab-ci.yml:
services:
  - name: docker:20.10.6-dind
    command: ["--registry-mirror", "http://123.1.2.3:5000"]
    alias: docker
Be aware that here we're no longer modifying the runner's repository, but the file of a project whose pipelines run on that runner.
As the registry usage is transparent, you may wonder whether it actually works. Running docker image ls isn't sufficient, as it won't tell you which registry was used. Download images with a docker pull, or by running a pipeline if you use DinD.
A registry can expose the list of its images over an HTTP route, in JSON format:

curl http://123.1.2.3:5000/v2/_catalog
{"repositories":["library/alpine","library/docker"]}
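You can dig one level deeper and list which tags of a given image are cached (library/alpine here is just an example image name):

```shell
# Registry HTTP API v2: list the cached tags of an image.
curl http://123.1.2.3:5000/v2/library/alpine/tags/list
```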
If you find the images you used, it's all good!
Many things to do just to bypass a limitation, but it is also an opportunity to learn more about how Docker works. For now I can't tell whether the registry improves pipeline performance, but in any case it gives me control over another part of the CI.