gitlab Archives - Code Review Videos

How I Fixed: Error response from daemon: Get https://registry.example.com/v2/: unauthorized: HTTP Basic: Access denied

OK – silly problem time.

A while back I force reset the password of one of my automated CI users. For a variety of reasons, I never checked that this had worked properly.

When I went to log in via the command line today, I was getting this:

➜  docker login -u myuser registry.example.com
Password: 
Error response from daemon: Get https://registry.example.com/v2/: unauthorized: HTTP Basic: Access denied

Very confusing.

I hard reset the user’s password via the GitLab Admin Panel, but still the problem persisted.

Simple fix: log in as this user via the web GUI.

Once you do that, you should see the password change prompt. Change your password there, and et voila, you can now log in from the command line again.

It would be useful if the service offered a better message around this occurrence, but I’m guessing it’s a bit of a weird edge case. I’m actually not sure if the issue lies with GitLab or the Docker Registry image honestly.

Either way, hopefully that solves your problem.

How I Solved: Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running?

OK, tl;dr, this is not a true fix. However, it works. Or worked for me.

The issue I have been facing, the one that has cost me my entire Saturday morning, is this:

➜  gitlab-ci docker-compose up 
Creating network "gitlab-ci_default" with the default driver
Creating gitlab-ci_runner_1_28ccd2f6e08d          ... done
Creating gitlab-ci_register-runner_1_6ddb7e90a9d3 ... done
Creating gitlab-ci_dind_1_bb210df194a2            ... done
Attaching to gitlab-ci_runner_1_3cb60d519ae8, gitlab-ci_register-runner_1_941db09830b5, gitlab-ci_dind_1_a0ef0b8a29e4
runner_1_3cb60d519ae8 | Runtime platform                                    arch=amd64 os=linux pid=7 revision=61e7606f version=14.1.0~beta.182.g61e7606f
runner_1_3cb60d519ae8 | Starting multi-runner from /etc/gitlab-runner/config.toml...  builds=0
runner_1_3cb60d519ae8 | Running in system-mode.                            
runner_1_3cb60d519ae8 |                                                    
runner_1_3cb60d519ae8 | Configuration loaded                                builds=0
runner_1_3cb60d519ae8 | listen_address not defined, metrics & debug endpoints disabled  builds=0
runner_1_3cb60d519ae8 | [session_server].listen_address not defined, session endpoints disabled  builds=0
register-runner_1_941db09830b5 | Runtime platform                                    arch=amd64 os=linux pid=7 revision=61e7606f version=14.1.0~beta.182.g61e7606f
dind_1_a0ef0b8a29e4 | Generating RSA private key, 4096 bit long modulus (2 primes)
register-runner_1_941db09830b5 | Running in system-mode.                            
register-runner_1_941db09830b5 |                                                    
register-runner_1_941db09830b5 | Registering runner... succeeded                     runner=cjX3zQG_
register-runner_1_941db09830b5 | Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded! 
gitlab-ci_register-runner_1_941db09830b5 exited with code 0
dind_1_a0ef0b8a29e4 | .............................++++
dind_1_a0ef0b8a29e4 | ...........................................................................................................................................................++++
dind_1_a0ef0b8a29e4 | e is 65537 (0x010001)
dind_1_a0ef0b8a29e4 | Generating RSA private key, 4096 bit long modulus (2 primes)
runner_1_3cb60d519ae8 | Configuration loaded                                builds=0
dind_1_a0ef0b8a29e4 | .......++++
dind_1_a0ef0b8a29e4 | .............++++
dind_1_a0ef0b8a29e4 | e is 65537 (0x010001)
dind_1_a0ef0b8a29e4 | Signature ok
dind_1_a0ef0b8a29e4 | subject=CN = docker:dind server
dind_1_a0ef0b8a29e4 | Getting CA Private Key
dind_1_a0ef0b8a29e4 | /certs/server/cert.pem: OK
dind_1_a0ef0b8a29e4 | Generating RSA private key, 4096 bit long modulus (2 primes)
dind_1_a0ef0b8a29e4 | ................................................................++++
dind_1_a0ef0b8a29e4 | ...................................................................................................................++++
dind_1_a0ef0b8a29e4 | e is 65537 (0x010001)
dind_1_a0ef0b8a29e4 | Signature ok
dind_1_a0ef0b8a29e4 | subject=CN = docker:dind client
dind_1_a0ef0b8a29e4 | Getting CA Private Key
dind_1_a0ef0b8a29e4 | /certs/client/cert.pem: OK
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.193084271Z" level=info msg="Starting up"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.194256426Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.198038066Z" level=info msg="libcontainerd: started new containerd process" pid=53
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.198088131Z" level=info msg="parsed scheme: "unix"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.198099583Z" level=info msg="scheme "unix" not registered, fallback to default scheme" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.198127773Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.198154133Z" level=info msg="ClientConn switching balancer to "pick_first"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.211367108Z" level=info msg="starting containerd" revision=d71fcd7d8303cbf684402823e425e9dd2e99285d version=v1.4.6
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.236199982Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.236321650Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.240984040Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241236268Z" level=info msg="skip loading plugin "io.containerd.snapshotter.v1.btrfs"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241270029Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.devmapper"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241295950Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241311248Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241375982Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241537098Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241730231Z" level=info msg="skip loading plugin "io.containerd.snapshotter.v1.zfs"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241748382Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241783028Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.241794800Z" level=info msg="metadata content store policy set" policy=shared
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262730870Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262764458Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262808206Z" level=info msg="loading plugin "io.containerd.service.v1.introspection-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262843956Z" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262862664Z" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262875776Z" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262889612Z" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262902664Z" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262923290Z" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262946006Z" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.262959263Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263113401Z" level=info msg="loading plugin "io.containerd.runtime.v2.task"..." type=io.containerd.runtime.v2
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263225410Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263552551Z" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263577871Z" level=info msg="loading plugin "io.containerd.internal.v1.restart"..." type=io.containerd.internal.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263616499Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263632647Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263648459Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263661513Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263674151Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263687079Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263699853Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263789801Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263807278Z" level=info msg="loading plugin "io.containerd.internal.v1.opt"..." type=io.containerd.internal.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263947799Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263971592Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.263987446Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.264001179Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.264194211Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.264252887Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.264299356Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.264314178Z" level=info msg="containerd successfully booted in 0.053975s"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.271530892Z" level=info msg="parsed scheme: "unix"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.271565662Z" level=info msg="scheme "unix" not registered, fallback to default scheme" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.271592175Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.271613087Z" level=info msg="ClientConn switching balancer to "pick_first"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.272302289Z" level=info msg="parsed scheme: "unix"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.272321745Z" level=info msg="scheme "unix" not registered, fallback to default scheme" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.272351355Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.272365620Z" level=info msg="ClientConn switching balancer to "pick_first"" module=grpc
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.342826242Z" level=warning msg="Your kernel does not support swap memory limit"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.342846686Z" level=warning msg="Your kernel does not support CPU realtime scheduler"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.342999134Z" level=info msg="Loading containers: start."
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.417804617Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.465223597Z" level=info msg="Loading containers: done."
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.494192010Z" level=info msg="Docker daemon" commit=b0f5bc3 graphdriver(s)=overlay2 version=20.10.7
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.494297708Z" level=info msg="Daemon has completed initialization"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.559800797Z" level=info msg="API listen on /var/run/docker.sock"
dind_1_a0ef0b8a29e4 | time="2021-06-19T09:44:43.565866502Z" level=info msg="API listen on [::]:2376"
runner_1_3cb60d519ae8 | Checking for jobs... received                       job=870 repo_url=https://example.com/myrepo/myproject.git runner=A6qDsS-H
runner_1_3cb60d519ae8 | ERROR: Failed to remove network for build           error=networksManager is undefined job=870 network= project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | WARNING: Preparation failed: Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s)  job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | Will be retried in 3s ...                           job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | ERROR: Failed to remove network for build           error=networksManager is undefined job=870 network= project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | WARNING: Preparation failed: Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s)  job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | Will be retried in 3s ...                           job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | ERROR: Failed to remove network for build           error=networksManager is undefined job=870 network= project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | WARNING: Preparation failed: Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s)  job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | Will be retried in 3s ...                           job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | ERROR: Job failed (system failure): Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s)  duration_s=9.004197071 job=870 project=90 runner=A6qDsS-H
runner_1_3cb60d519ae8 | WARNING: Failed to process runner                   builds=0 error=Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s) executor=docker runner=A6qDsS-H

The critical lines being:

WARNING: Preparation failed: Cannot connect to the Docker daemon at tcp://dind:2375. Is the docker daemon running? (docker.go:865:0s)

This setup is for GitLab CI, where I run GitLab Runner through docker compose.

Here’s my docker-compose.yaml config, for what it’s worth:

version: '3'

services:

  dind:
    restart: always
    privileged: true
    volumes:
    - /var/lib/docker
    image: docker:17.09.0-ce-dind 
    entrypoint: ["dockerd-entrypoint.sh", "--tls=false", "--storage-driver=overlay2"]

  runner:
    restart: always
    image: gitlab/gitlab-runner:alpine
    volumes:
    - ./gitlab/runner:/etc/gitlab-runner:Z
    - ./gitlab/runner/builds:/builds
    environment:
    - DOCKER_HOST=tcp://dind:2375
      
  register-runner:
    restart: 'no'
    image: gitlab/gitlab-runner:alpine
    volumes:
    - ./gitlab/runner:/etc/gitlab-runner:Z
    command:
    - register
    - --non-interactive
    - --locked=false
    - --name=mybox
    - --executor=docker
    - --docker-image=docker:19.03.12
    - --docker-privileged
    environment:
    - CI_SERVER_URL=http://example.com/
    - REGISTRATION_TOKEN=my-token-here

(careful, this won’t copy paste due to WP funking up the encoding)

Note, the docker image version used by dind is the most important part here. The docker image version used by register-runner doesn’t seem to matter.

Prior to this I tried the very latest docker image, then docker:19.03.12 as per the official GitLab docs (at the time of writing), and then fortunately, I had my ancient configs which gave me the heads up to try a much older version of Docker.

So it seems using the older docker version ‘fixes’ this. I don’t know why – and I don’t have time (nor really, the inclination) to investigate. If you’re looking for a quick fix, hopefully this works for you. And if you do have the proper fix, please let me know via a comment.
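One detail is visible in the log above: the dind container reports "API listen on [::]:2376", while the runner keeps trying tcp://dind:2375. That is not a confirmed root cause, but a quick reachability check can show which port is actually answering. Here is a minimal Python sketch. The helper name `port_is_open` is hypothetical, and it assumes Python is available in some container on the same compose network:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `port_is_open("dind", 2375)` and `port_is_open("dind", 2376)` from that container would show whether anything is listening on the port named in `DOCKER_HOST`. It only tests TCP reachability, so it says nothing about whether the daemon expects TLS.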

My GitLab Runner Config.toml [Example]

I hit an annoying issue this week, and I’m not sure of the root cause.

Last week I bumped GitLab from 10.6 to 10.8, and somehow broke my GitLab CI Runner.

Somewhere, I have a backup of the config.toml file I was using. I run my GitLab CI Runner in a Docker container. I only run one, as it’s only for my projects. And one is enough.

Somehow, the Runner borked. Annoyingly, I had no record of the version I was running (never use :latest unless you like uncertainty), and recreating it without the config.toml file has been a pain.

So for my own future reference, here is my current GitLab Runner config.toml file:

user@8818901c05c8:/# cat /etc/gitlab-runner/config.toml

concurrent = 1
check_interval = 0

[[runners]]
  name = "runner-1"
  url = "https://my.gitlab.url"
  token = "{redacted}"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "docker:dind"
    privileged = true
    pull_policy = "if-not-present"
    disable_cache = false
    volumes = ["/var/run/docker/sock:/var/run/docker.sock","/cache"]
    shm_size = 0
  [runners.cache]
    insecure = false

FWIW this isn’t perfect. I’m hitting on a major issue currently whereby GitLab CI Pipeline stages with multiple jobs in the stage are routinely failing. It’s very frustrating. It’s also not scheduled for fix until v11, afaik.

Almost a year on

Wow, it’s almost a year since I last destroyed my personal GitLab.

Back then I was running the omnibus edition. Since then I’ve been rocking sameersbn/docker-gitlab.

Highly recommended. Love me some GitLab.

And yes, last, because I have unfortunately destroyed my GitLab three times so far. Each time is an extreme sad panda situation. I have backups, thankfully, but it still sucks.

How I Fixed: “error authorizing context: authorization token required”

I love me some Dockerised GitLab. I have the full CI thing going on, with a private registry for all my Docker images that are created during the CI process.

It all works real nice.

Until that Saturday night, when suddenly, it doesn’t.

Though it sounds like I’m going off on a tangent, it’s important to this story that you know I recently changed my home broadband ISP.

I host one of my GitLab instances at my house. All my GitLab instances are now Dockerised, managed by Rancher.

I knew that as part of switching ISPs, there might (read: 100% would) be “fun” with firewalls, and ports, and all that jazz.

I thought I’d got everything sorted, and largely, I had.

Except I decided that whilst all this commotion was taking place, I would slightly rejig my infrastructure.

I use LetsEncrypt for SSL. I use the LetsEncrypt certs for this particular GitLab’s private registry.

I had the LetsEncrypt container on one node, and I was accessing the certs via a file share. It seemed pointless, and added complexity (the aforementioned extra firewall rules), which I could remove if I moved the container onto the same box as the GitLab instance.

I made this move.

Things worked, and I felt good.

Then, a week or so later, I made some code changes and pushed.

The build failed almost immediately. Not what I needed on a Saturday night.

In the build logs I could see this:

Error response from daemon: Get https://my.gitlab:5000/v2/: received unexpected HTTP status: 500 Internal Server Error

This happened when the CI process was trying to log in to the private registry.

After a bit of head scratching, I tried from my local machine and sure enough I got the same message.

My Solution

As so many of my problems seem to, it boiled down to permissions.

Rather than copy the certs over from the original node, I let LetsEncrypt generate some new ones. Why not, right?

This process worked.

The GitLab and the Registry containers used a bind mounted volume to access the LetsEncrypt cert inside the container on the path /certs/.

When opening each container, I would be logged in as root.

Root being root, I had full permissions. I checked each file with a cheeky cat and visually confirmed that all looked good.

GitLab doesn’t run as root, however, and the files were owned by root with 600 permissions, which produced this:

Completed 500 Internal Server Error in 125ms (ActiveRecord: 7.2ms)
Errno::EACCES (Permission denied @ rb_sysopen - /certs/privkey.pem):
lib/json_web_token/rsa_token.rb:20:in `read'
lib/json_web_token/rsa_token.rb:20:in `key_data'
lib/json_web_token/rsa_token.rb:24:in `key'
lib/json_web_token/rsa_token.rb:28:in `public_key'
lib/json_web_token/rsa_token.rb:33:in `kid'
lib/json_web_token/rsa_token.rb:12:in `encoded'

The user GitLab is running as doesn’t have permission to read the private key.

Some more error output that may help future Googlers:

21/01/2018 21:31:51 time="2018-01-21T21:31:51.048129504Z" level=warning msg="error authorizing context: authorization token required" go.version=go1.7.6 http.request.host="my.gitlab:5000" http.request.id=4d91b482-1c43-465d-9a6e-fab6b823a76c http.request.method=GET http.request.remoteaddr="10.42.18.141:36654" http.request.uri="/v2/" http.request.useragent="docker/17.12.0-ce go/go1.9.2 git-commit/d97c6d6 kernel/4.4.0-109-generic os/linux arch/amd64 UpstreamClient(Docker-Client/17.12.0-ce (linux))" instance.id=24bb0a87-92ce-47fc-b0ca-b9717eabf171 service=registry version=v2.6.2
21/01/2018 21:31:51 10.42.16.142 - - [21/Jan/2018:21:31:51 +0000] "GET /v2/ HTTP/1.1" 401 87 "" "docker/17.12.0-ce go/go1.9.2 git-commit/d97c6d6 kernel/4.4.0-109-generic os/linux arch/amd64 UpstreamClient(Docker-Client/17.12.0-ce (linux))"

Argh.

Thankfully I hadn’t deleted the old cert, so I went back and saw that I had previously set 0640 on the private key in the old setup.

Directory permissions for the certs were set to 0750, as execute is required as well as read.

In my case this was sufficient to satisfy GitLab.

When making the change on the new node, I could then immediately log back in.
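The permission change described above can be sketched in Python. This is an illustration rather than the exact commands used: the helper name `apply_cert_permissions` is hypothetical, and it assumes the user GitLab runs as belongs to the group that owns the cert files, which the error output does not confirm either way.

```python
import os
import stat


def apply_cert_permissions(cert_dir: str, key_name: str = "privkey.pem") -> tuple[str, str]:
    """Set 0750 on the cert directory and 0640 on the private key.

    Returns the resulting (directory, key) modes as octal strings.
    """
    key_path = os.path.join(cert_dir, key_name)
    # The group needs read plus execute on the directory to reach files inside it.
    os.chmod(cert_dir, 0o750)
    # The group needs read on the key itself; others get nothing.
    os.chmod(key_path, 0o640)
    dir_mode = stat.S_IMODE(os.stat(cert_dir).st_mode)
    key_mode = stat.S_IMODE(os.stat(key_path).st_mode)
    return f"{dir_mode:04o}", f"{key_mode:04o}"
```

Run against the bind-mounted directory on the host, for example `apply_cert_permissions("/certs")` if that is also the host path. Group read only helps if the group ownership is right, so it is worth checking the owner and group of the files as well as the mode.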

A Tip To Spot This Sooner

I would strongly recommend that you schedule your project to run a build every ~24 hours, even if nothing has changed.

This will catch weird quirks that aren’t related to your project, but have inadvertently broken your project’s build.

It’s much easier to diagnose problems whilst they are fresh in your mind.
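Scheduled pipelines can be created through GitLab's REST API as well as the web UI. Below is a minimal Python sketch, assuming the v4 endpoint `POST /projects/:id/pipeline_schedules` with `description`, `ref` and `cron` parameters. The helper name, project ID, branch and token are placeholders, not values from my setup:

```python
import urllib.parse
import urllib.request


def build_schedule_request(base_url: str, project_id: int, token: str,
                           ref: str = "master",
                           cron: str = "0 3 * * *") -> urllib.request.Request:
    """Build (but do not send) a request that creates a daily pipeline schedule."""
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/pipeline_schedules"
    form = {
        "description": "Nightly build",  # free-text label for the schedule
        "ref": ref,                      # branch or tag to build
        "cron": cron,                    # "0 3 * * *" means 03:00 every day
    }
    data = urllib.parse.urlencode(form).encode()
    return urllib.request.Request(
        url, data=data, headers={"PRIVATE-TOKEN": token}, method="POST"
    )
```

Passing the result to `urllib.request.urlopen()` would send it. Keeping the build step separate from the send makes it easy to inspect the URL and payload first.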

Also, ya’ know, better documentation! This is exactly why I’m now writing this post. So in the future when I inevitably make a similar mistake, I now know where to look first 🙂