When your server is down, it’s stressful. I’ve just recovered from an issue with WordPress where the PHP container was crippling my server because – for reasons I cannot explain – connectivity to the database had just died.
This actually happened to me on another server a few months ago, and I wasn’t that bothered about the site enough to figure out the problem, so I left the server offline and let the domain expire.
Today, not so much. Yesterday, and then earlier today I got the following alert from Jetpack:
Never a good time.
SSH into the server, and sure enough, the load averages are all well into the 30’s.
At this point, I actually didn’t even consider this being a PHP / MySQL issue. Usually the WordPress sites I run are stable. I keep the PHP and DB versions update fairly frequently, and WordPress itself takes care of automatically patching the horror show internally.
Looking at htop
though told a different story. I didn’t take screenshots, because, well, documenting the mess was not as high a priority (on a Sunday afternoon) as fixing it so I could get back to … err, doing nothing.
Here’s what it sort of looked like, but after the fact:
There were about 8 pids, all of them php-fpm
, all of them taking up about 50% CPU and not dropping.
The way I’ve configured these is probably not helpful. All my setups use the www-data
user inside the PHP container.
Whilst this makes the PHP image very easy to re-use, it doesn’t give me an at-a-glance overview of which container is killing the server.
Well, here’s a command to figure it out, based on the pid:
sudo lsof -p 46170
Of course, change the pid (46170 above) to match the one shown in the output of htop
.
I repeated this for the top 3 pids and was surprised to see each one was the same container. Here’s the output, but with a lot of the data redacted because I’m not sure how insecure sharing this stuff might be:
In the “Name” column, there was a path to the specific site directory structure that clearly told me what the container related too.
When I looked at the container logs for the PHP instance I could see constant connections:
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
[03-Nov-2024 13:40:32] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it
192.168.224.4 - 03/Nov/2024:13:40:31 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:32 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:32 +0000 "GET /index.php" 200
192.168.224.4 - 03/Nov/2024:13:40:32 +0000 "GET /index.php" 200
Code language: JavaScript (javascript)
Bearing in mind, this server – on average – registers about 20 visitors a day – I figured something unusual was happening.
OK, first step, shut down the PHP container.
That immediately calmed down the server’s load average.
Problems With MySQL Upgrade
This site was running MySQL 8.3 in Docker. Specifically via Docker Compose. Here’s a simplified docker-compose.yaml
:
mysql:
container_name: bikeclimbs_blog_db
image: mysql:8.3
environment:
MYSQL_DATABASE: my_db
MYSQL_PASSWORD: my_user_password
MYSQL_ROOT_PASSWORD: my_root_password
MYSQL_USER: my_db_user
Code language: CSS (css)
I checked Docker Hub and saw version 8.4
was available, so immediately bumped to that, on the off-chance it was some bug with 8.3
:
2024-11-03 14:01:39+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.4.3-1.el9 started.
2024-11-03 14:01:40+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2024-11-03 14:01:40+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.4.3-1.el9 started.
'/var/lib/mysql/mysql.sock' -> '/var/run/mysqld/mysqld.sock'
2024-11-03T14:01:40.907332Z 0 [System] [MY-015015] [Server] MySQL Server - start.
2024-11-03T14:01:41.202577Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.4.3) starting as process 1
2024-11-03T14:01:41.216038Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-11-03T14:01:44.005850Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-11-03T14:01:44.089574Z 1 [System] [MY-011090] [Server] Data dictionary upgrading from version '80200' to '80300'.
2024-11-03T14:01:46.401621Z 1 [System] [MY-013413] [Server] Data dictionary upgrade from version '80200' to '80300' completed.
2024-11-03T14:01:51.203555Z 4 [System] [MY-013381] [Server] Server upgrade from '80300' to '80403' started.
2024-11-03T14:02:08.311941Z 4 [System] [MY-013381] [Server] Server upgrade from '80300' to '80403' completed.
2024-11-03T14:02:08.445977Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2024-11-03T14:02:08.446029Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2024-11-03T14:02:08.451143Z 0 [Warning] [MY-011810] [Server] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2024-11-03T14:02:08.472612Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2024-11-03T14:02:08.472647Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.4.3' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server - GPL.
Code language: JavaScript (javascript)
The upgrade went through fine, but WordPress was still very much dead:
As I had a local nightly backup for this database, I figured I’d YOLO this and upgrade it to the latest MySQL tag, which via Docker Hub appeared to be 9.1
In hindsight this was pretty reckless, but I do trust my backups as in the past I’ve been a lot more carefree with my backups (see: how I lost all the content for this site), but now I do a much more rigorous process.
Anyway, when I bumped the tags, pulled the newer image, and restarted the container, the logs gave this:
2024-11-03 14:06:59+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2024-11-03 14:06:59+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 9.1.0-1.el9 started.
'/var/lib/mysql/mysql.sock' -> '/var/run/mysqld/mysqld.sock'
2024-11-03T14:06:59.936797Z 0 [System] [MY-015015] [Server] MySQL Server - start.
2024-11-03T14:07:00.186165Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 9.1.0) starting as process 1
2024-11-03T14:07:00.194158Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-11-03T14:07:02.513677Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-11-03T14:07:03.834196Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2024-11-03T14:07:03.834253Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2024-11-03T14:07:03.839160Z 0 [Warning] [MY-011810] [Server] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2024-11-03T14:07:03.870481Z 0 [Warning] [MY-010312] [Server] The plugin 'mysql_native_password' used to authenticate user 'mysql.session'@'localhost' is not loaded. Nobody can currently login using this account.
2024-11-03T14:07:03.870517Z 0 [Warning] [MY-010312] [Server] The plugin 'mysql_native_password' used to authenticate user 'mysql.sys'@'localhost' is not loaded. Nobody can currently login using this account.
2024-11-03T14:07:03.870533Z 0 [Warning] [MY-010312] [Server] The plugin 'mysql_native_password' used to authenticate user 'root'@'localhost' is not loaded. Nobody can currently login using this account.
2024-11-03T14:07:03.870547Z 0 [Warning] [MY-010312] [Server] The plugin 'mysql_native_password' used to authenticate user 'my_db_user'@'%' is not loaded. Nobody can currently login using this account.
2024-11-03T14:07:03.870560Z 0 [Warning] [MY-010312] [Server] The plugin 'mysql_native_password' used to authenticate user 'root'@'%' is not loaded. Nobody can currently login using this account.
2024-11-03T14:07:03.930661Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2024-11-03T14:07:03.930779Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '9.1.0' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server - GPL.
Code language: JavaScript (javascript)
That’s… not so good, Al.
At this point I couldn’t log in. I use DataGrip as my DB tool, and remote connections were just dying.
I had made a major version bump, so there was no easy going back to v8.
OK, so it clearly says that the plugin ‘mysql_native_password’ used to authenticate user whatever is not loaded. How do we load it?
I tried a total hack:
mysql:
container_name: bikeclimbs_blog_db
image: mysql:8.3
environment:
MYSQL_DATABASE: my_db
MYSQL_PASSWORD: my_user_password
MYSQL_ROOT_PASSWORD: my_root_password
MYSQL_USER: my_db_user
command:
--default-authentication-plugin=mysql_native_password # Set the authentication plugin
volumes:
- ./my.cnf:/etc/mysql/my.cnf
Code language: PHP (php)
Where you set the my.cnf
file in the same directory as your docker-compose.yaml
file on the server, with the contents:
[mysqld]
default-authentication-plugin=mysql_native_password
Code language: JavaScript (javascript)
Needless to say, this did not work:
2024-11-03T14:12:06.284279Z 0 [System] [MY-015015] [Server] MySQL Server - start.
2024-11-03T14:12:06.577076Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 9.1.0) starting as process 1
2024-11-03T14:12:06.589066Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-11-03T14:12:09.227643Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-11-03T14:12:10.473732Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2024-11-03T14:12:10.473898Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2024-11-03T14:12:10.483542Z 0 [Warning] [MY-011810] [Server] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2024-11-03T14:12:10.489668Z 0 [ERROR] [MY-000067] [Server] unknown variable 'default-authentication-plugin=mysql_native_password'.
2024-11-03T14:12:10.492688Z 0 [ERROR] [MY-010119] [Server] Aborting
Code language: JavaScript (javascript)
Back all that crap out, and bring the original config back up, then try to connect via CLI:
$ docker exec -it bikeclimbs_blog_db mysql -u root -p
Enter password:
ERROR 1524 (HY000): Plugin 'mysql_native_password' is not loaded
Code language: JavaScript (javascript)
Nope.
Basically the same error I was seeing in DataGrip.
Boot Into Recovery Mode
A trick I learned from a DBA a good many years ago was that MySQL can (or could) be booted into a kind of recovery mode.
The way to do this was to pass the flag --skip-grant-tables
in when starting MySQL, and it would boot up without any authentication enabled.
Sounds terrible.
But in this case, it would be ideal.
I could boot into this ‘recovery mode’, set my new password to the caching_sha2_password
variety that MySQL 9 seemed to need, and then remove that flag and restart once more, back into ‘normal’ mode. At that point, in theory, I would be set.
mysql:
container_name: bikeclimbs_blog_db
image: mysql:8.3
environment:
MYSQL_DATABASE: my_db
MYSQL_PASSWORD: my_user_password
MYSQL_ROOT_PASSWORD: my_root_password
MYSQL_USER: my_db_user
command: --skip-grant-tables
Code language: CSS (css)
Amazingly this … almost worked:
$ docker exec -it bikeclimbs_blog_db mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 9.1.0 MySQL Community Server - GPL
Copyright (c) 2000, 2024, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> ALTER USER 'root'@'%' IDENTIFIED WITH 'caching_sha2_password' BY 'super secure new password'
-> ;
ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement
mysql> exit
Bye
Code language: JavaScript (javascript)
So close.
The Solution
My solution came via this site:
How to reset root password in MySQL 8
This is for MySQL 8, but conceptually it works regardless.
The idea is to make MySQL run a SQL script at boot up.
It uses the same approach as adding the custom my.cnf
above – using a bind mounted file to pass a local script file into the container:
mysql:
container_name: bikeclimbs_blog_db
image: mysql:8.3
environment:
MYSQL_DATABASE: my_db
MYSQL_PASSWORD: my_user_password
MYSQL_ROOT_PASSWORD: my_root_password
MYSQL_USER: my_db_user
command: --init-file=/docker-entrypoint-initdb.d/init.sql
volumes:
- ./init.sql:/docker-entrypoint-initdb.d/init.sql
Code language: JavaScript (javascript)
Then create the file init.sql
in the same directory as your docker-compose.yaml
file with the contents that match the environment variables you have set for the user and root passwords:
ALTER USER 'root'@'%' IDENTIFIED WITH 'caching_sha2_password' BY 'my_root_password';
ALTER USER 'dbuser'@'%' IDENTIFIED WITH 'caching_sha2_password' BY 'my_user_password';
Code language: JavaScript (javascript)
Do not meddle with the paths.
Why use /docker-entrypoint-initdb.d
as the path when this is for an existing database?
Well, I don’t control the MySQL image and I do know that is one place user data can live. It’s the init script directory, so is expected to be populated with user scripts – be they SQL or otherwise. It’s a safe place, but in this case it’s not being used for initialisation. This will possibly work from other paths, but I wasn’t for messing about.
After this, my DB booted fine, and I saw no further warnings about attempted password usages with the old mysql_native_password
password type.
Be sure to remove the script file, the command
and volume
entries from your docker-compose.yaml
file and restart to ensure that those changes took effect.
Still, what a crap way to spend a Sunday.