HAProxy external health checks being killed (running in Docker)

Detailed Description of the Problem

When using external health checks on AlmaLinux 9 with HAProxy running in Docker, the health checks are killed (HAProxy logs [WARNING] (8) : kill 12), even when the external check is something as simple as:

echo "my test"
exit 0

When running on Ubuntu 24.04 or macOS, it works perfectly fine.

Expected Behavior

HAProxy should run the health check script successfully (and not kill it) on AlmaLinux 9, as it does on Ubuntu and macOS.

Steps to Reproduce the Behavior

  1. git clone https://gist.github.com/nmcc1212/ddf90e337653da1b8d3f6a73436b73c9
  2. cd ddf90e337653da1b8d3f6a73436b73c9
  3. chmod +x primary-check.sh
  4. docker compose up

On AlmaLinux 9, this produces the output below:

haproxy  | [NOTICE]   (1) : Initializing new worker (8)
haproxy  | [NOTICE]   (1) : Loading success.
haproxy  | [WARNING]  (8) : kill 12
haproxy  | [WARNING]  (8) : Server primary/t1 is DOWN, reason: External check timeout, code: 0, check duration: 3003ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy  | [WARNING]  (8) : kill 13
haproxy  | [WARNING]  (8) : Server primary/t2 is DOWN, reason: External check timeout, code: 0, check duration: 3002ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy  | [ALERT]    (8) : proxy 'primary' has no server available!
haproxy  | [WARNING]  (8) : kill 14
haproxy  | [WARNING]  (8) : kill 15
haproxy  | [WARNING]  (8) : kill 16
haproxy  | [WARNING]  (8) : kill 17

On Ubuntu 24.04 and macOS Tahoe, it works as expected:

haproxy  | [NOTICE]   (1) : Initializing new worker (8)
haproxy  | [NOTICE]   (1) : Loading success.
haproxy  | my test
haproxy  | my test
haproxy  | my test

Configuration

see https://gist.github.com/nmcc1212/ddf90e337653da1b8d3f6a73436b73c9

Additional Information

Both the Ubuntu 24.04 and AlmaLinux 9 VMs are using Docker version 28.4.0, build d8eb465.

Hello,

Could it be SELinux?
Try disabling it.

That’s the only difference I can think of between this and the other operating systems.

I have run sudo setenforce 0 but get the same results. There is also nothing in /var/log/messages about SELinux blocking anything.

[nial@localhost ddf90e337653da1b8d3f6a73436b73c9]$ getenforce
Permissive
[nial@localhost ddf90e337653da1b8d3f6a73436b73c9]$ sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33
[nial@localhost ddf90e337653da1b8d3f6a73436b73c9]$ docker compose up
Attaching to haproxy
haproxy  | [NOTICE]   (1) : Initializing new worker (8)
haproxy  | [NOTICE]   (1) : Loading success.
haproxy  | [WARNING]  (8) : kill 12
haproxy  | [WARNING]  (8) : Server primary/t1 is DOWN, reason: External check timeout, code: 0, check duration: 3002ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy  | [WARNING]  (8) : kill 13
haproxy  | [WARNING]  (8) : Server primary/t2 is DOWN, reason: External check timeout, code: 0, check duration: 3002ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
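On RHEL-family hosts with auditd running, SELinux denials are normally recorded in /var/log/audit/audit.log, and they are still logged in permissive mode, so the audit log is worth checking in addition to /var/log/messages. Below is a minimal sketch for counting AVC records. The function name count_avc is illustrative, and the default path assumes a stock auditd setup.

```shell
#!/bin/sh
# count_avc: print the number of SELinux AVC records in an audit log.
# The default path assumes auditd's stock location; pass another file to override.
count_avc() {
    log="${1:-/var/log/audit/audit.log}"
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function usable under "set -e".
    grep -c 'type=AVC' "$log" || true
}

# Example (reading the audit log usually requires root):
#   count_avc /var/log/audit/audit.log
```

A count of 0 after reproducing the failed checks would support the observation that SELinux is not involved.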

Hello,
Is AlmaLinux’s seccomp profile perhaps stricter than on the other operating systems?

Seccomp might be blocking a system call.
Please refer to the following documentation
and try disabling seccomp.

Thank you.

I tried the compose spec below:

services:
  haproxy:
    image: haproxy:3.2
    container_name: haproxy
    ports:
      - 3500:3500
      - 7000:7000
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
      - ./primary-check.sh:/usr/local/bin/primary-check.sh
    security_opt:
      - seccomp=unconfined

and added the following to /etc/docker/daemon.json (I rebooted the VM afterwards):

{
  "seccomp-profile": "unconfined"
}

but I still get:

docker compose up
[+] Running 2/2
 ✔ Network ddf90e337653da1b8d3f6a73436b73c9_default  Created                                                                                     0.5s
 ✔ Container haproxy                                 Created                                                                                     0.2s
Attaching to haproxy
haproxy  | [NOTICE]   (1) : Initializing new worker (8)
haproxy  | [NOTICE]   (1) : Loading success.
haproxy  | [WARNING]  (8) : kill 12
haproxy  | [WARNING]  (8) : Server primary/t1 is DOWN, reason: External check timeout, code: 0, check duration: 3002ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy  | [WARNING]  (8) : kill 13
haproxy  | [WARNING]  (8) : Server primary/t2 is DOWN, reason: External check timeout, code: 0, check duration: 3003ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy  | [ALERT]    (8) : proxy 'primary' has no server available!
haproxy  | [WARNING]  (8) : kill 14
haproxy  | [WARNING]  (8) : kill 15
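To confirm that the seccomp=unconfined setting actually reached the container, the kernel reports each process's seccomp mode in the Seccomp field of /proc/<pid>/status (0 = disabled, 1 = strict, 2 = filter). Below is a minimal sketch. The helper name seccomp_mode is illustrative, and the docker exec line assumes the container is named haproxy, as in the compose spec, and that the image provides grep.

```shell
#!/bin/sh
# seccomp_mode: translate the "Seccomp:" field of a /proc/<pid>/status file.
seccomp_mode() {
    status_file="${1:-/proc/self/status}"
    # Match only the "Seccomp:" line, not "Seccomp_filters:".
    mode=$(awk '/^Seccomp:/ {print $2}' "$status_file")
    case "$mode" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "filter" ;;
        *) echo "unknown" ;;
    esac
}

# Inside the running container, the raw field can be read with:
#   docker exec haproxy grep '^Seccomp:' /proc/1/status
```

If the field reads 0 and the checks are still killed, seccomp can reasonably be ruled out.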

This is my last idea. It should help isolate whether the issue lies with HAProxy or with the Docker environment on the AlmaLinux host as a whole.
Try running a completely different container, such as Nginx.

Beyond that, the only other option is to check whether anything shows up in the Docker logs.

※ Sorry, I don’t know much about Nginx images.
Thank you.

Unfortunately, NGINX doesn’t support using external scripts for health checks, but I can create a busybox container that executes the script, and that works as expected:

services:
  busybox:
    image: busybox:latest
    command: ["sh", "/usr/local/bin/primary-check.sh"]
    volumes:
      - ./primary-check.sh:/usr/local/bin/primary-check.sh:ro
    restart: unless-stopped

[nial@localhost busybox]$ docker compose up
[+] Running 2/2
 ✔ Network busybox_default      Created                                                                                                        0.5s
 ✔ Container busybox-busybox-1  Created                                                                                                        0.1s
Attaching to busybox-1
busybox-1  | my test
busybox-1 exited with code 0 (restarting)
busybox-1  | my test
busybox-1 exited with code 0 (restarting)
busybox-1  | my test
busybox-1 exited with code 0 (restarting)
busybox-1  | my test
busybox-1 exited with code 0 (restarting)

Thank you for verifying.

I reviewed the logs again.

Could the issue be in HAProxy itself?

Server primary/t1 is DOWN, reason: External check timeout, code: 0, check duration: 3002ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

It appears HAProxy is timing out the external check at 3 seconds and marking the server down, so it looks like HAProxy itself is killing the check process.

Would extending the check timeout be a solution?

This does not explain why the same Docker Compose file and health check script work on Ubuntu and macOS, though.

Also, the test script I am using is just

echo "my test"
exit 0

so it should obviously not take over 3 seconds to run.
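One way to tell whether the script is slow or is never started before the timeout is to have it log a timestamp as its first action. Below is a minimal sketch of an instrumented variant of the test script. The names log_start and run_check are illustrative, and HAPROXY_SERVER_NAME is one of the environment variables HAProxy documents for external-check commands.

```shell
#!/bin/sh
# Instrumented variant of the test check script.

log_start() {
    # Epoch seconds, this shell's PID, and the server name if HAProxy set it.
    printf '%s check start pid=%s server=%s\n' \
        "$(date +%s)" "$$" "${HAPROXY_SERVER_NAME:-unknown}"
}

run_check() {
    log_start
    echo "my test"
    return 0
}

# When used as the actual check script, the file would end with:
#   run_check; exit $?
```

If "kill N" appears in the HAProxy output without a preceding "check start" line, the delay is occurring before the script begins to run rather than inside it.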

Thank you for your reply.

I’m getting more and more confused.
Is the project directory perhaps located on a network drive such as NFS?

There's no network share or NFS involved. As I said in the Steps to Reproduce in the original post, there's just a Docker Compose file, an HAProxy config, and the bare-bones health check script. These can be seen in the linked gist.

The exact same gist works on Ubuntu and macOS, but not on AlmaLinux 9.

I also just tested it on Rocky Linux 9, and it has the same issue, so is it likely a RHEL-family issue?

Hello

Which Ubuntu version is known to work without issues?

If containers aren’t running properly, a difference in cgroup version might be the cause. AlmaLinux 9 uses cgroup v2.

If Ubuntu is also using cgroup v2,
the likelihood of cgroups being the issue decreases.

Ubuntu 20.04: cgroup v1
Ubuntu 22.04: cgroup v2
Ubuntu 24.04: cgroup v2
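The cgroup version in use can also be checked directly on each host instead of being inferred from the release. Below is a minimal sketch: with GNU stat, the filesystem type of /sys/fs/cgroup is reported as cgroup2fs on a cgroup v2 (unified) host and as tmpfs on a v1 host. The helper name cgroup_version is illustrative.

```shell
#!/bin/sh
# cgroup_version: map the filesystem type of /sys/fs/cgroup to a cgroup version.
# With no argument it inspects the local host; an explicit type can be passed instead.
cgroup_version() {
    fstype="${1:-$(stat -fc %T /sys/fs/cgroup 2>/dev/null)}"
    case "$fstype" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Docker also reports the version it detected:
#   docker info --format '{{.CgroupVersion}}'
```

If both hosts report v2, cgroups become a less likely explanation for the difference.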

Thanks

It was in my original post: Ubuntu 24.04 (and macOS Tahoe).

I tried to reproduce it, but the problem did not occur.

Did something like the OOM killer show up in dmesg?
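OOM-killer activity appears in the kernel log as lines mentioning "Out of memory", "oom-killer", or "Killed process". Below is a minimal sketch for filtering them out of dmesg output. The name oom_events is illustrative, and the function reads the log from standard input.

```shell
#!/bin/sh
# oom_events: print kernel log lines that indicate OOM-killer activity.
oom_events() {
    # "|| true" avoids a non-zero exit status when nothing matches.
    grep -iE 'out of memory|oom-killer|killed process' || true
}

# Example (dmesg may require root on some hosts):
#   sudo dmesg -T | oom_events
```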

[redadmin@alma9 ~]$ git clone https://gist.github.com/nmcc1212/ddf90e337653da1b8d3f6a73436b73c9
Cloning into 'ddf90e337653da1b8d3f6a73436b73c9'...
remote: Enumerating objects: 5, done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 5 (from 1)
Receiving objects: 100% (5/5), done.
[redadmin@alma9 ~]$ ll
total 0
drwxr-xr-x. 3 redadmin redadmin 87 Oct  8 20:34 ddf90e337653da1b8d3f6a73436b73c9
[redadmin@alma9 ~]$ cd ddf90e337653da1b8d3f6a73436b73c9/
[redadmin@alma9 ddf90e337653da1b8d3f6a73436b73c9]$ chmod +x primary-check.sh
[redadmin@alma9 ddf90e337653da1b8d3f6a73436b73c9]$ docker compose up
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
>>>> Executing external compose provider "/usr/bin/podman-compose". Please see podman-compose(1) for how to disable this message. <<<<

a15da03073eac72dbd37d16ba9310114f4eb849d581d5d6cf25b00af14391709
✔ docker.io/library/haproxy:3.2
Trying to pull docker.io/library/haproxy:3.2...
Getting image source signatures
Copying blob 4f4fb700ef54 done   |
Copying blob 8c7716127147 done   |
Copying blob bbc9e592cadb done   |
Copying blob d5223d15135d done   |
Copying blob c9f3668e4cd0 done   |
Copying blob 30862e038e91 done   |
Copying config 52ca014edb done   |
Writing manifest to image destination
a944359c443a241d0a589c5875b235cf4b70d23a0683d2dc35dc7d309cbd9181
[haproxy] | [NOTICE]   (1) : Initializing new worker (3)
[haproxy] | [NOTICE]   (1) : Loading success.
[haproxy] | my test
[haproxy] | my test
[haproxy] | my test
[haproxy] | my test
[haproxy] | my test
[haproxy] | my test

I just tested: Podman does not have this issue, only Docker. I have now also raised this on the Docker forum: "HAProxy external health checks being killed, only on RHEL9/10 based OS" (Support, Docker Community Forums).
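Since the same image behaves differently under Docker and Podman, one thing worth comparing is the environment each runtime gives the container, for example the open-file limit, which runtimes may set differently. This is only a diagnostic sketch: nothing in this thread establishes that this limit is related to the timeout. The helper name nofile_limit is illustrative, and it assumes the haproxy:3.2 image provides sh.

```shell
#!/bin/sh
# nofile_limit: print the soft open-file limit seen inside a haproxy:3.2
# container started by the given runtime command (for example docker or podman).
nofile_limit() {
    runtime="$1"
    # Override the image entrypoint so only the shell one-liner runs.
    "$runtime" run --rm --entrypoint sh haproxy:3.2 -c 'ulimit -n'
}

# Example:
#   nofile_limit docker
#   nofile_limit podman
```

Running it with each runtime, or on each host, gives two numbers that can be compared directly.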
