RSH problem - Help please

Hi,
We have a strange issue with AlmaLinux 9.3.
I know that rsh is insecure, but in my business it is the standard. We run a simulation cluster.

Problem:
After a while, rsh stops working properly on some servers. Once the error occurs, I can no longer execute remote commands (e.g. rsh [remotehost] uptime). However, logging in interactively to the remote host with “rsh [remotehost]” still works.

Checking the socket unit with systemctl status rsh.socket gives the following output:

× rsh.socket - Remote Shell Facilities Activation Socket
Loaded: loaded (/usr/lib/systemd/system/rsh.socket; enabled; preset: disabled)
Active: failed (Result: trigger-limit-hit) since Thu 2024-06-20 09:31:52 CEST; 2h 38min ago
Duration: 2d 21h 1min 56.313s
Listen: [::]:514 (Stream)
Accepted: 1066; Connected: 0; Refused: 1
CPU: 356ms

Jun 17 12:29:55 remotehost systemd[1]: Listening on Remote Shell Facilities Activation Socket.
Jun 20 09:31:52 remotehost systemd[1]: rsh.socket: Trigger limit hit, refusing further activation.
Jun 20 09:31:52 remotehost systemd[1]: rsh.socket: Failed with result 'trigger-limit-hit'.

After a reboot everything is fine again (for a while, I think).

What could be the reason for the error?

I hope you can help me here, as this is fairly important for us.

Many thanks in advance

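Check the output of systemctl show --property=TriggerLimitIntervalUSec,TriggerLimitBurst rsh.socket. The default for a per-connection (Accept=yes) socket is 200 activations per 2 seconds, and whatever parallel spawn you’re doing may be exceeding that, especially if it launches one rsh per core. Once the limit is hit, systemd marks the socket as failed and refuses further activations, which matches the trigger-limit-hit result in your status output.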
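
If those values turn out to be too low for your job launcher, one option is a drop-in override for the socket unit. The fragment below is only a sketch: the file path follows the usual drop-in convention (it is what systemctl edit rsh.socket would create), and the burst value of 1000 is an arbitrary placeholder, not a recommendation. Pick something that matches how many rsh calls your scheduler fires at once.

```
# /etc/systemd/system/rsh.socket.d/override.conf
# Assumed path and example values; adjust to your cluster's launch pattern.
[Socket]
# Length of the window in which activations are counted.
TriggerLimitIntervalSec=2s
# Maximum number of activations allowed within that window.
TriggerLimitBurst=1000
```

After saving the file, systemctl daemon-reload followed by systemctl restart rsh.socket should bring the socket back without a reboot. As far as I know, setting either value to 0 turns the rate limit off entirely, but check man systemd.socket on your systemd version before relying on that.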