Hi,
We have a client/server application that was developed a long time ago and has been running in production for more than 10 years. The client is a Windows application written in C++, and the server-side component is written in Java 8.
The software has been working fine on Linux servers for a long time. We currently use AlmaLinux 9, and it worked there until we updated the kernel.
When we update the Linux kernel from “5.14.0-362.13.1.el9_3.x86_64” to “kernel-5.14.0-427.31.1.el9_4.x86_64”, the application becomes unstable: the client drops the connection because it does not receive messages in time. We see delays during which the client is simply waiting for a response from the server. The issue is always reproducible with the new kernel, and it disappears as soon as we boot the old kernel again. We ran the test for hours in both cases.
I can provide PCAP files captured with tcpdump for both the working and the non-working scenario.
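In case it helps to reproduce this without our application, here is a minimal sketch of a probe that measures TCP connect time and request/response round-trip time separately. It is not our production code: the echo server is only a stand-in for our Java server, and the function names (`start_echo_server`, `probe`), the loopback address, the number of rounds, and the 0.2 s threshold are all assumptions chosen for illustration.

```python
import socket
import threading
import time


def start_echo_server(host="127.0.0.1", port=0):
    """Start a throwaway TCP echo server in a background thread.

    Hypothetical stand-in for the real server; port 0 lets the OS pick a free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)

    def echo(conn):
        # Send back whatever the client sends until it closes the connection.
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                conn.sendall(data)

    def serve():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                break  # listening socket was closed
            threading.Thread(target=echo, args=(conn,), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return srv


def probe(host, port, rounds=20, timeout=5.0):
    """Return one (connect_seconds, round_trip_seconds) tuple per round."""
    samples = []
    for i in range(rounds):
        t0 = time.monotonic()
        with socket.create_connection((host, port), timeout=timeout) as s:
            t1 = time.monotonic()  # connect() returned: handshake is done
            payload = b"ping %d\n" % i
            s.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = s.recv(4096)
                if not chunk:
                    raise ConnectionError("server closed the connection early")
                received += chunk
            t2 = time.monotonic()  # full echo received
        samples.append((t1 - t0, t2 - t1))
    return samples


if __name__ == "__main__":
    server = start_echo_server()
    host, port = server.getsockname()
    results = probe(host, port)
    server.close()
    slow = [r for r in results if max(r) > 0.2]  # 0.2 s threshold is arbitrary
    print("max connect: %.6f s" % max(c for c, _ in results))
    print("max round trip: %.6f s" % max(r for _, r in results))
    print("rounds above threshold: %d of %d" % (len(slow), len(results)))
```

Running the same probe under both kernels, ideally against the real server host and port instead of the loopback echo server, might show whether the delay is in the handshake or in the data exchange.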
Could you please investigate what changed between these two kernel versions? The problem appears to be in the new kernel.
I have already reported a bug on the kernel.org website.
Link: 219221 – TCP connection/socket gets stuck and the handshaking is delayed
You can find the PCAP files attached to that bug report.
Please analyze them and try to figure out why the new kernel behaves differently and causes this behaviour.
Thanks a lot!
Regards,
Zoltan