Bug 73097 is closed, but clearly not resolved. This affects InterWorx because its automatic updates call Yum (itself a front-end to rpm) and can get bitten by this bug. In my case, updates hadn’t run for a long time. I didn’t notice until I logged in on the CLI to install some software and Yum hung. Running strace on the hung process showed it blocked in a futex wait, so RHL9’s latest rpm still has the problem (and from my research, Fedora may as well).
# strace -p 6283
futex(0x405bb400, FUTEX_WAIT, 0, NULL) = -1 EINTR (Interrupted system call)
[this is where a "pkill -9 rpm" was issued at a different terminal]
--- SIGINT (Interrupt) @ 0 (0) ---
sigreturn() = ? (mask now )
futex(0x405bb400, FUTEX_WAIT, 0, NULL) = -1 EINTR (Interrupted system call)
+++ killed by SIGKILL +++
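Once it’s hung like this, only a kill -9 will stop the rpm process(es). However, once a SIGKILL is sent to rpm, the database WILL be left in a corrupted state. The only fix I found is to delete rpm’s __db.* files (the Berkeley DB environment and lock files that sit next to the package database, not the package database itself):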
# rm /var/lib/rpm/__db.*
When I discovered the problem on my server, there were 23 hung yum update processes. Perhaps a cron job could be scheduled to look for the problem and kill off hung processes when found (if an rpm-related process is killed, don’t forget to delete the /var/lib/rpm/__db.* files).
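To make the cron job idea concrete, here is a rough sketch of what such a check might look like. It is not anything InterWorx ships. The two-hour threshold, the process names it matches (yum and rpm), and the KILL_HUNG switch are all my own assumptions for illustration. By default it only reports; it kills processes and removes the __db.* files only when KILL_HUNG=1 is set.

```shell
#!/bin/sh
# Sketch of a cron-driven check for hung yum/rpm processes.
# Assumptions (not from any vendor script): the 2-hour threshold, the
# process names matched, and the KILL_HUNG switch are illustrative only.

MAX_AGE=${MAX_AGE:-7200}   # seconds a yum/rpm process may run before it is flagged

# Convert the ps "etime" format ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        if (NF == 4)      print $1 * 86400 + $2 * 3600 + $3 * 60 + $4
        else if (NF == 3) print $1 * 3600 + $2 * 60 + $3
        else              print $1 * 60 + $2
    }'
}

# Read "pid etime comm" lines on stdin; print "pid comm age" for each
# yum/rpm process that has been running longer than MAX_AGE.
filter_hung() {
    while read -r pid etime comm; do
        case "$comm" in
            yum|rpm) ;;
            *) continue ;;
        esac
        age=$(etime_to_seconds "$etime")
        if [ "$age" -gt "$MAX_AGE" ]; then
            echo "$pid $comm $age"
        fi
    done
}

# Separate -o options avoid ambiguity in how ps parses "name=" headers.
hung=$(ps -e -o pid= -o etime= -o comm= | filter_hung)

if [ -n "$hung" ]; then
    echo "Possibly hung yum/rpm processes (pid name age-in-seconds):"
    echo "$hung"
    if [ "${KILL_HUNG:-0}" = 1 ]; then
        # Only SIGKILL stops a process stuck in this state.
        echo "$hung" | while read -r pid comm age; do
            kill -9 "$pid"
        done
        # Clear the stale files left behind by the killed processes.
        rm -f /var/lib/rpm/__db.*
    fi
fi
```

Run from cron in its default report-only mode, any output would end up in the cron mail, which at least makes the hang visible. Age is only a rough proxy for "hung", so the threshold should be well above the longest legitimate update run on the machine.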
Again, this is not an InterWorx bug, it’s a Red Hat bug. I was just hoping that, since InterWorx fires yum update automatically, a test for this could also run. If I hadn’t logged in to install a package, I fear how long it would have taken me to notice, and how long a critical update would have gone unapplied.