This is turning out to be quite a PITA.
DRBD:
RHEL/CentOS/SL don’t include drbd, so you’re stuck either compiling or using elrepo-testing to get it for centos6 - not planning to use centos 5.7 for anything forward-thinking. Not a huge issue, but to be safe, this also requires using the protectbase plugin, just in case something gets added to elrepo which is newer than what centos or iworx ships, especially when you consider this requires using “testing.” That said, it appears, at least at first blush, to work just fine, though I haven’t actually gotten a fully-working setup going. I’ll admit I’m a little squeamish about this.
Pacemaker/Corosync:
Still working on this. Current issues, some overcome, others not:
SIM versus corosync resource monitoring:
It appears easy enough to simply use corosync (or is it pacemaker? not sure whose resources they really are in HA these days) resource monitoring instead of SIM for things like apache and mysql. Haven’t run into any issues here yet, but I don’t really like taking this functionality away from iworx, since I believe this means that iworx nodes won’t restart services automatically. I don’t want the two arguing over who restarts daemons. Kind of fugly.
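For anyone wondering what handing apache monitoring to the cluster looks like, here is a rough sketch in crm shell syntax. The resource name, config path, and intervals are my own placeholders, not anything iworx ships, and SIM would have to stop watching the same daemon so the two don’t fight over restarts.

```shell
#!/bin/sh
# Sketch only: p_apache, the configfile path, and the timings are assumed values.
cat > apache-monitor.crm <<'EOF'
primitive p_apache ocf:heartbeat:apache \
    params configfile=/etc/httpd/conf/httpd.conf \
    op monitor interval=30s timeout=20s
EOF
# To load it on a cluster node (not run here):
#   crm configure load update apache-monitor.crm
```

The monitor op is the part that replaces SIM here: when it fails, the cluster restarts or relocates the resource according to its own policy.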
Getting HA cluster IPs to show up in iworx apparently requires them having a name/label. In other words IPs that show up in “ip addr” but not in “ifconfig” don’t show up in the iworx ip management screen. Easy to fix with the IPaddr2 options “nic=” and “iflabel=” - once that’s done, they show up in iworx just fine. I’m hoping that the service “iworx” will handle moving the IPs generated by iworx itself. I haven’t proved this out yet, either, but hopefully will this week. TBH, I haven’t even proved out that “iworx” is LSB compliant yet, though usually that’s workable with a little hackery (I’ve done it in the past with saslauthd and postfix-policyd).
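A minimal sketch of the labelled IPaddr2 resource described above, in crm shell syntax. The address, netmask, NIC, and label are placeholders (192.0.2.10 is a documentation address), and the resource name is made up.

```shell
#!/bin/sh
# Sketch only: ip, cidr_netmask, nic, and iflabel values are placeholders.
cat > cluster-ip.crm <<'EOF'
primitive p_cluster_ip ocf:heartbeat:IPaddr2 \
    params ip=192.0.2.10 cidr_netmask=24 nic=eth0 iflabel=ha \
    op monitor interval=30s
EOF
# To load it on a cluster node (not run here):
#   crm configure load update cluster-ip.crm
```

With nic=eth0 and iflabel=ha the address should be labelled eth0:ha, which is what makes it visible to “ifconfig” and, in my testing, to the iworx ip management screen.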
Apparently the version of OCF IPsrcaddr shipped with centos6 (or perhaps it’s centos6 itself - I haven’t been able to determine this since an older version works just fine with ubuntu) bombs out when adding a route with a src addy, so I get corosync logs with stuff about ‘either “to” is a duplicate or “x.x.x.x/x” is a garbage’ and the route add fails. Running a manual “ip route add” works, and I’ve found a few posts in various newsgroups indicating there’s a patch that fixes this, including new docs that imply you can also add a netmask directive in newer versions of this OCF. Bummer, since this is essential for an active/passive CM + iworx cluster, both for iworx licensing and for things like NFS mounts.
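To show what “a route with a src addy” means, here is a small dry-run helper. It is a hypothetical sketch, not what the IPsrcaddr agent actually runs: the function name and all addresses are made up, and it uses “ip route replace” on the default route (rather than “add”) on the assumption that a default route already exists.

```shell
#!/bin/sh
# Hypothetical helper: prints the command by default, runs it only if APPLY=1.
# Arguments: gateway, device, source address (all placeholders in the example).
src_route() {
    gateway=$1
    dev=$2
    src=$3
    cmd="ip route replace default via $gateway dev $dev src $src"
    if [ "${APPLY:-0}" = "1" ]; then
        $cmd            # needs root; changes the live routing table
    else
        echo "$cmd"     # dry run: just show what would be executed
    fi
}

# Example (dry run):
#   src_route 192.0.2.1 eth0 192.0.2.10
```

The point of the src address is that outbound connections then originate from the cluster IP rather than the node’s own IP, which is what the licensing check and the NFS exports would see.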
This last obstacle appears to be a show-stopper, or at least any hackish workarounds that I can come up with so far are way too gross to put into production. I could get around the licensing issue by simply purchasing an additional license for a host that just lies dormant (hopefully) 99% of the time (if I could convince management to pony up, that is), but so far I can’t figure out how to deal with the NFS issue.
Centos doesn’t appear to be working very hard to maintain parity with rhel at this point, and TBH, I have no idea what the status of any of this is in rhel-land these days, though I’ve heard that corosync, at least, is unsupported pay-to-play. I believe that DRBD is in mainline kernel in more recent kernels. I may see if scientific linux, which is much closer to parity with rhel, helps any with these issues, and if it does, see how upset iworx would be with me running on a distro which should be 100% compatible but is not listed on their “supported distros.”
TLDR: no workee currently