reduzent at gmail
Apr 4, 2012, 1:39 AM
Post #1 of 1
We're running a two-node cluster with a bunch of OpenVZ containers as
resources and use SBD as the fencing method. We're still in testing mode
and ran some IO benchmarks on NFS with tiobench. While we were
performing those tests, the node fenced itself as soon as tiobench
finished. We looked for the reason and found the
following line in the syslog:
Apr 2 16:44:07 hostname sbd: : WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
I assume this could be prevented by setting sbd's -5 flag to a higher
value than the default of 3s. But what is a good value?
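If raising that threshold is the right fix, I assume it could be made persistent via SBD_OPTS, roughly like this (a sketch only: the file location and the reading of -5 as the latency-warning threshold are my assumptions, to be checked against the sbd man page; the device path is from our setup):

```
# /etc/sysconfig/sbd -- sketch, assuming -5 sets the latency
# warning threshold in seconds
SBD_DEVICE="/dev/sdd"
SBD_OPTS="-5 10"   # warn at 10 s instead of the default 3 s
```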
However, the NFS share we were running the tests on is provided
by a different host than the one that provides the iSCSI device used by
SBD. How come those two interfere?
The next time, we monitored the sbd access time during the benchmark with:
$ while true; do (time sbd -d /dev/sdd list) 2>&1 | grep real; sleep 1; done
During the test it was usually around 0.030 s. However, just as the test
finished, it spiked to 2-4 s.
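For the next run, a timestamped variant of that loop might make it easier to line the spikes up against the syslog entries (just a sketch: the device path is the one above, the loop is bounded only so the snippet terminates, and the fallback probe is there solely so it also runs on a box without sbd):

```shell
#!/bin/bash
# Timestamped latency probe -- sketch based on the loop above.
# On the cluster node, probe() calls the real sbd list; elsewhere
# it falls back to a no-op so the script stays runnable.
if command -v sbd >/dev/null 2>&1; then
    probe() { sbd -d /dev/sdd list; }
else
    probe() { :; }
fi

for i in 1 2 3; do                 # bounded here; use `while true` live
    printf '%s ' "$(date '+%F %T')"
    ( time probe; ) 2>&1 | grep real
    sleep 1
done
```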
Actually, we are not so much concerned about this single incident, but we
would like to make sure that a single container doing heavy IO cannot
get the whole node fenced. How can this be safely prevented?
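One thing we could perhaps try on our side (my own assumption, nothing SBD-specific) is OpenVZ's per-container I/O priority, so that one CT cannot monopolise the disk; the CTID below is a placeholder:

```shell
#!/bin/bash
# Sketch: lower a container's CFQ I/O priority with vzctl.
# --ioprio takes 0 (lowest) .. 7 (highest), default 4.
# Guarded so the snippet also runs on a non-OpenVZ box.
CTID=101   # placeholder container ID
if command -v vzctl >/dev/null 2>&1; then
    vzctl set "$CTID" --ioprio 2 --save
else
    echo "vzctl not found; sketch only"
fi
```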
Linux-HA mailing list
Linux-HA [at] lists
See also: http://linux-ha.org/ReportingProblems