Mailing List Archive: Linux Virtual Server: Users

[lvs-users] problem with ldirectord- web server up/site down :(

 

 


davea at ingraftedsoftware

May 3, 2011, 4:17 AM


We had a failure yesterday in which one of our web sites became
unavailable. (This has happened about once a month in the past; I am now
taking the time to post the problem.) After a few minutes of
investigation, I found that the load balancer had no hosts in the
rotation for that site. All 3 web servers were up and working, so the
ldirectord health check should have kept all 3 in the running ipvs
configuration. A simple restart of ldirectord added all 3 web servers
back into the rotation immediately, and the site was restored to
service.
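
A condition like this (a virtual service left with no real servers) can be spotted from the `ipvsadm -L -n` listing. The following is only a minimal Python sketch: the function name `empty_virtual_services` and the `SAMPLE` listing are hypothetical and for illustration, and the parser assumes the usual layout in which each virtual service line starts with its protocol and each real server line starts with `->`.

```python
# Hypothetical listing in the usual `ipvsadm -L -n` layout (illustration only,
# not captured from the load balancer described in this post).
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.35.117:80 rr
  -> 192.168.35.43:80             Route   100    0          0
  -> 192.168.35.44:80             Route   100    0          0
  -> 192.168.35.45:80             Route   100    0          0
TCP  192.168.35.117:443 wlc persistent 600
"""


def empty_virtual_services(listing):
    """Return the virtual services that have no real servers under them."""
    counts = {}
    current = None
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP", "SCTP", "FWM") and len(fields) >= 2:
            # A new virtual service starts here; no real servers seen yet.
            current = fields[0] + " " + fields[1]
            counts[current] = 0
        elif fields[0] == "->" and current is not None:
            # A real server line belonging to the current virtual service.
            # The column header also starts with "->", but it appears before
            # any virtual service, so `current` is still None and it is skipped.
            counts[current] += 1
    return [service for service, n in counts.items() if n == 0]


print(empty_virtual_services(SAMPLE))
```

On a live director, the listing would come from running `/sbin/ipvsadm -L -n` as root (for example via `subprocess.run`) instead of the sample string, and a cron job could alert whenever the returned list is non-empty.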

There is no clustering software used in this current configuration.

It seems that ldirectord forgets what it is supposed to do over time (a
few weeks), and a simple restart makes it happy again, as it has in this
case and in previous cases.

Here are the software versions for the loadbalancer:
CentOS release 5.5 x86_64
ldirectord-1.0.4-1.1.el5
kernel 2.6.18-194.32.1.el5

Here are the important parts of the ldirectord.cf file (anonymized):
=============================
# Global Directives
checktimeout=20
checkinterval=30
autoreload=yes
logfile="local0"
quiescent=no
fork=yes

# http virtual service for redirecting port 80 to my.securesite.com
virtual=192.168.35.117:80
real=192.168.35.43:80 gate 100
real=192.168.35.44:80 gate 100
real=192.168.35.45:80 gate 100
service=http
scheduler=rr
netmask=255.255.255.255
protocol=tcp

# https virtual service for my.securesite.com
virtual=192.168.35.117:443
real=192.168.35.43:40117 gate 100
real=192.168.35.44:40117 gate 100
real=192.168.35.45:40117 gate 100
service=https
scheduler=wlc
persistent=600
netmask=255.255.255.255
protocol=tcp
virtualhost=my.securesite.com
=============================

/etc/ipvsadm.rules
=============================
(no entry for this host; let ldirectord figure it out)
(note: I have since ADDED the rules here for the 117 https host,
but I don't see how their absence matters, as ldirectord manages that.)
=============================

The logs contain no entry showing the site actually being removed from
ipvs. They do contain entries like the following, marked "failed" (note
the timestamps):

May 1 21:10:56 lb71 ldirectord[7336]: system(/sbin/ipvsadm -a -t
63.251.35.117:80 -r 192.168.35.45:80 -g -w 100) failed:
May 1 21:10:56 lb71 ldirectord[7336]: Added real server:
192.168.35.45:80 (192.168.35.117:80) (Weight set to 100)
May 1 21:10:56 lb71 ldirectord[7343]: Resetting soft failure count:
192.168.35.45:40117 (tcp:192.168.35.117:443)
May 1 21:10:56 lb71 ldirectord[7343]: system(/sbin/ipvsadm -a -t
192.168.35.117:443 -r 192.168.35.45:40117 -g -w 100) failed:
May 1 21:10:56 lb71 ldirectord[7343]: Added real server:
192.168.35.45:40117 (192.168.35.117:443) (Weight set to 100)
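
To see how often and when these "failed" entries occur, the log can be scanned for them. This is a rough Python sketch: the helper name `failed_commands` is hypothetical, and it assumes each syslog entry sits on a single line (the entries above are wrapped by the mail client) and follows the "timestamp host ldirectord[pid]: message" layout shown here.

```python
import re

# Matches entries such as:
#   May 1 21:10:56 lb71 ldirectord[7343]: system(/sbin/ipvsadm ...) failed:
FAILED = re.compile(
    r"^(\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+ldirectord\[(\d+)\]:\s+"
    r"system\((.+?)\)\s+failed"
)

# Two entries from the log excerpt above, unwrapped onto single lines.
SAMPLE_LOG = [
    "May 1 21:10:56 lb71 ldirectord[7343]: system(/sbin/ipvsadm -a -t "
    "192.168.35.117:443 -r 192.168.35.45:40117 -g -w 100) failed:",
    "May 1 21:10:56 lb71 ldirectord[7343]: Added real server: "
    "192.168.35.45:40117 (192.168.35.117:443) (Weight set to 100)",
]


def failed_commands(lines):
    """Return (timestamp, pid, command) for each failed system() call."""
    hits = []
    for line in lines:
        match = FAILED.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), match.group(3)))
    return hits


for timestamp, pid, command in failed_commands(SAMPLE_LOG):
    print(timestamp, pid, command)
```

Feeding it the lines of the file that receives the local0 facility would list every ipvsadm call that ldirectord reported as failed, which can then be compared against the time the site dropped out of rotation.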

Is this a bug in ldirectord? Something wrong in my config? Should I
look to keepalived? mon?

Thanks,
Dave
