
Mailing List Archive: Linux Virtual Server: Users

[lvs-users] LVS and stale connections

 

 



cova at ferrara

Mar 31, 2009, 5:08 AM

Post #1 of 8 (1213 views)
[lvs-users] LVS and stale connections

Hi all,
we have been using ipvs for years on several production servers and it works
just fine, but now we are facing a weird problem.
I haven't been able to find a solution by searching around, so I'm asking for
some hints here.
Basically, the scenario is the following:
5 real servers (RIPs), balanced with ipvs on Linux 2.6.23, using DR.
Everything works fine, but if for some reason the real servers freeze and need
a reboot, ipvs keeps all their connections marked as "alive", so the balancing
is no longer done correctly.
For example, after a reboot a couple (out of five) of the servers show more
than 1000 connections in ipvsadm -L (all of them fake, since the real servers
have been rebooted). ipvs doesn't pass them any new connections until the
count drops far enough, so only 3 servers are actually working.
To fix this, I've tried removing the real servers and the virtual IP from
ipvs, and I've also zeroed the tables, but with no success: when I reconfigure
everything, the RIPs show that all the connections are still there. The
situation improves only after the connection timeout expires (usually 8
hours). If I lower this timeout, only new connections expire sooner, so this
is not a good solution. We need to keep high timeouts for several reasons
(databases, for example).
Is there any way to clear this counter (syscalls, proc, whatever) so I can
restore correct behaviour of the balancer without rebooting the machine? I
understand that this is a situation ipvs can't handle automatically, but a way
to clear the tables would be useful.
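
The starvation described above can be reproduced with a toy model. This is only a sketch: it assumes a least-connection style scheduler (the post does not say which scheduler is in use), and the server names and connection counts are made up for illustration.

```python
def pick_least_conn(conn_counts):
    """Return the server with the fewest tracked connections (ties: by name)."""
    return min(sorted(conn_counts), key=lambda name: conn_counts[name])


def dispatch(conn_counts, new_connections):
    """Assign new connections one by one; return how many each server got."""
    received = {name: 0 for name in conn_counts}
    for _ in range(new_connections):
        target = pick_least_conn(conn_counts)
        conn_counts[target] += 1   # the director only counts, it never verifies
        received[target] += 1
    return received


# Hypothetical state right after two real servers were rebooted: the director
# still tracks about 1000 connections for each of them, although none exist.
counts = {"rs1": 1000, "rs2": 1000, "rs3": 40, "rs4": 35, "rs5": 50}
received = dispatch(counts, 300)
print(received)
```

In this model rs1 and rs2 receive none of the 300 new connections, because their stale counts stay far above those of the three servers that are really working.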

Of course I'm available for further details.

Many thanks in advance for any answer.




_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://lists.graemef.net/mailman/listinfo/lvs-users


jason.faulkner at mailtrust

Mar 31, 2009, 7:03 AM

Post #2 of 8 (1176 views)
Re: [lvs-users] LVS and stale connections [In reply to]

-----Original Message-----
From: "Fabio Coatti" <cova [at] ferrara>
Sent: Tuesday, March 31, 2009 8:08am
To: lvs-users [at] linuxvirtualserver
Subject: [lvs-users] LVS and stale connections

>say, after reboot a couple (out of five) of servers shows with ipvsadm -L more
>than 1000 connections (and all of those are fake, as the Rservers have been
>rebooted) and ipvs doesn't pass any new connection until the count lowers
>enough and only 3 servers are actually working.


We've seen this problem too; the only way I've found to clear that counter is to pull a server from rotation and leave it pulled for 15-30 minutes.

--
Jason Faulkner
Linux Systems Engineer
Mailtrust, a division of Rackspace







jmack at wm7d

Mar 31, 2009, 7:04 AM

Post #3 of 8 (1178 views)
Re: [lvs-users] LVS and stale connections [In reply to]

On Tue, 31 Mar 2009, Fabio Coatti wrote:

> Basically, the scenario is the following: 5 servers (RIP),
> balanced with ipvs on linux 2.6.23, DR is used. All works
> just fine, but if for some reasons the real servers
> freezes and need reboot, ipvs keeps as "alive" all the
> connections so the balancing is not correctly done.

ipvs has no way of knowing that a machine has been rebooted,
so this is the expected behaviour. Health checking is all
external to ipvs. Since in LVS-DR the director doesn't see
the returning packets, it has to guess when connections
expire, and you've set a long timeout.
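
To make the point about timeouts concrete, here is a minimal model of a connection table whose entries can only disappear when their timer runs out. It is a sketch, not how ipvs is implemented: the class and method names are invented, and the 8-hour value is the timeout mentioned earlier in the thread.

```python
class ConnTable:
    """Toy connection table: entries expire only when their timer runs out."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.entries = {}            # (client, server) -> expiry time in seconds

    def packet(self, client, server, now):
        # Every client packet seen by the director refreshes the timer.
        self.entries[(client, server)] = now + self.timeout_s

    def expire(self, now):
        # Drop entries whose timer has run out; nothing else removes them.
        self.entries = {k: t for k, t in self.entries.items() if t > now}

    def active(self, server):
        return sum(1 for (_, s) in self.entries if s == server)


table = ConnTable(timeout_s=8 * 3600)
for i in range(1000):
    table.packet(f"client{i}", "rs1", now=0)

# rs1 is rebooted at t=600s. The director sees no reply traffic in DR mode,
# so from its point of view nothing has changed.
table.expire(now=600)
stale_after_reboot = table.active("rs1")

table.expire(now=8 * 3600)
stale_after_timeout = table.active("rs1")
print(stale_after_reboot, stale_after_timeout)
```

All 1000 entries survive the reboot and only vanish once the full timeout has elapsed.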

Either

o use the -SH scheduler to direct clients to the same
database server and return the timeouts to the default
period.

o use one of the ioctls that Horms wrote to handle crashed
realservers when ipvs is using persistence. These clear out
the connection table
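
For the first option, the idea behind source hashing can be sketched as follows. This is a simplified illustration, not the kernel's actual hash: the function name and the addresses are made up. The point is that the choice depends only on the client address, so a stale connection count cannot keep a server out of rotation.

```python
import ipaddress
import zlib


def pick_by_source(client_ip, servers):
    """Map a client address to a server using only a hash of the address."""
    packed = ipaddress.ip_address(client_ip).packed
    return servers[zlib.crc32(packed) % len(servers)]


servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]

# The same client always lands on the same real server, regardless of how
# many connections the director believes that server has.
first = pick_by_source("192.0.2.17", servers)
again = pick_by_source("192.0.2.17", servers)
print(first == again)
```

The flip side is that a single client opening many connections puts all of them on one server.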

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



malcolm at loadbalancer

Mar 31, 2009, 7:13 AM

Post #4 of 8 (1172 views)
Re: [lvs-users] LVS and stale connections [In reply to]

I've never heard of this being a problem? 1000 doesn't sound like a lot of
stale connections?
Is there some more info on this?

I have customers doing 50K conns/second with no problems.







--
Regards,

Malcolm Turnbull.

Loadbalancer.org Ltd.
Phone: +44 (0)870 443 8779
http://www.loadbalancer.org/


malcolm at loadbalancer

Mar 31, 2009, 7:15 AM

Post #5 of 8 (1178 views)
Re: [lvs-users] LVS and stale connections [In reply to]

Ignore me, I didn't see the first post saying no health checker was being
used.





--
Regards,

Malcolm Turnbull.

Loadbalancer.org Ltd.
Phone: +44 (0)870 443 8779
http://www.loadbalancer.org/


cova at ferrara

Mar 31, 2009, 8:52 AM

Post #6 of 8 (1175 views)
Re: [lvs-users] LVS and stale connections [In reply to]

On Tuesday 31 March 2009 16:04:41, Joseph Mack NA3T wrote:
> On Tue, 31 Mar 2009, Fabio Coatti wrote:
> > Basically, the scenario is the following: 5 servers (RIP),
> > balanced with ipvs on linux 2.6.23, DR is used. All works
> > just fine, but if for some reasons the real servers
> > freezes and need reboot, ipvs keeps as "alive" all the
> > connections so the balancing is not correctly done.
>
> ipvs has no way of knowing that a machine has been rebooted,
> so this is the expected behaviour. Health checking is all
> external to ipvs. Since in LVS-DR the director doesn't see
> the returning packets, it has to guess when connections
> expire, and you've set a long timeout.

Yes, I know this part, and in fact I don't blame lvs for ending up in this
situation; basically, it would be useful to have a quick way to recover (by
hand) when something weird happens and lvs gets confused.

>
> Either
>
> o use the -SH scheduler to direct clients to the same
> database server and return the timeouts to the default
> period.

Hm, interesting, but I fear that this will lead to non-optimal situations when
many connections are started from the same machine.

>
> o use one of the ioctls that Horms wrote to handle crashed
> realservers when ipvs is using persistence. These clear out
> the connection table

This would indeed be a good solution for my issue. Thanks for the
suggestion; with it I finally found some hints. If I read the documentation
correctly,

/proc/sys/net/ipv4/vs/expire_quiescent_template
and
/proc/sys/net/ipv4/vs/expire_nodest_conn

should help to solve the issue.
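
A small helper for flipping these two switches might look like this. It is a sketch under assumptions: the function name is invented, writing under /proc/sys needs root on the director, and the base directory is a parameter only so the helper can be tried against a scratch directory first.

```python
from pathlib import Path

# Directory and file names are the ones quoted above.
IPVS_SYSCTL_DIR = Path("/proc/sys/net/ipv4/vs")
SWITCHES = ("expire_nodest_conn", "expire_quiescent_template")


def set_ipvs_switches(value, base=IPVS_SYSCTL_DIR):
    """Write `value` (0 or 1) to both files; return the values read back."""
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    result = {}
    for name in SWITCHES:
        path = Path(base) / name
        path.write_text(f"{value}\n")
        result[name] = path.read_text().strip()
    return result
```

On the director this would be called as set_ipvs_switches(1). As far as the kernel documentation describes them, expire_nodest_conn concerns connections whose real server has been removed, and expire_quiescent_template concerns persistence templates of servers set to weight 0, so whether they help depends on the setup.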

I'll try asap.

Thanks.






cova at ferrara

Mar 31, 2009, 9:14 AM

Post #7 of 8 (1176 views)
Re: [lvs-users] LVS and stale connections [In reply to]

On Tuesday 31 March 2009 16:15:55, Malcolm Turnbull wrote:
> Ignore me I didn't see the first post saying no health checker was being
> used.

Well, that's not completely true. This behaviour happened when the real servers
became fairly unresponsive due to high load (a sudden load spike): the
machines were still able to establish a TCP connection, but the underlying
service was not able to serve the requests and then close the connections in a
reasonable amount of time.
So the health check was unable to help in this case (and requests were coming
in very quickly).
Some servers were able to close some connections, even under high load, thus
reducing their connection count (and receiving all the new requests).
The rebooted servers were at 0 load but weren't receiving any connections
because of their high connection count on lvs.
Given that no ipvs command was able to restore the right situation, we waited
for the timeouts to expire and gradually the situation recovered. But it took
quite a long time.
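
The failure mode here is a check that stops at the TCP handshake. A check that also insists on an application-level reply within a deadline would have flagged the overloaded servers. The following is only a sketch: the function name, probe bytes and timeout are placeholders, and a real monitor would speak the service's actual protocol.

```python
import socket


def service_answers(host, port, probe=b"PING\n", timeout_s=2.0):
    """Return True only if the service sends back some data in time.

    A successful TCP connect alone is not enough: an overloaded server can
    still complete the handshake while the service behind it is stuck.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(probe)
            return len(sock.recv(1)) > 0
    except OSError:
        # Covers refused connections, timeouts and resets alike.
        return False
```

A monitor could take a real server out of rotation as soon as this returns False, instead of waiting for the connection to be refused outright.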










cova at ferrara

Mar 31, 2009, 9:57 AM

Post #8 of 8 (1177 views)
Re: [lvs-users] LVS and stale connections [In reply to]

>
> > o use one of the ioctls that Horms wrote to handle crashed
> > realservers when ipvs is using persistence. These clear out
> > the connection table
>
> This would be the good solution for my issue, indeed. Thanks for the
> suggestion, with it I finally found some hints. If I read correctly the
> documentation,
>
> /proc/sys/net/ipv4/vs/expire_quiescent_template
> and
> /proc/sys/net/ipv4/vs/expire_nodest_conn
>
> should help to solve the issue..


Hm, looking more closely at the issue, LVS is not using persistent connections
on my setup, so this probably won't help.
In fact, the issue is not related to persistence, but to a timeout not
expiring on open connections (or rather, on connections that lvs thinks are
still open).



