
Mailing List Archive: Linux Virtual Server: Users

[lvs-users] Weird problem after realserver crashes

 

 



janar.kartau at gmail

Apr 10, 2008, 3:18 AM

Post #1 of 4 (428 views)
[lvs-users] Weird problem after realserver crashes

Hi,
We're using LVS-DR to load-balance HTTP/HTTPS requests across three
webservers. Both the active and backup LVS servers run fully updated
CentOS 5.1. The ARP problem is solved with arptables_jf. Until now I have
only restarted realservers during the night, under very little load, and
had no problems at all. But recently one of the realservers crashed during
the day, and when it came back it was automatically added back to the LVS,
but no new requests were sent to it. ipvsadm showed it had a lot of
ActiveConn's and zero InActConn's. These numbers stayed the same for
10 or more minutes, and then ActiveConn started decreasing slowly. Once
its ActiveConn was lower than that of the other realservers, new requests
started to reach the server and InActConn increased from 0. I could
reproduce this later by taking a realserver down myself, and noticed
that the more connections there were during and after the crash, the
larger the static ActiveConn count for the crashed server once it came
back. Neither an LVS restart nor "ipvsadm --zero" helped.
It seems to me that when one of the realservers disappears, LVS doesn't
close the open connections and they just hang there until they time out.
It definitely doesn't seem like an ARP problem.
Oh, and I should mention that I'm using firewall marks and the lc scheduler.

Any help would be appreciated!

Thanks,
Janar Kartau

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users[at]LinuxVirtualServer.org
Send requests to lvs-users-request[at]LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users


jmack at wm7d

Apr 10, 2008, 6:56 AM

Post #2 of 4 (403 views)
Re: [lvs-users] Weird problem after realserver crashes [In reply to]

On Thu, 10 Apr 2008, Janar Kartau wrote:

> But lately one of the realservers crashed during the day
> and when it came back it was automatically added back to
> the LVS and all but no new requests were sent to it.
> Ipvsadm showed it had a lot of ActiveConn's and zero
> InActConn's.

I'm surprised that we haven't heard about this as a problem
before. A realserver crashing must happen often enough that
someone else has already seen this.

> These numbers remained the same for 10 or more minutes and
> then ActiveConn started decreasing slowly.

I thought the timeouts were about 2 minutes. Would changing them
to 2 minutes help (it's one of the options to ipvsadm)?
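
The option referred to here is presumably `ipvsadm --set tcp tcpfin udp`, which takes three timeouts in seconds and leaves a value unchanged when it is given as 0. As a minimal sketch, the helper below (a hypothetical name, not part of ipvsadm) only builds the command line so it can be reviewed before being run as root on the director; the 120-second value is the "2 minutes" suggested above, not a tested recommendation.

```python
def ipvsadm_set_timeouts(tcp, tcpfin=0, udp=0):
    """Build the argv for `ipvsadm --set tcp tcpfin udp` (values in seconds).

    A value of 0 asks ipvsadm to keep the current timeout for that entry.
    """
    values = (tcp, tcpfin, udp)
    if any(v < 0 for v in values):
        raise ValueError("timeouts must be non-negative")
    return ["ipvsadm", "--set"] + [str(v) for v in values]


# Lower only the TCP session timeout to 2 minutes; keep tcpfin and udp as they are.
argv = ipvsadm_set_timeouts(120)
print(" ".join(argv))
```

The current values can be checked on the director with `ipvsadm -L --timeout` before and after the change.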

> Once the ActiveConn was lower than the other realservers
> had, new requests started to reach the server and
> InActConn increased from 0. I could reproduce this later
> when i took a realserver down myself and noticed that the
> more connections there were during the crash and after,
> the bigger static count of ActiveConn's appeared for the
> crashed server once it came back. Neither LVS restart or
> "ipvsadm --zero" helped.

ipvs keeps its state tables so it doesn't mess with any
ESTABLISHED connections.

Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



janar.kartau at gmail

May 8, 2008, 7:13 PM

Post #3 of 4 (298 views)
Re: [lvs-users] Weird problem after realserver crashes [In reply to]

The problem was indeed in the TCP session timeout settings. The timeout
defaults to 900 seconds in CentOS 5. Connections made between the
realserver crash and pulse removing the node from the LVS config would
remain in the ESTABLISHED state, since the director never saw them close.
As a result, these dead connections stayed in the ipvs table for the whole
15 minutes, making the realserver useless until they finally timed out.
I'm surprised nobody had this problem before.

Janar Kartau
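
Why stale entries starve a realserver under the lc scheduler can be illustrated with a toy model. This is a rough sketch, not IPVS code: it assumes lc picks the server with the lowest `active * 256 + inactive` cost, treats every new request as completing at once, and never expires inactive entries. The server names and connection counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    active: int = 0     # ActiveConn as shown by ipvsadm
    inactive: int = 0   # InActConn as shown by ipvsadm

    def overhead(self) -> int:
        # Assumed lc cost: an active connection weighs 256x an inactive one.
        return self.active * 256 + self.inactive


def schedule(servers, requests):
    """Send short-lived requests to the least-loaded server; return per-server counts."""
    hits = {s.name: 0 for s in servers}
    for _ in range(requests):
        target = min(servers, key=RealServer.overhead)
        hits[target.name] += 1
        # The request completes at once, so it ends up counted as inactive.
        target.inactive += 1
    return hits


servers = [
    RealServer("rs1", active=40),
    RealServer("rs2", active=40),
    RealServer("rs3", active=500),  # stale entries left over from the crash
]

before = schedule(servers, 1000)   # while the stale entries are still counted
servers[2].active = 0              # stale entries finally time out
after = schedule(servers, 1000)

print(before)
print(after)
```

In this model rs3 receives nothing while its stale ActiveConn count is higher than the others, and starts receiving requests only once that count drops, which matches the behaviour described in the first post.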





graeme at graemef

May 9, 2008, 1:07 AM

Post #4 of 4 (293 views)
Re: [lvs-users] Weird problem after realserver crashes [In reply to]

On Fri, 2008-05-09 at 05:13 +0300, Janar Kartau wrote:
> The problem was indeed in the TCP session timeout settings. It defaults
> to 900 seconds in CentOS 5. Connections made between the realserver
> crash and pulse removing the node from the LVS config would remain in
> ESTABLISHED state since director never got a CLOSE. As a result, these
> dead connections remained in the ipvs table for whole 15 minutes and
> thus making the realserver useless until the connections finally timed
> out. I'm surprised nobody had this problem before.

This is what the expire_nodest_conn and expire_quiescent_template
sysctls are for. Toggling them to "on" respectively removes the IPVS
table entries when the realserver is removed from the pool, or removes
any persistence templates when weight = 0.

Graeme
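
As a minimal sketch of turning these two sysctls on, the helper below writes `1` to the corresponding files, assuming they live under `/proc/sys/net/ipv4/vs/` on the director (the ip_vs module must be loaded for them to exist). The function names are hypothetical, and the directory is a parameter so the code can be tried against a scratch directory first.

```python
from pathlib import Path

# Assumed location of the IPVS sysctls on the director.
IPVS_SYSCTL_DIR = Path("/proc/sys/net/ipv4/vs")
EXPIRY_SYSCTLS = ("expire_nodest_conn", "expire_quiescent_template")


def enable_expiry(sysctl_dir=IPVS_SYSCTL_DIR):
    """Write 1 to both expiry sysctls (needs root when pointed at /proc)."""
    for name in EXPIRY_SYSCTLS:
        (Path(sysctl_dir) / name).write_text("1\n")


def read_expiry(sysctl_dir=IPVS_SYSCTL_DIR):
    """Return the current value of both expiry sysctls as strings."""
    return {
        name: (Path(sysctl_dir) / name).read_text().strip()
        for name in EXPIRY_SYSCTLS
    }
```

Calling `enable_expiry()` with no argument as root would apply the change until the next reboot. To make it persistent, the equivalent keys `net.ipv4.vs.expire_nodest_conn = 1` and `net.ipv4.vs.expire_quiescent_template = 1` can go in `/etc/sysctl.conf`.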


