Mailing List Archive: Linux Virtual Server: Users

[lvs-users] Problem with ghost connections

 

 



r.laban at ism

Nov 6, 2007, 7:30 AM

[lvs-users] Problem with ghost connections

Hello list,

I'm having some odd issues with our loadbalancers. After a long period of
uninterrupted uptime (I don't have exact numbers), the counters for our
various real/virtual servers seem to get 'stuck'. The loadbalancer is running
SUSE Linux Enterprise Server 9 SP3:
# uname -a
Linux ismlnx-lb06 2.6.5-7.286-smp #1 SMP Thu May 31 10:12:58 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
# ipvsadm -v
ipvsadm v1.24 2003/06/07 (compiled with getopt_long and IPVS v1.2.0)

When I just checked the output of ipvsadm -L -n, I noticed that my
workstation's IP address was listed several times, even though I hadn't
accessed any of our loadbalanced sites in a while. Here are some details on
what's happening. I verified that there was no traffic flowing between my
workstation and the loadbalancer. The expiry timers seem to get reset when they
hit 0, which shouldn't be the case, I'd say. Is this a known
problem/bug/configuration issue? I'm using heartbeat+ldirectord (version
1.2.3) to control our load-balancing configuration.

# while true ; do ipvsadm -L -nc | grep 10.0.0.66 ; echo ; echo === ; echo ; sleep 10 ; done
TCP 00:36 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:50 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:05 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:46 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:05 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80

===

TCP 00:23 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:38 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:52 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:20 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:52 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80

===

TCP 00:10 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:25 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:39 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:20 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:07 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:40 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80

===

TCP 00:57 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:12 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:26 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:07 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:54 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:27 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
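
A minimal sketch (not part of the original post) of how the resets can be spotted mechanically: it compares two consecutive samples from the loop above and reports entries whose expire column went up instead of down. The column layout (protocol, expire, state, source, virtual, destination) is read off the output shown here; the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# An entry is identified by (protocol, source, virtual, destination).
Key = Tuple[str, str, str, str]

# First two samples, copied from the output above.
SAMPLE_1 = """\
TCP 00:36 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:50 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:05 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:46 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:05 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
"""

SAMPLE_2 = """\
TCP 00:23 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:38 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:52 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:20 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:52 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
"""


def parse_expire(text: str) -> int:
    """Convert an expire field such as '00:36' into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_sample(sample: str) -> Dict[Key, int]:
    """Map each connection entry in one listing to its remaining seconds."""
    entries: Dict[Key, int] = {}
    for line in sample.splitlines():
        fields = line.split()
        # Skip blank lines, separators and header rows (no ':' in column 2).
        if len(fields) != 6 or ":" not in fields[1]:
            continue
        pro, expire, _state, source, virtual, dest = fields
        entries[(pro, source, virtual, dest)] = parse_expire(expire)
    return entries


def find_timer_resets(before: Dict[Key, int], after: Dict[Key, int]) -> List[Key]:
    """Entries present in both samples whose timer increased."""
    return sorted(k for k, t in after.items() if k in before and t > before[k])
```

On the first two samples this flags the entries for client ports 1964 and 2443, whose timers went from 00:05 to 00:52.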

Extra info: we do have a FWM #203 configured (pointing to 213.247.48.24 &
213.247.48.25), but not for 213.247.48.203 (and it hasn't been for quite a while).
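
The IP/ERR! rows with port 0 look like persistence templates rather than real connections, and 0.0.0.203 matches firewall mark 203 written as a dotted quad. That reading is an assumption, not something confirmed in this thread. Under that assumption, a small sketch (hypothetical helper name) that pulls the mark out of such a row:

```python
import ipaddress
from typing import Optional


def fwmark_of_template(line: str) -> Optional[int]:
    """Return the firewall mark if the row looks like a fwmark persistence
    template (protocol IP, client port 0, virtual port 0), else None.

    Assumption: the virtual-address column of such a row holds the mark
    as a 32-bit integer printed in dotted-quad form.
    """
    fields = line.split()
    if len(fields) != 6:
        return None
    pro, _expire, _state, source, virtual, _dest = fields
    if pro != "IP" or not source.endswith(":0") or not virtual.endswith(":0"):
        return None
    address = virtual.rsplit(":", 1)[0]
    return int(ipaddress.IPv4Address(address))
```

Fed the row `IP 00:46 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0`, it returns 203; ordinary TCP rows yield None.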

Kind regards,
--
Ruben Laban
Systems and Network Administrator
r.laban [at] ism

ISM eCompany
Van Nelleweg 1
Postbus 13043
3004 HA Rotterdam
+31 (0)10 243 6000 (tel)
+31 (0)10 243 6066 (fax)
www.ism.nl

Quality Solutions - Reliable Partner

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://lists.graemef.net/mailman/listinfo/lvs-users


r.laban at ism

Nov 9, 2007, 5:35 AM

Re: [lvs-users] Problem with ghost connections

Hello list,

No ideas on the issue stated below as of yet?

I'll try to reduce my question to a rather simple one:

What could cause entries in the IPVS table to remain in the established (or
any other) state even though there is no traffic flowing between the
CIP/VIP/RIP?
We're using persistence, by the way, though from what I've read the configured
persistence timeout is only applied once, when a new connection arrives.

Regards,
Ruben

On Tuesday 06 November 2007, Ruben Laban wrote:
> Hello list,
>
> I'm having some odd issues with our loadbalancers. After a longer (don't
> have any exact numbers) period of non-interrupted uptime, the counters for
> our various real/virtual servers seem to get 'stuck'. The loadbalancer is
> running Suse Linux Enterprise Server 9 SP3:
> # uname -a
> Linux ismlnx-lb06 2.6.5-7.286-smp #1 SMP Thu May 31 10:12:58 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
> # ipvsadm -v
> ipvsadm v1.24 2003/06/07 (compiled with getopt_long and IPVS v1.2.0)
>
> When I just checked the output of ipvsadm -L -n, I noticed that my
> workstation's IP address was listed several times, even though I hadn't
> accessed any of our loadbalanced sites in a while. Here are some details on
> what's happening. I verified that there was no traffic flowing between my
> workstation and the loadbalancer. The expiry timers seem to get reset when they
> hit 0, which shouldn't be the case, I'd say. Is this a known
> problem/bug/configuration issue? I'm using heartbeat+ldirectord (version
> 1.2.3) to control our load-balancing configuration.
>
> # while true ; do ipvsadm -L -nc | grep 10.0.0.66 ; echo ; echo === ; echo ; sleep 10 ; done
> TCP 00:36 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
> TCP 00:50 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
> TCP 00:05 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
> IP 00:46 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
> IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
> TCP 00:05 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
>
> ===
>
> TCP 00:23 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
> TCP 00:38 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
> TCP 00:52 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
> IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
> IP 00:20 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
> TCP 00:52 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
>
> ===
>
> TCP 00:10 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
> TCP 00:25 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
> TCP 00:39 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
> IP 00:20 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
> IP 00:07 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
> TCP 00:40 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
>
> ===
>
> TCP 00:57 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
> TCP 00:12 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
> TCP 00:26 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
> IP 00:07 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
> IP 00:54 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
> TCP 00:27 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
>
> Extra info: we do have a FWM #203 configured (pointing to 213.247.48.24 &
> 213.247.48.25), but (since quite a while) not for 213.247.48.203.





jmack at wm7d

Nov 9, 2007, 6:52 AM

Re: [lvs-users] Problem with ghost connections

On Fri, 9 Nov 2007, Ruben Laban wrote:

> Hello list,
>
> No ideas on the issue stated below as of yet?

apparently not (just so you don't think you're being
ignored, sometimes we don't have an answer).

> What could cause entries in the IPVS table to remain in
> the established (or any other) state even though there is
> no traffic flowing between the CIP/VIP/RIP?

neither end has closed the tcp connection?

> We're using persistency by the way. Though from what I've
> read, the given persistency timeout is only used once when
> a new connection arrives.

there are lots of problems with persistency, but it was the
first method we got to work for maintaining
client-realserver affinity. Another less intrusive method is
the -SH scheduler. You might find it causes fewer problems.

Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



r.laban at ism

Nov 9, 2007, 7:09 AM

Re: [lvs-users] Problem with ghost connections

On Friday 09 November 2007, Joseph Mack NA3T wrote:
> On Fri, 9 Nov 2007, Ruben Laban wrote:
> > Hello list,
> >
> > No ideas on the issue stated below as of yet?
>
> apparently not (just so you don't think you're being
> ignored, sometimes we don't have an answer).

I'm aware of that. I'm no fan of 'bumping' threads to get attention in
general, but this one has been bugging me for quite some time now.

> > What could cause entries in the IPVS table to remain in
> > the established (or any other) state even though there is
> > no traffic flowing between the CIP/VIP/RIP?
>
> neither end has closed the tcp connection?

That's one very likely scenario. We're using LVS-DR, so the director only sees
half the traffic. Still, you'd think (or at least I do) that after a given
timeout period the entry would eventually be purged. I just checked the IPVS
table on our backup loadbalancer, and it had ~125000 entries, of which ~6000
are in state ESTABLISHED. The last failover occurred on October 23rd. I'd think
those entries should have expired by now?
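
Figures like these can be obtained by tallying the connection listing by state. A minimal sketch, assuming the six-column ipvsadm -L -nc layout shown earlier in the thread; the sample rows are copied from the first post and count_states is a hypothetical name:

```python
from collections import Counter

# Sample rows copied from the first post in this thread.
SAMPLE = """\
TCP 00:36 FIN_WAIT 10.0.0.66:1929 213.247.48.203:80 127.0.0.1:80
TCP 00:50 FIN_WAIT 10.0.0.66:1991 213.247.48.203:80 127.0.0.1:80
TCP 00:05 FIN_WAIT 10.0.0.66:1964 213.247.48.203:80 127.0.0.1:80
IP 00:46 ERR! 10.0.0.66:0 0.0.0.203:0 213.247.48.25:0
IP 00:33 ERR! 10.0.0.66:0 0.0.0.203:0 127.0.0.1:0
TCP 00:05 ESTABLISHED 10.0.0.66:2443 213.247.48.203:80 213.247.48.25:80
"""


def count_states(listing: str) -> Counter:
    """Count connection entries per state (third column of the listing)."""
    counts: Counter = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Only rows with six columns and a mm:ss expire field are entries;
        # this skips blank lines and header rows.
        if len(fields) == 6 and ":" in fields[1]:
            counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    for state, number in count_states(SAMPLE).most_common():
        print(f"{state:12} {number}")
```

On a director the input would be the full listing rather than this sample. Run periodically, it shows whether the ESTABLISHED count ever shrinks.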

> > We're using persistency by the way. Though from what I've
> > read, the given persistency timeout is only used once when
> > a new connection arrives.
>
> there are lots of problems with persistency, but it was the
> first method we got to work for maintaining
> client-realserver affinity. Another less intrusive method is
> the -SH scheduler. You might find it causes less problems.

I'm not fond of having to enable persistency on our loadbalancers, but the web
application behind them requires it, unfortunately.
I looked into the -SH scheduler a while ago as well, but I couldn't find
enough information on it, especially on its behaviour when one realserver goes
down. Will the hashing scheme cause all connections to be kicked to another
realserver because the hash is redistributed when the number of
realservers changes?
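
To make the concern concrete, here is a sketch of what happens with a naive "hash the client address modulo the number of realservers" scheme. This is a generic illustration, not the ip_vs_sh code, and whether that module behaves this way is not established in this thread. The realserver names, the client range and the use of CRC32 as the hash are all made up for the example.

```python
import zlib
from typing import Dict, List


def pick_modulo(client_ip: str, servers: List[str]) -> str:
    """Naive source hashing: hash the client address modulo the pool size."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def remap_stats(clients: List[str], before: List[str], after: List[str]) -> Dict[str, int]:
    """Count clients that change realserver when the pool changes."""
    stats = {"moved": 0, "was_on_removed": 0, "moved_though_server_alive": 0}
    for ip in clients:
        old, new = pick_modulo(ip, before), pick_modulo(ip, after)
        if old == new:
            continue
        stats["moved"] += 1
        if old in after:
            # The old realserver is still up, yet the client lands elsewhere.
            stats["moved_though_server_alive"] += 1
        else:
            stats["was_on_removed"] += 1
    return stats


# Hypothetical pool: rs3 goes down, rs1 and rs2 stay up.
CLIENTS = [f"10.0.0.{i}" for i in range(1, 255)]
STATS = remap_stats(CLIENTS, ["rs1", "rs2", "rs3"], ["rs1", "rs2"])

if __name__ == "__main__":
    print(STATS)
```

With a uniformly distributed hash, a client keeps its realserver in this scheme only when the hash value is 0 or 1 modulo 6, so roughly two thirds of the clients move, and about half of those move even though their realserver is still up.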

Regards
--
Ruben



jmack at wm7d

Nov 9, 2007, 7:31 AM

Re: [lvs-users] Problem with ghost connections

On Fri, 9 Nov 2007, Ruben Laban wrote:

> I'm aware of that. I'm no fan of 'bumping' threads to get attention in
> general, but this one has been bugging me for quite some time now.

yes it's like a thorn in a dog's paw.

>> neither end has closed the tcp connection?
>
> That's one very likely scenario. We're using LVS-DR so the
> director only sees half the traffic. Except you'd think
> (or at least I do) that after a given timeout period, the
> entry would be purged eventually. I just checked the IPVS
> table on our backup loadbalancer, and it had ~125000
> entries! Of which ~6000 with state ESTABLISHED. The last
> failover occurred on October 23rd. I'd think those entries
> should have expired by now?

I would have thought so too. Are you using our ip_vs module
or the SuSE market enhanced ip_vs? They may be the same, but
if they're different, we don't know what they did to it. I
would expect if this was a regular bug someone else would
have seen it too.

> I'm not fond of having to enable persistency on our
> loadbalancers, but the web application that's behind it,
> requires it, unfortunately. I've looked into the -SH
> scheduler as well a while ago. But I couldn't find enough
> information on it.

there's an example setup from someone who's using it to
satisfy affinity. Presumably you've found that.

> Especially the behaviour when one realserver goes
> down.

undefined for the current connection. Presumably tcpip will
drop the connection eventually. What the application does is
something else. It's whatever happens if you pull the
ethernet cable on that connection.

> Will the hashing scheme cause all connection to be kicked
> to another realserver because of the redistribution of the
> hash due to the number of realservers changing?

the failover scripts will re-run ipvsadm without the failed
realserver

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



r.laban at ism

Nov 9, 2007, 7:38 AM

Re: [lvs-users] Problem with ghost connections

On Friday 09 November 2007, Ruben Laban wrote:
> > > What could cause entries in the IPVS table to remain in
> > > the established (or any other) state even though there is
> > > no traffic flowing between the CIP/VIP/RIP?
> >
> > neither end has closed the tcp connection?
>
> That's one very likely scenario. We're using LVS-DR so the director only
> sees half the traffic. Except you'd think (or at least I do) that after a
> given timeout period, the entry would be purged eventually. I just checked
> the IPVS table on our backup loadbalancer, and it had ~125000 entries! Of
> which ~6000 with state ESTABLISHED. The last failover occurred on October
> 23rd. I'd think those entries should have expired by now?

I just tried to flush those entries by unloading the kernel modules, so I issued
'rmmod ip_vs_wlc ip_vs_wrr ip_vs'. It unloaded the first two modules just
fine, but now the machine is hanging while unloading the "ip_vs" module. Either
this is not supported or there's something wrong with our installation. The
rmmod process is using 99.9% CPU (on an HT-enabled system, so the totals in 'top'
show 50% idle and 50% system interrupts). Is there a better way of flushing
the IPVS table?

Regards,
--
Ruben



jmack at wm7d

Nov 9, 2007, 8:36 AM

Re: [lvs-users] Problem with ghost connections

On Fri, 9 Nov 2007, Ruben Laban wrote:

> Is there a better way of flushing
> the IPVS table?

for persistence there's a flush sysctl which breaks all
connections. It's in the HOWTO

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



r.laban at ism

Nov 12, 2007, 12:05 AM

Re: [lvs-users] Problem with ghost connections

On Friday 09 November 2007, Joseph Mack NA3T wrote:
> On Fri, 9 Nov 2007, Ruben Laban wrote:
> > Is there a better way of flushing
> > the IPVS table?
>
> for persistence there's a flush sysctl which breaks all
> connections. It's in the HOWTO

Then I guess I'm out of luck for now. The sysctls mentioned in the HOWTO
aren't present on my boxes. Let's just hope they make it through the
holidays without too many problems (other than me having to explain to the
various managers that I can't give them reliable connection counts),
after which I can go experiment with newer software.
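
A quick way to see which IPVS sysctls a given kernel exposes. This is a sketch only: the directory /proc/sys/net/ipv4/vs and the two names checked come from current kernel documentation, and whether they are the ones the HOWTO refers to is an assumption. The function name is hypothetical.

```python
import os
from typing import Iterable, List

# Assumed location of the IPVS sysctl entries.
IPVS_SYSCTL_DIR = "/proc/sys/net/ipv4/vs"

# Assumed candidates; the thread does not name the sysctls it means.
WANTED = ["expire_nodest_conn", "expire_quiescent_template"]


def missing_sysctls(wanted: Iterable[str], sysctl_dir: str = IPVS_SYSCTL_DIR) -> List[str]:
    """Return the names from `wanted` that have no entry in `sysctl_dir`.

    If the directory itself is absent (ip_vs not loaded, or no sysctl
    support), every wanted name is reported as missing.
    """
    try:
        present = set(os.listdir(sysctl_dir))
    except OSError:
        present = set()
    return [name for name in wanted if name not in present]


if __name__ == "__main__":
    print("missing:", missing_sysctls(WANTED))
```

An empty list means all the candidates are available on that kernel. Anything listed would have to come from a newer ip_vs build.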

Thanks for the pointers.

Regards,
--
Ruben
