
Mailing List Archive: Linux Virtual Server: Users

[lvs-users] failover and connection threshold

 

 



rumen at voicecho

Oct 25, 2007, 6:57 AM

Post #1 of 7
[lvs-users] failover and connection threshold

Hallo all,
I am experiencing problems with the LVS master and backup sync daemons.
I have 2 directors, each running both the master and the backup sync daemon.
Failover works fine, but the connection threshold does not. For example,
with 2 real servers each accepting 3 connections, the cluster should
accept 6 connections in total. If there are 5 connections and the master
fails, the backup takes over (so far so good), but the new director then
accepts 6 more connections and the cluster ends up with 11. If another
failover occurs soon afterwards, 6 more connections will be accepted, no
matter how many were inherited from the failed director.
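
The arithmetic in this scenario can be reproduced with a toy model. The
sketch below is not IPVS code: the Director class, its u_threshold
parameter and the inherit() method are hypothetical names chosen for
illustration. It assumes that a director refuses a real server once the
connections it has counted for that server reach the upper threshold, and
that connections received from the sync daemon are stored in the
connection table without being counted against any real server.

```python
from dataclasses import dataclass, field


@dataclass
class Director:
    """Toy model of a director; not the IPVS implementation."""
    servers: list                                  # real server names
    u_threshold: int                               # upper threshold per real server
    counted: dict = field(default_factory=dict)    # connections counted per server
    table: list = field(default_factory=list)      # every known connection (like -lnc)

    def __post_init__(self):
        self.counted = {s: 0 for s in self.servers}

    def accept(self):
        """Schedule one new connection, or return None if every server is full."""
        candidates = [s for s in self.servers if self.counted[s] < self.u_threshold]
        if not candidates:
            return None
        server = min(candidates, key=lambda s: self.counted[s])
        self.counted[server] += 1
        self.table.append(server)
        return server

    def inherit(self, connections):
        """Store synced connections without counting them (the reported behaviour)."""
        self.table.extend(connections)


def fill(director):
    """Accept connections until the director refuses; return how many were accepted."""
    accepted = 0
    while director.accept() is not None:
        accepted += 1
    return accepted


master = Director(["rs1", "rs2"], u_threshold=3)
for _ in range(5):
    master.accept()

backup = Director(["rs1", "rs2"], u_threshold=3)
backup.inherit(master.table)        # failover: 5 connections are inherited
new_connections = fill(backup)      # the backup still believes both servers are empty
print(new_connections, len(backup.table))   # prints: 6 11
```

In this model the backup ends up with 11 connections in its table while
its per-server counters never exceed 3, which matches the numbers
described above.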

How can I make the new director know how many connections are inherited?

"ipvsadm -lnc" shows the inherited connections together with the new
ones, while "ipvsadm -l" shows only the connections established by the
current director.

Is it a bug or a feature? Can I make the threshold work correctly with
the failover?


Rumen



_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users[at]LinuxVirtualServer.org
Send requests to lvs-users-request[at]LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users


jmack at wm7d

Oct 25, 2007, 7:22 AM

Post #2 of 7
Re: [lvs-users] failover and connection threshold

On Thu, 25 Oct 2007, Rumen Bogdanovski wrote:

> I am experiencing problems with the LVS master and backup
> daemons. I have 2 directors running both with master and
> backup sync daemons. Failover works fine but the
> connection threshold does not. I mean if I have 2 real
> servers each accepting 3 connections, this means I can
> have 6 connections in total to the cluster, but if I have
> 5 and the master fails, the backup takes over(so far so
> good) but the new director accepts 6 more connections and
> the cluster ends up with 11 connections, if another
> failover occurs soon, 6 more connections will be accepted
> no matter how many were inherited from the failed
> director.

Hmm. I'm not familiar with the connection threshold code.
Also, it doesn't get used a whole lot, so it's possible
that there are unnoticed bugs. However, the code was written
(I think) by Ratz (unless someone has messed with it since),
and Ratz is unlikely to have let code out without making the
obvious test to check for the problem you see.

Hopefully someone else will have a suggestion for you.

> How can I make the new director know how many connections are inherited?
>
> "ipvsadm -lnc" shows the inherited connections together with the new
> ones, while "ipvsadm -l" shows only the connections established by the
> current director.

Has the number of connections (and other state info) been
transferred by the sync daemon to the backup director?

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



rumen at voicecho

Oct 25, 2007, 7:42 AM

Post #3 of 7
Re: [lvs-users] failover and connection threshold

On Thu, 2007-10-25 at 07:22 -0700, Joseph Mack NA3T wrote:
> Has the number of connections (and other state info) been
> transferred by the synch state demon to the backup director?

Well, the connection state works fine: no connection is dropped when
failover occurs. "ipvsadm -lnc" shows the correct state of all
connections, but "ipvsadm -l" says
root@test2:~# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  rumen-desktop.local:5999 wlc
  -> node473.local:5999           Route   1000   0          0
  -> node484.local:5999           Route   1000   0          0

while "ipvsadm -lnc" says
root@test2:~# ipvsadm -lnc
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:56  ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999
TCP 14:59  ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999

new connection created :
root@rumen-desktop:~# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  rumen-desktop.local:5999 wlc
  -> node491.local:5999           Route   999    1          0
  -> node503.local:5999           Route   999    0          0

root@rumen-desktop:~# ipvsadm -lnc
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:59  ESTABLISHED 192.168.0.10:32800 192.168.0.222:5999 192.168.0.52:5999
TCP 14:21  ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999
TCP 14:31  ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999

So the new connection is seen in both places, while the old ones appear
only with "-lnc", and the scheduler seems to use the same connection
counts as "ipvsadm -l".
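
One way to see the mismatch is to tally the "-lnc" entries per destination
and compare the result with the ActiveConn/InActConn columns of
"ipvsadm -l". The sketch below does the tally on text copied from the
listing in this post. The function name is hypothetical, and it assumes
that each connection entry is a single line with six whitespace-separated
fields, the last one being the destination.

```python
from collections import Counter

# Sample text copied from the "ipvsadm -lnc" listing in this post.
SAMPLE_LNC = """\
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:59  ESTABLISHED 192.168.0.10:32800 192.168.0.222:5999 192.168.0.52:5999
TCP 14:21  ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999
TCP 14:31  ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999
"""


def connections_per_destination(lnc_output):
    """Count connection entries per destination in 'ipvsadm -lnc' style text."""
    counts = Counter()
    for line in lnc_output.splitlines():
        fields = line.split()
        # Entry lines have six fields and start with the protocol name;
        # this skips the title line and the column header.
        if len(fields) == 6 and fields[0] in ("TCP", "UDP"):
            counts[fields[5]] += 1
    return counts


per_dest = connections_per_destination(SAMPLE_LNC)
for dest, n in sorted(per_dest.items()):
    print(dest, n)
```

For the sample above this gives 1 connection for 192.168.0.51:5999 and 2
for 192.168.0.52:5999, three in total, whereas "ipvsadm -l" on the same
director reports only the single new connection.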

I could look in the source, but I am not sure how much time it will take
me to figure out how everything works and to fix it. However, I will
try in the next few days when I have some time.

Rumen




rumen at voicecho

Oct 25, 2007, 5:24 PM

Post #4 of 7
Re: [lvs-users] failover and connection threshold

I think I can fix the problem.
Can anybody tell me how to compile only ip_vs.ko, not the whole kernel?
If it works I will post the patch.

Regards
Rumen



jmack at wm7d

Oct 25, 2007, 5:33 PM

Post #5 of 7
Re: [lvs-users] failover and connection threshold

On Fri, 26 Oct 2007, Rumen Bogdanovski wrote:

> I think I can fix the problem,
> Can anybody tell me how to compile only ip_vs.ko, not the whole kernel?
> If it works I will post a the patch.

if the only changes are to the ip_vs code in your kernel
tree

make modules && make modules_install

or (I think) you go to the directory with the ipvs code and
try these

make
make modules && make modules_install

Joe



rumen at voicecho

Oct 26, 2007, 5:29 PM

Post #6 of 7
Re: [lvs-users] failover and connection threshold

Hi all,
I have managed to fix the problem I reported.
Tomorrow I will test it extensively and then post a patch.

To whom should I send the patch so that it can be incorporated into the
mainline source?

Rumen



rumen at voicecho

Oct 27, 2007, 5:00 AM

Post #7 of 7
Re: [lvs-users] failover and connection threshold - fix

Sorry, forgot to send a copy to the lvs-users list so here it is :)

Hallo all,

This patch fixes the problem with node overload on director failover.
Consider this scenario: 2 nodes, each accepting 3 connections at a time,
and 2 directors. If director failover occurs when the nodes are fully
loaded (6 connections to the cluster), the new director will assign
another 6 connections to the cluster, provided the same real servers
exist there.

The problem turned out to be that the inherited connections were not
bound to the real servers (destinations) on the backup director. Therefore
"ipvsadm -l" reports 0 connections:
root@test2:~# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  test2.local:5999 wlc
  -> node473.local:5999           Route   1000   0          0
  -> node484.local:5999           Route   1000   0          0

while "ipvsadm -lnc" is right:
root@test2:~# ipvsadm -lnc
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:56  ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999
TCP 14:59  ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999

So the patch I am sending fixes the problem by binding the received
connections to the appropriate service on the backup director, if it
exists; otherwise the connection is handled the old way. As long as the
master and the backup directors are synchronized in terms of real
services, there will be no problem with server over-committing, since
new connections will not be created on real services that do not exist
on the backup. With this patch the inherited connections show up as
inactive on the backup:

root@test2:~# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  test2.local:5999 wlc
  -> node473.local:5999           Route   1000   0          1
  -> node484.local:5999           Route   1000   0          1

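The effect of binding can be illustrated with a toy model of the
threshold accounting. This is a sketch under stated assumptions, not the
patch itself: the RealServer and BackupDirector classes are hypothetical,
and the model assumes that active and inactive connections both count
against the upper threshold. Inherited connections are counted as
inactive when their real server exists on the backup; otherwise they are
kept uncounted, as before the patch.

```python
class RealServer:
    """Toy per-destination counters; attribute names mirror the ipvsadm columns."""

    def __init__(self, name, u_threshold):
        self.name = name
        self.u_threshold = u_threshold
        self.active = 0      # ActiveConn
        self.inactive = 0    # InActConn

    def is_full(self):
        # Assumption: both counters count against the upper threshold.
        return self.active + self.inactive >= self.u_threshold


class BackupDirector:
    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}
        self.unbound = []    # inherited connections whose real server is unknown here

    def inherit(self, dest_name):
        """Bind a synced connection to its real server if that server exists."""
        server = self.servers.get(dest_name)
        if server is None:
            self.unbound.append(dest_name)   # handled the old way: not counted
        else:
            server.inactive += 1             # shows up as InActConn on the backup

    def accept(self):
        """Schedule a new connection on the least-loaded server that is not full."""
        free = [s for s in self.servers.values() if not s.is_full()]
        if not free:
            return None
        server = min(free, key=lambda s: s.active + s.inactive)
        server.active += 1
        return server.name


backup = BackupDirector([RealServer("rs1", 3), RealServer("rs2", 3)])
for dest in ["rs1", "rs2", "rs1", "rs2", "rs1"]:   # 5 connections from the failed master
    backup.inherit(dest)

accepted = 0
while backup.accept() is not None:
    accepted += 1
print(accepted)   # prints: 1
```

With the 5 inherited connections counted, the model accepts only 1 more
connection, so the cluster stays at 6 in total instead of growing to 11.
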

The patch is based on kernel 2.6.22.10, but it also applies to 2.6.23
(two hunks apply with an offset; it has not been tested on 2.6.23).

The result for 2.6.23.1:
patching file net/ipv4/ipvs/ip_vs_ctl.c
Hunk #2 succeeded at 543 (offset -1 lines).
Hunk #3 succeeded at 580 (offset -1 lines).


Regards,
Rumen Bogdanovski
