
Mailing List Archive: Linux Virtual Server: Users

Problems getting LVS to work

 

 



mark.wadham at areti

Mar 29, 2007, 3:22 AM

Post #1 of 10
Problems getting LVS to work

Hi

I have been trying for a few weeks now to make LVS work, but I think I
am missing something fundamental. I'll try to go over the setup as
briefly as possible..

Please note: IP addresses have been obscured for security reasons. The
real addresses are routable.

We have a load balancer (with the lvs kernel stuff) at 100.1.1.1, with a
second IP address 100.1.1.2.

We have two mail servers at 120.1.1.1 and 120.1.1.2. The load balancer
is supposed to balance connections between the two mailservers. We have
another load balancer at 130.1.1.1 which works fine, but the new load
balancer is set up seemingly the same and yet it just does not work.

Load balancer configuration (100.1.1.2)
===========================
net.ipv4.conf.all.forwarding = 1

eth0 has 100.1.1.1, eth0:0 has 100.1.1.2

# ipvsadm --list -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 100.1.1.2:25 wlc
-> 120.1.1.1:25 Tunnel 1 0 0
-> 120.1.1.2:25 Tunnel 1 0 0

iptables has no rules and is default-to-accept. There is no firewall in
front of the box.
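
For anyone reproducing this setup, a table like the one listed above is typically built with commands along the following lines. This is only a sketch: it prints the commands instead of running them, the helper name lvs_tun_commands is made up for illustration, and the addresses are the obscured ones from this post.

```shell
# Hypothetical helper: print (not run) the ipvsadm commands for an
# LVS-Tun virtual service. Arguments: VIP:port, scheduler, then real servers.
lvs_tun_commands() {
    vip=$1
    sched=$2
    shift 2
    # -A adds the virtual service, -t marks it as TCP, -s picks the scheduler.
    echo "ipvsadm -A -t $vip -s $sched"
    for rs in "$@"; do
        # -a adds a real server, -i selects IP-in-IP tunnelling, -w sets the weight.
        echo "ipvsadm -a -t $vip -r $rs -i -w 1"
    done
}

lvs_tun_commands 100.1.1.2:25 wlc 120.1.1.1:25 120.1.1.2:25
```

Piping the printed lines into a root shell would apply them; reviewing them first is the point of printing.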

Mail server 1 (120.1.1.1)
=================

relevant iptables rules:

$IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
$IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT

Mail daemon listening on all IPs:

# netstat -natp |grep TEN |grep 25
tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN 14505/exim4

tunl0:0 is the tunnel interface for the existing load balancer (that works)

# ifconfig tunl0:0
tunl0:0 Link encap:IPIP Tunnel HWaddr
inet addr:130.1.1.2 Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1

tunl0:3 is the tunnel interface for the new load balancer that doesn't work

# ifconfig tunl0:3
tunl0:3 Link encap:IPIP Tunnel HWaddr
inet addr:100.1.1.2 Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1

Mail server 2 (120.1.1.2)
=================
Same as mailserver 1


The current load balancer at 130.1.1.1 uses 130.1.1.2 for load balancing
inbound smtp/25 connections to the two mailservers. If I telnet to
130.1.1.2 from my work machine at 140.1.1.1, this is the tcpdump sequence:

load 1

10:28:34.904382 IP 140.1.1.1.3948 > 130.1.1.2.25: S 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
10:28:34.909107 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
10:28:55.134362 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484

mail 2 eth0 (this could be mail 1 here, in the example the connection
was passed to mail 2)

10:28:34.583491 IP 130.1.1.2.25 > 140.1.1.1.3948: S 151923592:151923592(0) ack 3712043867 win 5840 <mss 1460,nop,nop,sackOK>
10:28:54.608731 IP 130.1.1.2.25 > 140.1.1.1.3948: P 1:52(51) ack 1 win 5840

mail 2 tunl0

10:28:34.583459 IP 140.1.1.1.3948 > 130.1.1.2.25: S 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
10:28:34.588206 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
10:28:54.813191 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484


This tcpdump shows a full tcp connection via the working load balancer
to one of the mail servers, in this case mail 2. The SMTP servers are
configured to pause for 20 seconds before showing their banner, which
accounts for the delay between the packets.


Now, if I try the same thing but telnet to 100.1.1.2:25 (the new load
balancer), the connection times out. tcpdumps show:

load balancer
=========
11:01:48.231327 IP 140.1.1.1.4042 > 100.1.1.2.25: S 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
11:01:51.195252 IP 140.1.1.1.4042 > 100.1.1.2.25: S 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
11:01:57.230423 IP 140.1.1.1.4042 > 100.1.1.2.25: S 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>

Both of the mail servers show no traffic whatsoever on eth0 or the
tunnel interface.
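
A note on what to capture on the mail servers: with tunnel forwarding, the forwarded packets should arrive on eth0 as IP-in-IP (IP protocol 4, the ipencap of the iptables rule above) addressed to the mail server's own IP, so a capture filtered on the VIP or on port 25 may not match them on eth0. A minimal sketch of a filter for that case follows; the helper name ipip_filter is an assumption for illustration.

```shell
# Hypothetical helper: build a pcap filter for IP-in-IP packets
# (IP protocol 4) addressed to one real server.
ipip_filter() {
    echo "ip proto 4 and dst host $1"
}

# Example (needs root on the mail server, so it is left commented out):
# tcpdump -n -i eth0 "$(ipip_filter 120.1.1.1)"
ipip_filter 120.1.1.1
```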

On the load balancer, /proc/sys/net/ipv4/vs/debug_level is set to 9 and
the following messages were observed in syslog:

Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP 140.1.1.1:4042->100.1.1.2:25 not hit
Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP 100.1.1.2:25 hit
Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25 activeconns 0 refcnt 1 weight 1 overhead 0
Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042 v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182 conn->refcnt:1 dest->refcnt:2
Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042 v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2
Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...] 120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2
Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP 140.1.1.1:4042->100.1.1.2:25 hit
Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP 140.1.1.1:4042->100.1.1.2:25 hit
Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039 v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182 conn->refcnt:1 dest->refcnt:2



I really am at a loss as to why this doesn't work. The debug log seems
to show IPVS passing traffic to mail 1 (120.1.1.1), yet the tcpdump
for that server shows absolutely nothing. If anyone can point me in the
right direction here I would be very grateful.

Thanks,

--
Mark Wadham
e: mark.wadham [at] areti t: +44 (0)20 8315 5800 f: +44 (0)20 8315 5801
Areti Internet Ltd., http://www.areti.net/

===================================================================
Areti Internet Ltd: BS EN ISO 9001:2000
Providing corporate Internet solutions for more than 10 years.
===================================================================

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://www.in-addr.de/mailman/listinfo/lvs-users


ratz at drugphish

Mar 29, 2007, 6:28 AM

Post #2 of 10
Re: Problems getting LVS to work

Hi Mark,

Excellent problem report!

> We have a load balancer (with the lvs kernel stuff) at 100.1.1.1, with a
> second IP address 100.1.1.2.
>
> We have two mail servers at 120.1.1.1 and 120.1.1.2. The load balancer
> is supposed to balance connections between the two mailservers. We have
> another load balancer at 130.1.1.1 which works fine, but the new load
> balancer is set up seemingly the same and yet it just does not work.
>
> Load balancer configuration (100.1.1.2)
> ===========================
> net.ipv4.conf.all.forwarding = 1
>
> eth0 has 100.1.1.1, eth0:0 has 100.1.1.2

And their netmasks are 24, resp. 32?

> # ipvsadm --list -n
> IP Virtual Server version 1.2.1 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> TCP 100.1.1.2:25 wlc
> -> 120.1.1.1:25 Tunnel 1 0 0
> -> 120.1.1.2:25 Tunnel 1 0 0
>
> iptables has no rules and is default-to-accept. There is no firewall in
> front of the box.
>
> Mail server 1 (120.1.1.1)
> =================
>
> relevant iptables rules:
>
> $IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
> $IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT

Why do you need those rules if you don't have any netfilter rules and
have an ACCEPT policy?

> Mail daemon listening on all IPs:
>
> # netstat -natp |grep TEN |grep 25
> tcp 0 0 0.0.0.0:25 0.0.0.0:*
> LISTEN 14505/exim4

Excellent.

> tunl0:0 is the tunnel interface for the existing load balancer (that works)
>
> # ifconfig tunl0:0
> tunl0:0 Link encap:IPIP Tunnel HWaddr
> inet addr:130.1.1.2 Mask:255.255.255.255
> UP RUNNING NOARP MTU:1480 Metric:1
>
> tunl0:3 is the tunnel interface for the new load balancer that doesn't work
>
> # ifconfig tunl0:3
> tunl0:3 Link encap:IPIP Tunnel HWaddr
> inet addr:100.1.1.2 Mask:255.255.255.255
> UP RUNNING NOARP MTU:1480 Metric:1
>
> Mail server 2 (120.1.1.2)
> =================
> Same as mailserver 1
>
>
> The current load balancer at 130.1.1.1 uses 130.1.1.2 for load balancing
> inbound smtp/25 connections to the two mailservers. If i telnet to
> 130.1.1.2 from my work machine at 140.1.1.1, this is the tcpdump sequence:

I'm a bit confused by your obfuscation technique :), what's the
designation for the servers regarding the obfuscated IP ranges in
100.x.x.x, the 120.x.x.x, the 130.x.x.x and the 140.x.x.x?

140: your test machine
130: working LVS tunnel
120: RS (mail server)
100: new (non-functional) LVS tunnel

Is my observation correct?

> load 1
>
> 10:28:34.904382 IP 140.1.1.1.3948 > 130.1.1.2.25: S
> 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
> 10:28:34.909107 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
> 10:28:55.134362 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
>
> mail 2 eth0 (this could be mail 1 here, in the example the connection
> was passed to mail 2)
>
> 10:28:34.583491 IP 130.1.1.2.25 > 140.1.1.1.3948: S
> 151923592:151923592(0) ack 3712043867 win 5840 <mss 1460,nop,nop,sackOK>
> 10:28:54.608731 IP 130.1.1.2.25 > 140.1.1.1.3948: P 1:52(51) ack 1 win 5840
>
> mail 2 tunl0
>
> 10:28:34.583459 IP 140.1.1.1.3948 > 130.1.1.2.25: S
> 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
> 10:28:34.588206 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
> 10:28:54.813191 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
>
>
> This tcpdump shows a full tcp connection via the working load balancer
> to one of the mail servers, in this case mail 2. The SMTP servers are
> configured to pause for 20 seconds before showing their banner, which
> accounts for the delay between the packets.

So this works perfectly, as shown above, which actually indicates that
you have at one point got LVS to work. Sidenote: Your LVS seems to be a
bit out of sync regarding time; otherwise your trace looks odd.
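
The clock offset in that sidenote can be read straight off the quoted trace: the load balancer logs the first SYN at 10:28:34.904382, yet mail 2 sees the same SYN on tunl0 at 10:28:34.583459, i.e. before it was supposedly forwarded. A small sketch of the arithmetic follows; the helper name skew_us is illustrative, and it assumes both timestamps fall on the same day.

```shell
# Hypothetical helper: difference between two tcpdump timestamps
# (HH:MM:SS.uuuuuu), first minus second, in microseconds.
skew_us() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "[:.]"); split(b, y, "[:.]")
        ta = ((x[1] * 60 + x[2]) * 60 + x[3]) * 1000000 + x[4]
        tb = ((y[1] * 60 + y[2]) * 60 + y[3]) * 1000000 + y[4]
        printf "%d\n", ta - tb
    }'
}

# Director timestamp first, real-server timestamp second.
skew_us 10:28:34.904382 10:28:34.583459   # prints 320923
```

So the mail server's clock is at least about 0.32 s behind the load balancer's; the real offset is slightly larger, since the packet also needed some time to travel.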

> Now, if I try the same thing but telnet to 100.1.1.2:25 (the new load
> balancer), the connection times out. tcpdumps show:

Care to show the whole ipvsadm -L -n output? Or is the one above
representative enough to display the problem?

> load balancer
> =========
> 11:01:48.231327 IP 140.1.1.1.4042 > 100.1.1.2.25: S
> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
> 11:01:51.195252 IP 140.1.1.1.4042 > 100.1.1.2.25: S
> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
> 11:01:57.230423 IP 140.1.1.1.4042 > 100.1.1.2.25: S
> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>

Indicates a routing or network configuration issue.

> Both of the mail servers show no traffic whatsoever on eth0 or the
> tunnel interface.

Looks like the scheduler is not invoked or the packet does not match the
configuration.

> On the load balancer, /proc/sys/net/ipv4/vs/debug_level is set to 9 and
> the follow messages were observed in syslog:

Excellent:

> Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP
> 140.1.1.1:4042->100.1.1.2:25 not hit
> Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP
> 100.1.1.2:25 hit

Now this is very very weird. The normal TCP service lookup did not
succeed, although it should have, but the FWM TCP service lookup did.
Are you sure that:

a) You have cleanly shut down IPVS (rmmod ip_vs if necessary) between
the functional and the non-functional test runs?
b) You have no iptables or iproute2 rules indicating firewall marks?
c) You have no port 0 service set up?

> Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
> Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25 activeconns
> 0 refcnt 1 weight 1 overhead 0
> Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042
> v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182 conn->refcnt:1
> dest->refcnt:2
> Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042
> v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2

This looks like it would happily send it.

> Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...]
> 120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2

Ok, we do the state transition indicating that we've allocated the
connection structure for the hash table entry.

> Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP
> 140.1.1.1:4042->100.1.1.2:25 hit

Second SYN as seen in your non-functional tcpdump trace.

> Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP
> 140.1.1.1:4042->100.1.1.2:25 hit

Third SYN as seen in your non-functional tcpdump trace.

> Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039
> v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182 conn->refcnt:1
> dest->refcnt:2

This does not belong to the trace above, since it's port 4039, which must
have been a test performed before you took the trace. Most likely this
one ran into the normal 60 sec timeout.
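
Stray entries like this are easier to spot when the debug log is filtered down to a single client endpoint. The following is a rough sketch of such a filter; the helper name ipvs_lines_for_client is made up, and it assumes each syslog message sits on one line (unlike the wrapped quotes in this mail).

```shell
# Hypothetical helper: keep only the IPVS debug lines that mention one
# client endpoint (given as ip:port). Reads syslog lines on stdin.
ipvs_lines_for_client() {
    awk -v c="$1" '
        index($0, "IPVS:") == 0 { next }
        # Fixed-string matches, so the dots in the address are literal.
        index($0, "c:" c " ") || index($0, " " c "->") || index($0, "->" c " ")
    '
}
```

Fed with the kernel log and the argument 140.1.1.1:4042, it would keep the Bind-dest, Schedule, TCP input and lookup/in lines of the traced connection and drop the Unbind-dest line for port 4039.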

> I really am at a loss as to why this doesn't work, the debug log seems
> to show IPVS passing traffic to mail 1 (120.1.1.1) however the tcpdump
> for that server shows absolutely nothing. If anyone can point me in the
> right direction here I would be very grateful.

Can you show your routing information on your LVS? As well as the tun*
device configuration in the proc-fs?

Best regards,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc


mark.wadham at areti

Mar 29, 2007, 6:51 AM

Post #3 of 10
Re: Problems getting LVS to work

Hi Roberto,

Thanks for the quick response!

Roberto Nibali wrote:
> Hi Mark,
>
> Excellent problem report!
>
*takes a bow*
>> We have a load balancer (with the lvs kernel stuff) at 100.1.1.1,
>> with a second IP address 100.1.1.2.
>>
>> We have two mail servers at 120.1.1.1 and 120.1.1.2. The load
>> balancer is supposed to balance connections between the two
>> mailservers. We have another load balancer at 130.1.1.1 which works
>> fine, but the new load balancer is set up seemingly the same and yet
>> it just does not work.
>>
>> Load balancer configuration (100.1.1.2)
>> ===========================
>> net.ipv4.conf.all.forwarding = 1
>>
>> eth0 has 100.1.1.1, eth0:0 has 100.1.1.2
>
> And their netmasks are 24, resp. 32?
Yes
>
>> # ipvsadm --list -n
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>> TCP 100.1.1.2:25 wlc
>> -> 120.1.1.1:25 Tunnel 1 0 0
>> -> 120.1.1.2:25 Tunnel 1 0 0
>>
>> iptables has no rules and is default-to-accept. There is no firewall
>> in front of the box.
>>
>> Mail server 1 (120.1.1.1)
>> =================
>>
>> relevant iptables rules:
>>
>> $IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
>> $IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT
>
> Why do you need those rules if you're not having any netfilter rules
> and a ACCEPT policy?
>
The mailservers _do_ have firewall rules, it's just the new load balancer
that does not. However, I don't think this is a firewall issue as
dropped packets still show up in tcpdump, and also I am able to telnet
directly to port 25 on both mailservers from the new (broken) load balancer.
>> Mail daemon listening on all IPs:
>>
>> # netstat -natp |grep TEN |grep 25
>> tcp 0 0 0.0.0.0:25 0.0.0.0:*
>> LISTEN 14505/exim4
>
> Excellent.
>
>> tunl0:0 is the tunnel interface for the existing load balancer (that
>> works)
>>
>> # ifconfig tunl0:0
>> tunl0:0 Link encap:IPIP Tunnel HWaddr
>> inet addr:130.1.1.2 Mask:255.255.255.255
>> UP RUNNING NOARP MTU:1480 Metric:1
>>
>> tunl0:3 is the tunnel interface for the new load balancer that
>> doesn't work
>>
>> # ifconfig tunl0:3
>> tunl0:3 Link encap:IPIP Tunnel HWaddr
>> inet addr:100.1.1.2 Mask:255.255.255.255
>> UP RUNNING NOARP MTU:1480 Metric:1
>>
>> Mail server 2 (120.1.1.2)
>> =================
>> Same as mailserver 1
>>
>>
>> The current load balancer at 130.1.1.1 uses 130.1.1.2 for load
>> balancing inbound smtp/25 connections to the two mailservers. If i
>> telnet to 130.1.1.2 from my work machine at 140.1.1.1, this is the
>> tcpdump sequence:
>
> I'm a bit confused by your obfuscation technique :), what's the
> designation for the servers regarding the obfuscated IP ranges in
> 100.x.x.x, the 120.x.x.x, the 130.x.x.x and the 140.x.x.x?
>
> 140: your test machine
> 130: working LVS tunnel
> 120: RS (mail server)
> 100: new (non-functional) LVS tunnel
>
> Is my observation correct?
>
Yes, sorry for the obfuscation - I was all for just pasting the real IPs
but my manager refused to let me ;)
>> load 1
>>
>> 10:28:34.904382 IP 140.1.1.1.3948 > 130.1.1.2.25: S
>> 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
>> 10:28:34.909107 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win
>> 65535
>> 10:28:55.134362 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
>>
>> mail 2 eth0 (this could be mail 1 here, in the example the
>> connection was passed to mail 2)
>>
>> 10:28:34.583491 IP 130.1.1.2.25 > 140.1.1.1.3948: S
>> 151923592:151923592(0) ack 3712043867 win 5840 <mss 1460,nop,nop,sackOK>
>> 10:28:54.608731 IP 130.1.1.2.25 > 140.1.1.1.3948: P 1:52(51) ack 1
>> win 5840
>>
>> mail 2 tunl0
>>
>> 10:28:34.583459 IP 140.1.1.1.3948 > 130.1.1.2.25: S
>> 3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
>> 10:28:34.588206 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win
>> 65535
>> 10:28:54.813191 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
>>
>>
>> This tcpdump shows a full tcp connection via the working load
>> balancer to one of the mail servers, in this case mail 2. The SMTP
>> servers are configured to pause for 20 seconds before showing their
>> banner, which accounts for the delay between the packets.
>
> So this works perfectly, as shown above, which actually indicates that
> you have at one point got LVS to work. Sidenote: Your LVS seems to be
> a bit out of sync regarding time; otherwise your trace looks odd.
>
Yes, it was actually someone else who got it working before, and he is
far too busy to assist me with the new one :)
>> Now, if I try the same thing but telnet to 100.1.1.2:25 (the new load
>> balancer), the connection times out. tcpdumps show:
>
> Care to show the whole ipvsadm -L -n output? Or is the one above
> representative enough to display the problem?
>
Didn't I paste this above? --list is the same as -L I believe, at least
the output is no different..
>> load balancer
>> =========
>> 11:01:48.231327 IP 140.1.1.1.4042 > 100.1.1.2.25: S
>> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
>> 11:01:51.195252 IP 140.1.1.1.4042 > 100.1.1.2.25: S
>> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
>> 11:01:57.230423 IP 140.1.1.1.4042 > 100.1.1.2.25: S
>> 1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
>
> Indicates a routing or network configuration issue.
>
>> Both of the mail servers show no traffic whatsoever on eth0 or the
>> tunnel interface.
>
> Looks like the scheduler is not invoked or the packet does not match
> the configuration.
>
>> On the load balancer, /proc/sys/net/ipv4/vs/debug_level is set to 9
>> and the follow messages were observed in syslog:
>
> Excellent:
>
>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP
>> 140.1.1.1:4042->100.1.1.2:25 not hit
>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP
>> 100.1.1.2:25 hit
>
> Now this is very very weird. The normal TCP service lookup did not
> succeed, although it should have, but the FWM TCP service lookup did.
> Are you sure that:
>
> a) You have cleanly shutdown (rmmod ip_vs if necessary) IPVS between
> the functional and the non-functional test conduct?
ipvs is compiled statically into the kernel, so how would I shut it
down? I had no idea it was necessary to shut it down and bring it back
up, although I have rebooted the server a couple of times which I am
sure would accomplish the same effect.
> b) You have no iptables or iproute2 rules indicating firewall marks?

# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

# iproute2
bash: iproute2: command not found

I built this server myself and never did anything with iproute2.. so
I'm guessing the answer is no. Although I do believe Debian is evil and
so I guess it could have possibly done this itself behind my back.

> c) You have no port 0 service set up?
Definitely not
>
>> Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
>> Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25
>> activeconns 0 refcnt 1 weight 1 overhead 0
>> Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042
>> v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182
>> conn->refcnt:1 dest->refcnt:2
>> Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042
>> v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2
>
> This looks like it would happily send it.
>
>> Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...]
>> 120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2
>
> Ok, we do the state transition indicating that we've allocated the
> connection structure for the hash table entry.
>
>> Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP
>> 140.1.1.1:4042->100.1.1.2:25 hit
>
> Second SYN as seen in your non-functional tcpdump trace.
>
>> Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP
>> 140.1.1.1:4042->100.1.1.2:25 hit
>
> Third SYN as seen in your non-functional tcpdump trace.
>
>> Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039
>> v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182
>> conn->refcnt:1 dest->refcnt:2
>
> This is not belonging to the trace above since it's port 4039 which
> must have been a test performed before you took the trace. Most likely
> this one ran into the normal 60 sec timeout.
>
>> I really am at a loss as to why this doesn't work, the debug log
>> seems to show IPVS passing traffic to mail 1 (120.1.1.1) however the
>> tcpdump for that server shows absolutely nothing. If anyone can
>> point me in the right direction here I would be very grateful.
>
> Can you show your routing information on your LVS? As well as the tun*
> device configuration in the proc-fs?
>
Sure, by LVS I'm going to assume you mean the broken load balancer.

# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
100.1.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
0.0.0.0 100.1.1.254 0.0.0.0 UG 0 0 0 eth0

# find /proc |grep tun
#

This is odd, tunl0 does exist:

# ifconfig tunl0
tunl0 Link encap:IPIP Tunnel HWaddr
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)


Don't know why it's absent from /proc.


Thanks again for your assistance,

Mark

--
Mark Wadham
e: mark.wadham [at] areti t: +44 (0)20 8315 5800 f: +44 (0)20 8315 5801
Areti Internet Ltd., http://www.areti.net/

===================================================================
Areti Internet Ltd: BS EN ISO 9001:2000
Providing corporate Internet solutions for more than 10 years.
===================================================================



ratz at drugphish

Mar 29, 2007, 7:37 AM

Post #4 of 10
Re: Problems getting LVS to work

Hi Mark,

>> Excellent problem report!
>>
> *takes a bow*


I hereby dub thee once ... I dub thee twice ... I dub thee Sir LVS Bug
Reporter, you may rise and go forth. Will you accept from Us this honor,
and will you swear fealty to this, Our order of LVS?

>>> # ipvsadm --list -n
>>> IP Virtual Server version 1.2.1 (size=4096)
>>> Prot LocalAddress:Port Scheduler Flags
>>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>>> TCP 100.1.1.2:25 wlc
>>> -> 120.1.1.1:25 Tunnel 1 0 0
>>> -> 120.1.1.2:25 Tunnel 1 0 0
>>>
>>> iptables has no rules and is default-to-accept. There is no firewall
>>> in front of the box.
>>>
>>> Mail server 1 (120.1.1.1)
>>> =================
>>>
>>> relevant iptables rules:
>>>
>>> $IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
>>> $IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT
>>
>> Why do you need those rules if you're not having any netfilter rules
>> and a ACCEPT policy?
>>
> The mailservers _do_ have firewall rules, its just the new load balancer
> that does not. However, I don't think this is a firewall issue as
> dropped packets still show up in tcpdump, and also I am able to telnet
> directly to port 25 on both mailservers from the new (broken) load
> balancer.

Not necessarily, but hopefully this is not what is hitting you. Depending
on the kernel, netfilter's PREROUTING handling could drop the skb
before tcpdump would get a skb->clone() of it.

>> I'm a bit confused by your obfuscation technique :), what's the
>> designation for the servers regarding the obfuscated IP ranges in
>> 100.x.x.x, the 120.x.x.x, the 130.x.x.x and the 140.x.x.x?
>>
>> 140: your test machine
>> 130: working LVS tunnel
>> 120: RS (mail server)
>> 100: new (non-functional) LVS tunnel
>>
>> Is my observation correct?
>>
> Yes, sorry for the obfuscation - I was all for just pasting the real IPs
> but my manager refused to let me ;)

That's very noble of him.

>> So this works perfectly, as shown above, which actually indicates that
>> you have at one point got LVS to work. Sidenote: Your LVS seems to be
>> a bit out of sync regarding time; otherwise your trace looks odd.
>>
> Yes, it was actually someone else who got it working before, and he is
> far too busy to assist me with the new one :)

This is the part where your manager should probably call him back :).

>>> Now, if I try the same thing but telnet to 100.1.1.2:25 (the new load
>>> balancer), the connection times out. tcpdumps show:
>>
>> Care to show the whole ipvsadm -L -n output? Or is the one above
>> representative enough to display the problem?
>>
> Didn't I paste this above? --list is the same as -L I believe, at least
> the output is no different..

Sure, but there was no indication of which stage of your tests the
quoted output pertained to. When you say "the new load balancer"
above, you do not mean a physically different machine from the "old load
balancer", do you?

>>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP
>>> 140.1.1.1:4042->100.1.1.2:25 not hit
>>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP
>>> 100.1.1.2:25 hit
>>
>> Now this is very very weird. The normal TCP service lookup did not
>> succeed, although it should have, but the FWM TCP service lookup did.
>> Are you sure that:
>>
>> a) You have cleanly shutdown (rmmod ip_vs if necessary) IPVS between
>> the functional and the non-functional test conduct?
> ipvs is compiled statically into the kernel, so how would I shut it
> down? I had no idea it was necessary to shut it down and bring it back
> up, although I have rebooted the server a couple of times which I am
> sure would accomplish the same effect.

Absolutely. The point is that the template entries are not flushed when
you simply remove the destination servers from the kernel, only detached.

>> b) You have no iptables or iproute2 rules indicating firewall marks?
>
> # iptables --list
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination

That's not all :). You've only shown the filter table, but I'm also
interested in the mangle table.
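
One way to check the mangle table for firewall marks is to scan the iptables-save output for MARK targets. A minimal sketch, assuming the usual iptables-save rule syntax; the helper name mark_rules is hypothetical.

```shell
# Hypothetical helper: read `iptables-save -t mangle` output on stdin and
# print every rule that jumps to the MARK target (i.e. sets a firewall mark).
mark_rules() {
    grep -E -- '-j MARK( |$)'
}

# Usage (needs root, so it is left commented out):
# iptables-save -t mangle | mark_rules
```

No output means no rule in the table sets a mark.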

> # iproute2
> bash: iproute2: command not found

It's the ip command output from the iproute2 framework I was looking for.

This is the successor to ifconfig and route and netstat and whatnot. The
Linux world decided at one point in its history (around 1999) that
ifconfig/route/other networking setup tools are not appropriate anymore
and replaced them with the iproute2 framework. Unfortunately the guy who
started all this is a bloody genius and as such did two things: a)
completely forgot to document it, b) never told anyone outside the
kernel community about this, for years. So, if you find time, invoke
"man ip" on a recent enough Linux distribution of your choice.

> I built this server myself and never did anything with iproute2.. so
> I'm guessing the answer is no. Although I do believe Debian is evil and
> so I guess it could have possibly done this itself behind my back.

Debian people hopefully do not have evil intentions; however, could you
pass along the output of:

ip rule show
ip route show
ip link show
ip addr show
grep -r . /proc/sys/net/ipv4/conf/*
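
To gather all of the above in one go, something like the following sketch may help. It runs exactly the commands listed, prefixes each with a header, and keeps going when one of them fails; the function name lvs_report is made up for illustration.

```shell
# Hypothetical helper: run each diagnostic command and label its output.
lvs_report() {
    for cmd in "ip rule show" "ip route show" "ip link show" "ip addr show" \
               "grep -r . /proc/sys/net/ipv4/conf/*"; do
        echo "## $cmd"
        # Keep errors in the report instead of aborting on the first failure.
        sh -c "$cmd" 2>&1 || echo "(exit status $?)"
    done
}

# Usage: lvs_report > lvs-report.txt
```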

>> c) You have no port 0 service set up?
> Definitely not

I see. Not! :)

>>> Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
>>> Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25
>>> activeconns 0 refcnt 1 weight 1 overhead 0
>>> Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042
>>> v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182
>>> conn->refcnt:1 dest->refcnt:2
>>> Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042
>>> v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2
>>
>> This looks like it would happily send it.
>>
>>> Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...]
>>> 120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2
>>
>> Ok, we do the state transition indicating that we've allocated the
>> connection structure for the hash table entry.
>>
>>> Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP
>>> 140.1.1.1:4042->100.1.1.2:25 hit
>>
>> Second SYN as seen in your non-functional tcpdump trace.
>>
>>> Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP
>>> 140.1.1.1:4042->100.1.1.2:25 hit
>>
>> Third SYN as seen in your non-functional tcpdump trace.
>>
>>> Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039
>>> v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182
>>> conn->refcnt:1 dest->refcnt:2
>>
>> This is not belonging to the trace above since it's port 4039 which
>> must have been a test performed before you took the trace. Most likely
>> this one ran into the normal 60 sec timeout.
>>
>>> I really am at a loss as to why this doesn't work, the debug log
>>> seems to show IPVS passing traffic to mail 1 (120.1.1.1) however the
>>> tcpdump for that server shows absolutely nothing. If anyone can
>>> point me in the right direction here I would be very grateful.
>>
>> Can you show your routing information on your LVS? As well as the tun*
>> device configuration in the proc-fs?
>>
> Sure, by LVS i'm going to assume you mean the broken load balancer.
>
> # route -n
> Kernel IP routing table
> Destination Gateway Genmask Flags Metric Ref Use
> Iface
> 100.1.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
> 0.0.0.0 100.1.1.254 0.0.0.0 UG 0 0 0
> eth0

Could you please send me the iproute2 related output, as indicated
above? route -n does not show all the routing entries on a Linux box.

> # find /proc |grep tun

Sidenote: You might not want to run that command too often on your
production server. I've seen a nasty kernel OOPS more than once after such
a stat()-intensive command.

> This is odd, tunl0 does exist:
>
> # ifconfig tunl0
> tunl0 Link encap:IPIP Tunnel HWaddr
> NOARP MTU:1480 Metric:1
> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

Sure, but it's not activated. Could you run the following command on
your box?

ip link set dev tunl0 up

> Don't know why its absent from /proc.

Since neither IFF_RUNNING nor IFF_UP is set, there's no point in
creating any entries for this virtual device in the proc-fs.
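
A rough way to check the flag without reading the output by eye, assuming
the classic "N: dev: <FLAGS> ..." layout of "ip link show" as pasted in
this thread (newer iproute2 may print names like tunl0@NONE, which this
sketch does not handle). The helper name link_is_up is hypothetical.

```shell
# Hypothetical helper: succeed if device $1 carries the UP flag in
# `ip link show` output read from stdin; fail if it is down or not listed.
link_is_up() {
    awk -v dev="$1" '
        $2 == (dev ":") {
            found = 1
            flags = $3
            gsub(/[<>]/, "", flags)          # "<NOARP,UP>" -> "NOARP,UP"
            n = split(flags, f, ",")
            for (i = 1; i <= n; i++)
                if (f[i] == "UP") up = 1     # exact match only
        }
        END { exit ((found && up) ? 0 : 1) }
    '
}

# Possible use on a live box:
#   ip link show | link_is_up tunl0 || ip link set dev tunl0 up
```

Fed the "ip link show" listing from this thread, it succeeds for eth0 and
fails for tunl0, which only carries NOARP.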

> Thanks again for your assistance,

Always when receiving such nice bug reports,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://www.in-addr.de/mailman/listinfo/lvs-users


sebvieira at gmail

Mar 29, 2007, 7:42 AM

Post #5 of 10 (721 views)
Permalink
Re: Problems getting LVS to work [In reply to]

On 3/29/07, Roberto Nibali <ratz [at] drugphish> wrote:
>
>
> I here by dub thee once ... I dub thee twice ... I dub thee Sir LVS Bug
> Reporter, you may rise and go forth. Will you accept from Us this honor,
> and will you swear fealty to this, Our order of LVS?



Without knowing what floats on water? How easily one gets knighted nowadays
...


S.


ratz at drugphish

Mar 29, 2007, 7:56 AM

Post #6 of 10 (725 views)
Permalink
Re: Problems getting LVS to work [In reply to]

>> I here by dub thee once ... I dub thee twice ... I dub thee Sir LVS Bug
>> Reporter, you may rise and go forth. Will you accept from Us this honor,
>> and will you swear fealty to this, Our order of LVS?
>
> Without knowing what floats on water? How easily one gets knighted nowadays

Darn, even wrongly so: s/Bug Reporter/Problem Reporter/

Cheers,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc


mark.wadham at areti

Mar 29, 2007, 7:57 AM

Post #7 of 10 (737 views)
Permalink
Re: Problems getting LVS to work [In reply to]

Roberto Nibali wrote:
> Hi Mark,
>
>>> Excellent problem report!
>>>
>> *takes a bow*
>
>
> I here by dub thee once ... I dub thee twice ... I dub thee Sir LVS
> Bug Reporter, you may rise and go forth. Will you accept from Us this
> honor,
> and will you swear fealty to this, Our order of LVS?
>
Yes yes, thanks :D
>>>> # ipvsadm --list -n
>>>> IP Virtual Server version 1.2.1 (size=4096)
>>>> Prot LocalAddress:Port Scheduler Flags
>>>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>>>> TCP 100.1.1.2:25 wlc
>>>> -> 120.1.1.1:25 Tunnel 1 0 0
>>>> -> 120.1.1.2:25 Tunnel 1 0 0
>>>>
>>>> iptables has no rules and is default-to-accept. There is no
>>>> firewall in front of the box.
>>>>
>>>> Mail server 1 (120.1.1.1)
>>>> =================
>>>>
>>>> relevant iptables rules:
>>>>
>>>> $IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
>>>> $IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT
>>>
>>> Why do you need those rules if you're not having any netfilter rules
>>> and a ACCEPT policy?
>>>
>> The mailservers _do_ have firewall rules, its just the new load
>> balancer that does not. However, I don't think this is a firewall
>> issue as dropped packets still show up in tcpdump, and also I am able
>> to telnet directly to port 25 on both mailservers from the new
>> (broken) load balancer.
>
> Not necessarily but this is hopefully not hitting you. Depending on
> the kernel, netfilter in the PREROUTING table handling could drop the
> skb before tcpdump would get a skb->clone() of it.
>
>>> I'm a bit confused by your obfuscation technique :), what's the
>>> designation for the servers regarding the obfuscated IP ranges in
>>> 100.x.x.x, the 120.x.x.x, the 130.x.x.x and the 140.x.x.x?
>>>
>>> 140: your test machine
>>> 130: working LVS tunnel
>>> 120: RS (mail server)
>>> 100: new (non-functional) LVS tunnel
>>>
>>> Is my observation correct?
>>>
>> Yes, sorry for the obfuscation - I was all for just pasting the real
>> IPs but my manager refused to let me ;)
>
> That's very noble of him.
>
>>> So this works perfectly, as shown above, which actually indicates
>>> that you have at one point got LVS to work. Sidenote: Your LVS seems
>>> to be a bit out of sync regarding time; otherwise your trace looks odd.
>>>
>> Yes, it was actually someone else who got it working before, and he
>> is far too busy to assist me with the new one :)
>
> This is the part where your manager should probably call him back :).
>
It was actually the manager himself who set up the first one :)
>>>> Now, if I try the same thing but telnet to 100.1.1.2:25 (the new
>>>> load balancer), the connection times out. tcpdumps show:
>>>
>>> Care to show the whole ipvsadm -L -n output? Or is the one above
>>> representative enough to display the problem?
>>>
>> Didn't I paste this above? --list is the same as -L I believe, at
>> least the output is no different..
>
> Sure, but there was no indication to which state of your test conducts
> your quoted output pertained to. When you say "the new load balancer"
> above, you do not mean a physically different machine to the "old load
> balancer", do you?
>
There are two load balancers, the 'old' one which works and the 'new'
one which doesn't. Here is the ipvsadm output for the new, broken load
balancer:

# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 100.1.1.2:25 wlc
-> 120.1.1.1:25 Tunnel 1 0 0
-> 120.1.1.2:25 Tunnel 1 0 0

>>>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP
>>>> 140.1.1.1:4042->100.1.1.2:25 not hit
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP
>>>> 100.1.1.2:25 hit
>>>
>>> Now this is very very weird. The normal TCP service lookup did not
>>> succeed, although it should have, but the FWM TCP service lookup
>>> did. Are you sure that:
>>>
>>> a) You have cleanly shutdown (rmmod ip_vs if necessary) IPVS between
>>> the functional and the non-functional test conduct?
>> ipvs is compiled statically into the kernel, so how would I shut it
>> down? I had no idea it was necessary to shut it down and bring it
>> back up, although I have rebooted the server a couple of times which
>> I am sure would accomplish the same effect.
>
> Absolutely. The point is that the template entries are not flushed
> when you simply remove the destination servers from the kernel, only
> detached.
>
>>> b) You have no iptables or iproute2 rules indicating firewall marks?
>>
>> # iptables --list
>> Chain INPUT (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain FORWARD (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain OUTPUT (policy ACCEPT)
>> target prot opt source destination
>
> That's not all :). You've only shown the filter table, but I'm also
> interested in the mangle table.
>
# iptables -t mangle --list
Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination

# iptables -t nat --list
Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

>> # iproute2
>> bash: iproute2: command not found
>
> It's the ip command output from the iproute2 framework I was looking for.
>
> This is the successor to ifconfig and route and netstat and whatnot.
> The Linux world decided at one point in its history (around 1999) that
> ifconfig/route/other networking setup tools are not appropriate
> anymore and replaced them with the iproute2 framework. Unfortunately
> the guy who started all this is a bloody genius and as such did two
> things: a) completely forgot to document it, b) never told anyone
> outside the kernel community about this, for years. So, if you find
> time, invoke "man ip" on a recent enough Linux distribution of your
> choice.
>
LOL
>> I built this server myself and never did anything with iproute2.. so
>> I'm guessing the answer is no. Although I do believe Debian is evil
>> and so I guess it could have possibly done this itself behind my back.
>
> Debian people hopefully do not have evil intentions, however could
> pass along the output of:
>
> ip rule show
> ip route show
> ip link show
> ip addr show
> grep -r . /proc/sys/net/ipv4/conf/*
>
# ip rule show
0: from all lookup 255
32766: from all lookup main
32767: from all lookup default
# ip route show
100.1.1.0/24 dev eth0 proto kernel scope link src 100.1.1.1
default via 85.158.56.1 dev eth0
# ip link show
1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
5: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
6: shaper0: <> mtu 1500 qdisc noop qlen 10
link/ether
7: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
link/ether b6:e6:25:ed:c6:2d brd ff:ff:ff:ff:ff:ff
8: eql: <MASTER> mtu 576 qdisc noop qlen 5
link/slip
9: teql0: <NOARP> mtu 1500 qdisc noop qlen 100
link/void
10: tunl0: <NOARP> mtu 1480 qdisc noop
link/ipip 0.0.0.0 brd 0.0.0.0
11: gre0: <NOARP> mtu 1476 qdisc noop
link/gre 0.0.0.0 brd 0.0.0.0
# ip addr show
1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
inet 100.1.1.1/24 brd 100.1.1.255 scope global eth0
inet 100.1.1.2/24 brd 100.1.1.255 scope global secondary eth0:0
5: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
6: shaper0: <> mtu 1500 qdisc noop qlen 10
link/ether
7: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
link/ether b6:e6:25:ed:c6:2d brd ff:ff:ff:ff:ff:ff
8: eql: <MASTER> mtu 576 qdisc noop qlen 5
link/slip
9: teql0: <NOARP> mtu 1500 qdisc noop qlen 100
link/void
10: tunl0: <NOARP> mtu 1480 qdisc noop
link/ipip 0.0.0.0 brd 0.0.0.0
11: gre0: <NOARP> mtu 1476 qdisc noop
link/gre 0.0.0.0 brd 0.0.0.0
# grep -r . /proc/sys/net/ipv4/conf/*
/proc/sys/net/ipv4/conf/all/promote_secondaries:0
/proc/sys/net/ipv4/conf/all/force_igmp_version:0
/proc/sys/net/ipv4/conf/all/disable_policy:0
/proc/sys/net/ipv4/conf/all/disable_xfrm:0
/proc/sys/net/ipv4/conf/all/arp_accept:0
/proc/sys/net/ipv4/conf/all/arp_ignore:0
/proc/sys/net/ipv4/conf/all/arp_announce:0
/proc/sys/net/ipv4/conf/all/arp_filter:0
/proc/sys/net/ipv4/conf/all/tag:0
/proc/sys/net/ipv4/conf/all/log_martians:0
/proc/sys/net/ipv4/conf/all/bootp_relay:0
/proc/sys/net/ipv4/conf/all/medium_id:0
/proc/sys/net/ipv4/conf/all/proxy_arp:0
/proc/sys/net/ipv4/conf/all/accept_source_route:0
/proc/sys/net/ipv4/conf/all/send_redirects:1
/proc/sys/net/ipv4/conf/all/rp_filter:0
/proc/sys/net/ipv4/conf/all/shared_media:1
/proc/sys/net/ipv4/conf/all/secure_redirects:1
/proc/sys/net/ipv4/conf/all/accept_redirects:0
/proc/sys/net/ipv4/conf/all/mc_forwarding:0
/proc/sys/net/ipv4/conf/all/forwarding:1
/proc/sys/net/ipv4/conf/default/promote_secondaries:0
/proc/sys/net/ipv4/conf/default/force_igmp_version:0
/proc/sys/net/ipv4/conf/default/disable_policy:0
/proc/sys/net/ipv4/conf/default/disable_xfrm:0
/proc/sys/net/ipv4/conf/default/arp_accept:0
/proc/sys/net/ipv4/conf/default/arp_ignore:0
/proc/sys/net/ipv4/conf/default/arp_announce:0
/proc/sys/net/ipv4/conf/default/arp_filter:0
/proc/sys/net/ipv4/conf/default/tag:0
/proc/sys/net/ipv4/conf/default/log_martians:0
/proc/sys/net/ipv4/conf/default/bootp_relay:0
/proc/sys/net/ipv4/conf/default/medium_id:0
/proc/sys/net/ipv4/conf/default/proxy_arp:0
/proc/sys/net/ipv4/conf/default/accept_source_route:1
/proc/sys/net/ipv4/conf/default/send_redirects:1
/proc/sys/net/ipv4/conf/default/rp_filter:0
/proc/sys/net/ipv4/conf/default/shared_media:1
/proc/sys/net/ipv4/conf/default/secure_redirects:1
/proc/sys/net/ipv4/conf/default/accept_redirects:1
/proc/sys/net/ipv4/conf/default/mc_forwarding:0
/proc/sys/net/ipv4/conf/default/forwarding:1
/proc/sys/net/ipv4/conf/eth0/promote_secondaries:0
/proc/sys/net/ipv4/conf/eth0/force_igmp_version:0
/proc/sys/net/ipv4/conf/eth0/disable_policy:0
/proc/sys/net/ipv4/conf/eth0/disable_xfrm:0
/proc/sys/net/ipv4/conf/eth0/arp_accept:0
/proc/sys/net/ipv4/conf/eth0/arp_ignore:0
/proc/sys/net/ipv4/conf/eth0/arp_announce:0
/proc/sys/net/ipv4/conf/eth0/arp_filter:0
/proc/sys/net/ipv4/conf/eth0/tag:0
/proc/sys/net/ipv4/conf/eth0/log_martians:0
/proc/sys/net/ipv4/conf/eth0/bootp_relay:0
/proc/sys/net/ipv4/conf/eth0/medium_id:0
/proc/sys/net/ipv4/conf/eth0/proxy_arp:0
/proc/sys/net/ipv4/conf/eth0/accept_source_route:1
/proc/sys/net/ipv4/conf/eth0/send_redirects:1
/proc/sys/net/ipv4/conf/eth0/rp_filter:0
/proc/sys/net/ipv4/conf/eth0/shared_media:1
/proc/sys/net/ipv4/conf/eth0/secure_redirects:1
/proc/sys/net/ipv4/conf/eth0/accept_redirects:1
/proc/sys/net/ipv4/conf/eth0/mc_forwarding:0
/proc/sys/net/ipv4/conf/eth0/forwarding:1
/proc/sys/net/ipv4/conf/lo/promote_secondaries:0
/proc/sys/net/ipv4/conf/lo/force_igmp_version:0
/proc/sys/net/ipv4/conf/lo/disable_policy:1
/proc/sys/net/ipv4/conf/lo/disable_xfrm:1
/proc/sys/net/ipv4/conf/lo/arp_accept:0
/proc/sys/net/ipv4/conf/lo/arp_ignore:0
/proc/sys/net/ipv4/conf/lo/arp_announce:0
/proc/sys/net/ipv4/conf/lo/arp_filter:0
/proc/sys/net/ipv4/conf/lo/tag:0
/proc/sys/net/ipv4/conf/lo/log_martians:0
/proc/sys/net/ipv4/conf/lo/bootp_relay:0
/proc/sys/net/ipv4/conf/lo/medium_id:0
/proc/sys/net/ipv4/conf/lo/proxy_arp:0
/proc/sys/net/ipv4/conf/lo/accept_source_route:1
/proc/sys/net/ipv4/conf/lo/send_redirects:1
/proc/sys/net/ipv4/conf/lo/rp_filter:0
/proc/sys/net/ipv4/conf/lo/shared_media:1
/proc/sys/net/ipv4/conf/lo/secure_redirects:1
/proc/sys/net/ipv4/conf/lo/accept_redirects:1
/proc/sys/net/ipv4/conf/lo/mc_forwarding:0
/proc/sys/net/ipv4/conf/lo/forwarding:1

>>> c) You have no port 0 service set up?
>> Definitely not
>
> I see. Not! :)
>
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25
>>>> activeconns 0 refcnt 1 weight 1 overhead 0
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042
>>>> v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182
>>>> conn->refcnt:1 dest->refcnt:2
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042
>>>> v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2
>>>
>>> This looks like it would happily send it.
>>>
>>>> Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...]
>>>> 120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2
>>>
>>> Ok, we do the state transition indicating that we've allocated the
>>> connection structure for the hash table entry.
>>>
>>>> Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP
>>>> 140.1.1.1:4042->100.1.1.2:25 hit
>>>
>>> Second SYN as seen in your non-functional tcpdump trace.
>>>
>>>> Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP
>>>> 140.1.1.1:4042->100.1.1.2:25 hit
>>>
>>> Third SYN as seen in your non-functional tcpdump trace.
>>>
>>>> Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039
>>>> v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182
>>>> conn->refcnt:1 dest->refcnt:2
>>>
>>> This is not belonging to the trace above since it's port 4039 which
>>> must have been a test performed before you took the trace. Most
>>> likely this one ran into the normal 60 sec timeout.
>>>
>>>> I really am at a loss as to why this doesn't work, the debug log
>>>> seems to show IPVS passing traffic to mail 1 (120.1.1.1) however
>>>> the tcpdump for that server shows absolutely nothing. If anyone
>>>> can point me in the right direction here I would be very grateful.
>>>
>>> Can you show your routing information on your LVS? As well as the
>>> tun* device configuration in the proc-fs?
>>>
>> Sure, by LVS i'm going to assume you mean the broken load balancer.
>>
>> # route -n
>> Kernel IP routing table
>> Destination Gateway Genmask Flags Metric Ref Use Iface
>> 100.1.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
>> 0.0.0.0 100.1.1.254 0.0.0.0 UG 0 0 0 eth0
>
> Could you please send me the iproute2 related output, as indicated
> above? route -n does not show all the routing entries on a Linux box.
>
By this point you should have already skimmed the output ;)
>> # find /proc |grep tun
>
> Sidenote: You might not call that command like that too often on your
> productive server. I've seen nasty kernel OOPS more than once after
> such a stat()-intensive command.
>
Exxxxxcellent.
>> This is odd, tunl0 does exist:
>>
>> # ifconfig tunl0
>> tunl0 Link encap:IPIP Tunnel HWaddr
>> NOARP MTU:1480 Metric:1
>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:0
>> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>
> Sure, but it's not activated. Could you by any chance call following
> command on your box?
>
> ip link set dev tunl0 up
>
Mhmm, this has been done. However, I notice that on the working load
balancer the tunl0 device is not visible in ifconfig output (i.e. it is
not activated). Excuse me while I stay with my vintage ip-command
friends for a little while longer :)
>> Don't know why its absent from /proc.
>
> Since there are no IFF_RUNNING|IFF_UP flags set, there's no point in
> setting any entries for this virtual device in the proc-fs.
>
>> Thanks again for your assistance,
>
> Always when receiving such nice bug reports,
> Roberto Nibali, ratz

Kind Regards,

--
Mark Wadham
e: mark.wadham [at] areti t: +44 (0)20 8315 5800 f: +44 (0)20 8315 5801
Areti Internet Ltd., http://www.areti.net/

===================================================================
Areti Internet Ltd: BS EN ISO 9001:2000
Providing corporate Internet solutions for more than 10 years.
===================================================================



ratz at drugphish

Mar 29, 2007, 8:16 AM

Post #8 of 10 (735 views)
Permalink
Re: Problems getting LVS to work [In reply to]

>>> Yes, it was actually someone else who got it working before, and he
>>> is far too busy to assist me with the new one :)
>>
>> This is the part where your manager should probably call him back :).
>>
> It was actually the manager himself who set up the first one :)

Very well, so have you searched the LVS mailing list archive for his
name? :)

>> Sure, but there was no indication to which state of your test conducts
>> your quoted output pertained to. When you say "the new load balancer"
>> above, you do not mean a physically different machine to the "old load
>> balancer", do you?
>>
> There are two load balancers, the 'old' one which works and the 'new'
> one which doesn't. Here is the ipvsadm output for the new, broken load
> balancer:
>
> # ipvsadm -L -n
> IP Virtual Server version 1.2.1 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> TCP 100.1.1.2:25 wlc
> -> 120.1.1.1:25 Tunnel 1 0 0
> -> 120.1.1.2:25 Tunnel 1 0 0

Ok.

>> That's not all :). You've only shown the filter table, but I'm also
>> interested in the mangle table.
>>
> # iptables -t mangle --list
> Chain PREROUTING (policy ACCEPT)
> target prot opt source destination
>
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain POSTROUTING (policy ACCEPT)
> target prot opt source destination

Thanks.

>>> # iproute2
>>> bash: iproute2: command not found
>>
>> It's the ip command output from the iproute2 framework I was looking for.
>>
>> This is the successor to ifconfig and route and netstat and whatnot.
>> The Linux world decided at one point in its history (around 1999) that
>> ifconfig/route/other networking setup tools are not appropriate
>> anymore and replaced them with the iproute2 framework. Unfortunately
>> the guy who started all this is a bloody genius and as such did two
>> things: a) completely forgot to document it, b) never told anyone
>> outside the kernel community about this, for years. So, if you find
>> time, invoke "man ip" on a recent enough Linux distribution of your
>> choice.
>>
> LOL

It's actually seriously tragic :).

>>> I built this server myself and never did anything with iproute2.. so
>>> I'm guessing the answer is no. Although I do believe Debian is evil
>>> and so I guess it could have possibly done this itself behind my back.
>>
>> Debian people hopefully do not have evil intentions, however could
>> pass along the output of:
>>
>> ip rule show
>> ip route show
>> ip link show
>> ip addr show
>> grep -r . /proc/sys/net/ipv4/conf/*
>>
> # ip rule show
> 0: from all lookup 255
> 32766: from all lookup main
> 32767: from all lookup default
> # ip route show
> 100.1.1.0/24 dev eth0 proto kernel scope link src 100.1.1.1
> default via 85.158.56.1 dev eth0

Gotcha: Fortunately your manager is too busy to find this. How does it
look on the working load balancer?

> # ip link show
> 1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
> link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> 2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
> link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
> 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
> link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
> 4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
> link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
> 5: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 6: shaper0: <> mtu 1500 qdisc noop qlen 10
> link/ether
> 7: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
> link/ether b6:e6:25:ed:c6:2d brd ff:ff:ff:ff:ff:ff
> 8: eql: <MASTER> mtu 576 qdisc noop qlen 5
> link/slip
> 9: teql0: <NOARP> mtu 1500 qdisc noop qlen 100
> link/void
> 10: tunl0: <NOARP> mtu 1480 qdisc noop
> link/ipip 0.0.0.0 brd 0.0.0.0

It might be hard for the LB to send packets along this device, when it's
not up.

> 11: gre0: <NOARP> mtu 1476 qdisc noop
> link/gre 0.0.0.0 brd 0.0.0.0
> # ip addr show
> 1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
> link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> 2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
> link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
> 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
> link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
> 4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
> link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
> inet 100.1.1.1/24 brd 100.1.1.255 scope global eth0
> inet 100.1.1.2/24 brd 100.1.1.255 scope global secondary eth0:0

Should be /32.
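
A small sketch for spotting this, on the assumption that the remark is
aimed at the virtual IP on eth0:0 (100.1.1.2). The helper name
vip_not_host_prefix is invented; it reads "ip addr show" output on stdin
and reports the given address whenever its prefix length is not 32.

```shell
# Hypothetical helper: read `ip addr show` output on stdin and print any
# entry for address $1 whose prefix length is not 32, with its label.
vip_not_host_prefix() {
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, a, "/")                # "100.1.1.2/24" -> addr, prefix
            if (a[1] == vip && a[2] != "32")
                print $2, "on", $NF
        }
    '
}

# Possible use on a live box:
#   ip addr show | vip_not_host_prefix 100.1.1.2
```

If it prints anything, the address could presumably be re-added with
something like "ip addr add 100.1.1.2/32 dev eth0 label eth0:0" after
deleting the /24 entry; treat that as an untested assumption for this
particular setup.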

> /proc/sys/net/ipv4/conf/eth0/promote_secondaries:0
> /proc/sys/net/ipv4/conf/eth0/force_igmp_version:0
> /proc/sys/net/ipv4/conf/eth0/disable_policy:0
> /proc/sys/net/ipv4/conf/eth0/disable_xfrm:0
> /proc/sys/net/ipv4/conf/eth0/arp_accept:0
> /proc/sys/net/ipv4/conf/eth0/arp_ignore:0
> /proc/sys/net/ipv4/conf/eth0/arp_announce:0
> /proc/sys/net/ipv4/conf/eth0/arp_filter:0
> /proc/sys/net/ipv4/conf/eth0/tag:0
> /proc/sys/net/ipv4/conf/eth0/log_martians:0
> /proc/sys/net/ipv4/conf/eth0/bootp_relay:0
> /proc/sys/net/ipv4/conf/eth0/medium_id:0
> /proc/sys/net/ipv4/conf/eth0/proxy_arp:0
> /proc/sys/net/ipv4/conf/eth0/accept_source_route:1
> /proc/sys/net/ipv4/conf/eth0/send_redirects:1
> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
> /proc/sys/net/ipv4/conf/eth0/shared_media:1
> /proc/sys/net/ipv4/conf/eth0/secure_redirects:1
> /proc/sys/net/ipv4/conf/eth0/accept_redirects:1
> /proc/sys/net/ipv4/conf/eth0/mc_forwarding:0
> /proc/sys/net/ipv4/conf/eth0/forwarding:1

This looks sane.

> /proc/sys/net/ipv4/conf/lo/promote_secondaries:0
> /proc/sys/net/ipv4/conf/lo/force_igmp_version:0
> /proc/sys/net/ipv4/conf/lo/disable_policy:1
> /proc/sys/net/ipv4/conf/lo/disable_xfrm:1
> /proc/sys/net/ipv4/conf/lo/arp_accept:0
> /proc/sys/net/ipv4/conf/lo/arp_ignore:0
> /proc/sys/net/ipv4/conf/lo/arp_announce:0
> /proc/sys/net/ipv4/conf/lo/arp_filter:0
> /proc/sys/net/ipv4/conf/lo/tag:0
> /proc/sys/net/ipv4/conf/lo/log_martians:0
> /proc/sys/net/ipv4/conf/lo/bootp_relay:0
> /proc/sys/net/ipv4/conf/lo/medium_id:0
> /proc/sys/net/ipv4/conf/lo/proxy_arp:0
> /proc/sys/net/ipv4/conf/lo/accept_source_route:1
> /proc/sys/net/ipv4/conf/lo/send_redirects:1
> /proc/sys/net/ipv4/conf/lo/rp_filter:0
> /proc/sys/net/ipv4/conf/lo/shared_media:1
> /proc/sys/net/ipv4/conf/lo/secure_redirects:1
> /proc/sys/net/ipv4/conf/lo/accept_redirects:1
> /proc/sys/net/ipv4/conf/lo/mc_forwarding:0
> /proc/sys/net/ipv4/conf/lo/forwarding:1

This as well.

>>> This is odd, tunl0 does exist:
>>>
>>> # ifconfig tunl0
>>> tunl0 Link encap:IPIP Tunnel HWaddr
>>> NOARP MTU:1480 Metric:1
>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>>
>> Sure, but it's not activated. Could you by any chance call following
>> command on your box?
>>
>> ip link set dev tunl0 up
>>
> Mhmm this has been done, however I notice that on the working load
> balancer, the tunl0 device is not visible in ifconfig output (i.e. is
> not activated). Excuse me while I stay with my vintage ip-command
> friends for a little while longer :)

Your funeral :). Seriously though, this is puzzling. Unless I'm badly
mistaken, tunl0 has to be up for traffic to go through it, no?
Unfortunately, I haven't set up an LVS_TUN in 8 years :).

Could you send the ip link show output from the working LB?

Best regards,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc


mark.wadham at areti

Mar 29, 2007, 8:30 AM

Post #9 of 10 (735 views)
Permalink
Re: Problems getting LVS to work [In reply to]

Roberto Nibali wrote:
>>>> Yes, it was actually someone else who got it working before, and he
>>>> is far too busy to assist me with the new one :)
>>>
>>> This is the part where your manager should probably call him back :).
>>>
>> It was actually the manager himself who set up the first one :)
>
> Very well, so have you searched the LVS mailing list archive for his
> name? :)
>
No, but he did tell me earlier he hasn't really posted on here much.
He's some kind of alien that can just make things work with no effort.
>>> Sure, but there was no indication to which state of your test
>>> conducts your quoted output pertained to. When you say "the new load
>>> balancer" above, you do not mean a physically different machine to
>>> the "old load balancer", do you?
>>>
>> There are two load balancers, the 'old' one which works and the 'new'
>> one which doesn't. Here is the ipvsadm output for the new, broken
>> load balancer:
>>
>> # ipvsadm -L -n
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>> TCP 100.1.1.2:25 wlc
>> -> 120.1.1.1:25 Tunnel 1 0 0
>> -> 120.1.1.2:25 Tunnel 1 0 0
>
> Ok.
>
>>> That's not all :). You've only shown the filter table, but I'm also
>>> interested in the mangle table.
>>>
>> # iptables -t mangle --list
>> Chain PREROUTING (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain INPUT (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain FORWARD (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain OUTPUT (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain POSTROUTING (policy ACCEPT)
>> target prot opt source destination
>
> Thanks.
>
>>>> # iproute2
>>>> bash: iproute2: command not found
>>>
>>> It's the ip command output from the iproute2 framework I was looking
>>> for.
>>>
>>> This is the successor to ifconfig and route and netstat and whatnot.
>>> The Linux world decided at one point in its history (around 1999)
>>> that ifconfig/route/other networking setup tools are not appropriate
>>> anymore and replaced them with the iproute2 framework. Unfortunately
>>> the guy who started all this is a bloody genius and as such did two
>>> things: a) completely forgot to document it, b) never told anyone
>>> outside the kernel community about this, for years. So, if you find
>>> time, invoke "man ip" on a recent enough Linux distribution of your
>>> choice.
>>>
>> LOL
>
> It's actually seriously tragic :).
>
>>>> I built this server myself and never did anything with iproute2..
>>>> so I'm guessing the answer is no. Although I do believe Debian is
>>>> evil and so I guess it could have possibly done this itself behind
>>>> my back.
>>>
>>> Debian people hopefully do not have evil intentions, however could
>>> pass along the output of:
>>>
>>> ip rule show
>>> ip route show
>>> ip link show
>>> ip addr show
>>> grep -r . /proc/sys/net/ipv4/conf/*
>>>
>> # ip rule show
>> 0: from all lookup 255
>> 32766: from all lookup main
>> 32767: from all lookup default
>> # ip route show
>> 100.1.1.0/24 dev eth0 proto kernel scope link src 100.1.1.1
>> default via 85.158.56.1 dev eth0
>
> Gotcha: Fortunately your manager is too busy to find this. How does it
> look on the working load balancer?
>
# ip rule show
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
# ip route show
130.1.1.0/24 dev eth0 proto kernel scope link src 130.1.1.1
10.10.10.0/24 dev eth1 proto kernel scope link src 10.10.10.10
default via 130.1.1.254 dev eth0

>> # ip link show
>> 1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
>> link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
>> 2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
>> link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
>> 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
>> link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
>> 4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast
>> qlen 1000
>> link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
>> 5: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> 6: shaper0: <> mtu 1500 qdisc noop qlen 10
>> link/ether
>> 7: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
>> link/ether b6:e6:25:ed:c6:2d brd ff:ff:ff:ff:ff:ff
>> 8: eql: <MASTER> mtu 576 qdisc noop qlen 5
>> link/slip
>> 9: teql0: <NOARP> mtu 1500 qdisc noop qlen 100
>> link/void
>> 10: tunl0: <NOARP> mtu 1480 qdisc noop
>> link/ipip 0.0.0.0 brd 0.0.0.0
>
> It might be hard for the LB to send packets along this device, when
> it's not up.
>
Okay, well it's up now on my new load balancer, but definitely _not_ up
on the old load balancer which is working.
>> 11: gre0: <NOARP> mtu 1476 qdisc noop
>> link/gre 0.0.0.0 brd 0.0.0.0
>> # ip addr show
>> 1: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
>> link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
>> 2: plip0: <POINTOPOINT,NOARP> mtu 1500 qdisc noop qlen 10
>> link/ether fc:fc:fc:fc:fc:fc peer ff:ff:ff:ff:ff:ff
>> 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
>> link/ether 00:04:76:16:12:a5 brd ff:ff:ff:ff:ff:ff
>> 4: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>> link/ether 00:b0:d0:68:7f:2b brd ff:ff:ff:ff:ff:ff
>> inet 100.1.1.1/24 brd 100.1.1.255 scope global eth0
>> inet 100.1.1.2/24 brd 100.1.1.255 scope global secondary eth0:0
>
> Should be /32.
>
Corrected, well spotted :)
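
For readers wondering what the /32 correction might look like with the ip tool, here is a rough sketch. It assumes the address, device and label from the output above (100.1.1.2, eth0, eth0:0). The function name `vip_readd_cmds` is hypothetical, and it only prints the commands so they can be reviewed before anything is run as root:

```sh
# Print (do not run) the ip commands that would replace a VIP's netmask
# with /32. vip_readd_cmds is a hypothetical helper for illustration.
# $1 = VIP, $2 = current prefix length, $3 = device, $4 = alias label
vip_readd_cmds() {
    printf 'ip addr del %s/%s dev %s\n' "$1" "$2" "$3"
    printf 'ip addr add %s/32 dev %s label %s\n' "$1" "$3" "$4"
}

# Example: vip_readd_cmds 100.1.1.2 24 eth0 eth0:0
```

Printing rather than executing is a deliberate choice here: removing an address on a remote box is easy to get wrong, and the output can be pasted into a root shell once it looks right.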
>> /proc/sys/net/ipv4/conf/eth0/promote_secondaries:0
>> /proc/sys/net/ipv4/conf/eth0/force_igmp_version:0
>> /proc/sys/net/ipv4/conf/eth0/disable_policy:0
>> /proc/sys/net/ipv4/conf/eth0/disable_xfrm:0
>> /proc/sys/net/ipv4/conf/eth0/arp_accept:0
>> /proc/sys/net/ipv4/conf/eth0/arp_ignore:0
>> /proc/sys/net/ipv4/conf/eth0/arp_announce:0
>> /proc/sys/net/ipv4/conf/eth0/arp_filter:0
>> /proc/sys/net/ipv4/conf/eth0/tag:0
>> /proc/sys/net/ipv4/conf/eth0/log_martians:0
>> /proc/sys/net/ipv4/conf/eth0/bootp_relay:0
>> /proc/sys/net/ipv4/conf/eth0/medium_id:0
>> /proc/sys/net/ipv4/conf/eth0/proxy_arp:0
>> /proc/sys/net/ipv4/conf/eth0/accept_source_route:1
>> /proc/sys/net/ipv4/conf/eth0/send_redirects:1
>> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
>> /proc/sys/net/ipv4/conf/eth0/shared_media:1
>> /proc/sys/net/ipv4/conf/eth0/secure_redirects:1
>> /proc/sys/net/ipv4/conf/eth0/accept_redirects:1
>> /proc/sys/net/ipv4/conf/eth0/mc_forwarding:0
>> /proc/sys/net/ipv4/conf/eth0/forwarding:1
>
> This looks sane.
>
>> /proc/sys/net/ipv4/conf/lo/promote_secondaries:0
>> /proc/sys/net/ipv4/conf/lo/force_igmp_version:0
>> /proc/sys/net/ipv4/conf/lo/disable_policy:1
>> /proc/sys/net/ipv4/conf/lo/disable_xfrm:1
>> /proc/sys/net/ipv4/conf/lo/arp_accept:0
>> /proc/sys/net/ipv4/conf/lo/arp_ignore:0
>> /proc/sys/net/ipv4/conf/lo/arp_announce:0
>> /proc/sys/net/ipv4/conf/lo/arp_filter:0
>> /proc/sys/net/ipv4/conf/lo/tag:0
>> /proc/sys/net/ipv4/conf/lo/log_martians:0
>> /proc/sys/net/ipv4/conf/lo/bootp_relay:0
>> /proc/sys/net/ipv4/conf/lo/medium_id:0
>> /proc/sys/net/ipv4/conf/lo/proxy_arp:0
>> /proc/sys/net/ipv4/conf/lo/accept_source_route:1
>> /proc/sys/net/ipv4/conf/lo/send_redirects:1
>> /proc/sys/net/ipv4/conf/lo/rp_filter:0
>> /proc/sys/net/ipv4/conf/lo/shared_media:1
>> /proc/sys/net/ipv4/conf/lo/secure_redirects:1
>> /proc/sys/net/ipv4/conf/lo/accept_redirects:1
>> /proc/sys/net/ipv4/conf/lo/mc_forwarding:0
>> /proc/sys/net/ipv4/conf/lo/forwarding:1
>
> This as well.
>
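
The `grep -r` dump above is long, and only a handful of the keys tend to matter when debugging LVS. The sketch below reduces such a dump to a short per-interface summary. The choice of keys (forwarding, rp_filter, arp_ignore, arp_announce) is my own assumption about what is worth looking at first, and `conf_summary` is a made-up name:

```sh
# Read "path:value" lines as produced by
#   grep -r . /proc/sys/net/ipv4/conf/*
# on stdin and print "iface key=value" for a few selected keys.
# conf_summary is a hypothetical helper; the key list is an assumption.
conf_summary() {
    awk -F'[/:]' '
        NF >= 3 && ($(NF-1) == "forwarding" ||
                    $(NF-1) == "rp_filter"  ||
                    $(NF-1) ~ /^arp_(ignore|announce)$/) {
            # fields from the right: value, key, interface
            print $(NF-2), $(NF-1) "=" $NF
        }'
}

# Example: grep -r . /proc/sys/net/ipv4/conf/* | conf_summary
```

The key is compared exactly, so `mc_forwarding` is not mistaken for `forwarding`.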
>>>> This is odd, tunl0 does exist:
>>>>
>>>> # ifconfig tunl0
>>>> tunl0 Link encap:IPIP Tunnel HWaddr
>>>> NOARP MTU:1480 Metric:1
>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>> collisions:0 txqueuelen:0
>>>> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>>>
>>> Sure, but it's not activated. Could you by any chance run the
>>> following command on your box?
>>>
>>> ip link set dev tunl0 up
>>>
>> Mhmm this has been done, however I notice that on the working load
>> balancer, the tunl0 device is not visible in ifconfig output (i.e. is
>> not activated). Excuse me while I stay with my vintage ip-command
>> friends for a little while longer :)
>
> Your funeral :). Seriously though, this is puzzling. Unless I'm really
> badly mistaken, tunl0 should be activated in order to have traffic go
> through it, no? Unfortunately, I've not set up an LVS_TUN in 8 years :).
>
> Could you send the ip link show output from the working LB?
>
# ip link show
1: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:14:22:09:85:39 brd ff:ff:ff:ff:ff:ff
2: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:14:22:09:85:3a brd ff:ff:ff:ff:ff:ff
3: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tunl0: <NOARP> mtu 1480 qdisc noop
    link/ipip 0.0.0.0 brd 0.0.0.0
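
Whether a device is "activated" comes down to whether UP appears in the flag list between the angle brackets. In the output above tunl0 shows only NOARP, so it is down on the working load balancer as well. A small sketch that checks this from `ip link show` output on stdin (`link_is_up` is a hypothetical helper name; newer iproute2 versions may print names such as `tunl0@NONE`, which this simple match does not handle):

```sh
# Exit 0 if the named interface has the UP flag in `ip link show` output
# read from stdin, 1 if it is listed without UP, 2 if it is not listed.
# link_is_up is a hypothetical helper for illustration.
link_is_up() {
    awk -v dev="$1" '
        $2 == (dev ":") {
            seen = 1
            flags = $3
            gsub(/[<>]/, "", flags)          # strip the angle brackets
            n = split(flags, f, ",")
            for (i = 1; i <= n; i++)
                if (f[i] == "UP") up = 1
        }
        END {
            if (!seen) exit 2
            exit (up ? 0 : 1)
        }'
}

# Example: ip link show | link_is_up tunl0 && echo "tunl0 is up"
```

Running the same check on both load balancers gives a like-for-like comparison without relying on what ifconfig chooses to display.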

Oh, one thing I forgot to mention: the working load balancer is doing
some funky stuff with keepalived, which I believe is injecting IPs into
the interfaces. The config for that is:

global_defs {
    #notification_email {
    #    someone [at] somewhere
    #}
    #notification_email_from devnull [at] areti
    #smtp_server 127.0.0.1
    #smtp_connect_timeout 30
    lvs_id LOAD1
}

vrrp_sync_group VG1 {
    group {
        VI_1
    }
    #smtp_alert
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    lvs_sync_daemon_interface eth1
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass blah
    }
    virtual_ipaddress {
        130.1.1.2
    }
    preempt_delay 300
}

virtual_server_group VSG_1 {
    130.1.1.2 25
}

virtual_server group VSG_1 {
    delay_loop 6
    lb_algo wlc
    lb_kind TUN
    # persistence_timeout 600
    # persistence_granularity 255.255.255.0
    protocol TCP

    real_server 120.1.1.1 25 {
        weight 100
        SMTP_CHECK {
            connect_timeout 6
            retry 3
            delay_before_retry 1
            helo_name load1.areti.net
        }
    }
    real_server 120.1.1.2 25 {
        weight 100
        SMTP_CHECK {
            connect_timeout 6
            retry 3
            delay_before_retry 1
            helo_name load1.areti.net
        }
    }
}
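
For comparison with a hand-configured director, the virtual_server part of this config corresponds roughly to the ipvsadm commands generated by the sketch below. This is only an approximation: keepalived also adds and removes real servers based on the SMTP_CHECK results, which plain ipvsadm does not do. The function name `ipvs_tun_cmds` is hypothetical, and it prints the commands instead of running them:

```sh
# Print the ipvsadm commands for one TCP virtual service whose real
# servers are reached by IPIP tunnelling (-i).
# ipvs_tun_cmds is a hypothetical helper for illustration.
# $1 = VIP:port, $2 = scheduler, $3 = weight, remaining args = real servers
ipvs_tun_cmds() {
    vip_port=$1
    sched=$2
    weight=$3
    shift 3
    printf 'ipvsadm -A -t %s -s %s\n' "$vip_port" "$sched"
    for rs in "$@"; do
        printf 'ipvsadm -a -t %s -r %s -i -w %s\n' "$vip_port" "$rs" "$weight"
    done
}

# Example: ipvs_tun_cmds 130.1.1.2:25 wlc 100 120.1.1.1:25 120.1.1.2:25
```

The resulting table should look like the `ipvsadm -L -n` output from the new load balancer, apart from the VIP and the weights.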


Thanks,

--
Mark Wadham
e: mark.wadham [at] areti t: +44 (0)20 8315 5800 f: +44 (0)20 8315 5801
Areti Internet Ltd., http://www.areti.net/

===================================================================
Areti Internet Ltd: BS EN ISO 9001:2000
Providing corporate Internet solutions for more than 10 years.
===================================================================

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://www.in-addr.de/mailman/listinfo/lvs-users


mark.wadham at areti

Mar 30, 2007, 4:11 AM

Post #10 of 10 (737 views)
Re: Problems getting LVS to work [In reply to]

Roberto,

Thanks for your help, but I believe we have located the source of the
problem. Our load balancer is located in Manchester and the mail
servers are located in London, and it appears that our upstream
providers filter our traffic to prevent IP spoofing.

Thanks again for your help, and the honorary knighthood ;)

Mark
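
Some background on why such filtering would break this setup: with tunnelling, the real servers answer clients directly, using the VIP as the source address. If the provider in front of the mail servers only lets out packets whose source lies in its own prefixes, replies sourced from 100.1.1.2 would never leave London. The sketch below models that check. The /24 for the mail servers' network is an assumption based on the obscured addresses in this thread, and `in_cidr` is a made-up helper name:

```sh
# Exit 0 if IPv4 address $1 lies inside prefix $2 (a.b.c.d/len), else 1.
# in_cidr is a hypothetical helper modelling a provider's source filter.
in_cidr() {
    awk -v ip="$1" -v cidr="$2" '
        function to_int(s,   o) {
            split(s, o, ".")
            return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
        }
        BEGIN {
            split(cidr, p, "/")
            size = 2 ^ (32 - p[2])      # number of addresses in the prefix
            # same prefix <=> same block index when divided by block size
            exit ((int(to_int(ip) / size) == int(to_int(p[1]) / size)) ? 0 : 1)
        }'
}

# Example, with 120.1.1.0/24 as the assumed prefix of the London network:
# in_cidr 120.1.1.1 120.1.1.0/24   -> inside, a reply from the RIP passes
# in_cidr 100.1.1.2 120.1.1.0/24   -> outside, a reply from the VIP is dropped
```

A packet capture on a real server and on an outside client would be the way to confirm that replies leave the box but never arrive.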
