
csmith at nighthawkrad
Sep 15, 2009, 3:19 PM
Post #1 of 10
[lvs-users] Apparent MTU problem using LVS-DR and Windows 2003 RealServers
I have a somewhat weird (at least to me) problem with an LVS-DR setup that has some Windows 2003 RealServers.

Firstly, this whole setup is VMs running inside an ESXi 4.0 host, in case that sets off anyone's alarm bells up front. The Load Balancers are CentOS 5 x86-64, using heartbeat (for failover) and ldirectord (to configure IPVS). The config file for the VIP on the LBs is:

[PHX (UTC-0600) root@lb-test0 ~]# cat /etc/ha.d/conf/10.183.3.112
autoreload = yes
checkinterval = 30
checktimeout = 3
callback = "/etc/ha.d/resource.d/sync_config.sh"

# HTTP. Mainly used for testing
virtual = 10.183.3.112:80
    ## IMPORTANT. The following directives for the
    ## above virtual/service IP definition ***MUST*** be
    ## indented by _at least_ four (4) spaces *OR* a single tab.
    protocol = tcp
    scheduler = rr
    #persistent=600
    real = 10.183.3.113:80 gate
    #real = 10.183.3.114:80 gate
    #real = 10.183.3.115:80 gate
    checktype = connect
    quiescent = no

# 104. Standard DICOM port
virtual = 10.183.3.112:104
    ## IMPORTANT. The following directives for the
    ## above virtual/service IP definition ***MUST*** be
    ## indented by _at least_ four (4) spaces *OR* a single tab.
    protocol = tcp
    scheduler = rr
    #persistent=600
    real = 10.183.3.113:104 gate
    #real = 10.183.3.114:104 gate
    #real = 10.183.3.115:104 gate
    checktype = ping
    quiescent = no

(I have temporarily disabled two of the RealServers until I get it working with just one.)

I have configured the MS loopback adapter on the Windows RealServers with the VIP (10.183.3.112) and a netmask of 255.255.255.255.

Port 80 balances fine, at least as far as I've tested by refreshing a page in links a few dozen times and watching it round-robin between the different servers, so I'm pretty sure the basic config is fine. However, balancing of DICOM associations on port 104 does not work. As far as I know, these are just simple TCP connections, so I'm not sure why it isn't working. LBs, RealServers and Clients are all on the same subnet.
I have confirmed that sending directly to the RealServer works. Through the VIP, though, the data transmission hangs and I see this from tcpdump on the LB:

15:14:26.956058 arp who-has 10.183.3.112 tell 10.183.3.241
15:14:26.956467 arp reply 10.183.3.112 is-at 00:0c:29:fb:40:f1
15:14:26.956483 IP 10.183.3.241.1122 > 10.183.3.112.104: S 4050535759:4050535759(0) win 65535 <mss 1460,nop,nop,sackOK>
15:14:26.956507 IP 10.183.3.241.1122 > 10.183.3.112.104: S 4050535759:4050535759(0) win 65535 <mss 1460,nop,nop,sackOK>
15:14:26.956115 arp who-has 10.183.3.241 tell 10.183.3.113
15:14:26.956122 arp reply 10.183.3.241 is-at 00:50:56:99:6c:90
15:14:26.956171 IP 10.183.3.112.104 > 10.183.3.241.1122: S 1649263224:1649263224(0) ack 4050535760 win 40000 <mss 1460,nop,nop,sackOK>
15:14:26.956173 IP 10.183.3.241.1122 > 10.183.3.112.104: . ack 1 win 65535
15:14:26.956177 IP 10.183.3.241.1122 > 10.183.3.112.104: . ack 1 win 65535
15:14:26.956336 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 1 win 40000
15:14:26.969979 IP 10.183.3.241.1122 > 10.183.3.112.104: P 1:217(216) ack 1 win 65535
15:14:26.969985 IP 10.183.3.241.1122 > 10.183.3.112.104: P 1:217(216) ack 1 win 65535
15:14:26.970896 arp who-has 10.183.3.121 tell 10.183.3.113
15:14:26.970992 arp reply 10.183.3.121 is-at 00:50:56:99:78:51
15:14:26.970996 IP 10.183.3.113.1148 > 10.183.3.121.104: S 4209027996:4209027996(0) win 65535 <mss 1460,nop,nop,sackOK>
15:14:26.970998 IP 10.183.3.121.104 > 10.183.3.113.1148: S 1556555174:1556555174(0) ack 4209027997 win 64240 <mss 1460,nop,nop,sackOK>
15:14:26.971040 IP 10.183.3.113.1148 > 10.183.3.121.104: . ack 1 win 65535
15:14:26.971313 IP 10.183.3.113.1148 > 10.183.3.121.104: P 1:214(213) ack 1 win 65535
15:14:27.008491 IP 10.183.3.121.104 > 10.183.3.113.1148: P 1:7(6) ack 214 win 64027
15:14:27.128736 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 217 win 39784
15:14:27.136524 IP 10.183.3.113.1148 > 10.183.3.121.104: . ack 7 win 65529
15:14:27.136580 IP 10.183.3.121.104 > 10.183.3.113.1148: P 7:188(181) ack 214 win 64027
15:14:27.136710 IP 10.183.3.112.104 > 10.183.3.241.1122: P 1:185(184) ack 217 win 39784
15:14:27.146207 IP 10.183.3.241.1122 > 10.183.3.112.104: P 217:365(148) ack 185 win 65351
15:14:27.146221 IP 10.183.3.241.1122 > 10.183.3.112.104: P 217:365(148) ack 185 win 65351
15:14:27.146337 IP 10.183.3.113.1148 > 10.183.3.121.104: P 214:362(148) ack 188 win 65348
15:14:27.150053 IP 10.183.3.241.1122 > 10.183.3.112.104: . 365:3285(2920) ack 185 win 65351
15:14:27.150073 IP 10.183.3.112 > 10.183.3.241: ICMP 10.183.3.112 unreachable - need to frag (mtu 1500), length 556
15:14:27.329941 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 365 win 39636
15:14:27.329948 IP 10.183.3.241.1122 > 10.183.3.112.104: . 3285:6205(2920) ack 185 win 65351
15:14:27.329965 IP 10.183.3.112 > 10.183.3.241: ICMP 10.183.3.112 unreachable - need to frag (mtu 1500), length 556
15:14:27.344068 IP 10.183.3.121.104 > 10.183.3.113.1148: . ack 362 win 63879
15:14:29.510399 IP 10.183.3.241.1122 > 10.183.3.112.104: . 365:1825(1460) ack 185 win 65351
15:14:29.510421 IP 10.183.3.241.1122 > 10.183.3.112.104: . 365:1825(1460) ack 185 win 65351
15:14:29.643488 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 1825 win 40000
15:14:29.643493 IP 10.183.3.241.1122 > 10.183.3.112.104: . 1825:4745(2920) ack 185 win 65351
15:14:29.643507 IP 10.183.3.112 > 10.183.3.241: ICMP 10.183.3.112 unreachable - need to frag (mtu 1500), length 556
15:14:32.154209 arp who-has 10.183.3.241 tell 10.183.3.8
15:14:32.154336 arp reply 10.183.3.241 is-at 00:50:56:99:6c:90
15:14:32.884189 IP 10.183.3.8.36599 > 10.183.3.113.80: S 497646369:497646369(0) win 5840 <mss 1460,sackOK,timestamp 413727 0,nop,wscale 7>
15:14:32.884272 IP 10.183.3.113.80 > 10.183.3.8.36599: S 2028145611:2028145611(0) ack 497646370 win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK>
15:14:32.884325 IP 10.183.3.8.36599 > 10.183.3.113.80: . ack 1 win 46 <nop,nop,timestamp 413727 0>
15:14:32.884402 IP 10.183.3.8.36599 > 10.183.3.113.80: F 1:1(0) ack 1 win 46 <nop,nop,timestamp 413727 0>
15:14:32.884457 IP 10.183.3.113.80 > 10.183.3.8.36599: . ack 2 win 65535 <nop,nop,timestamp 14463 413727>
15:14:32.884588 IP 10.183.3.113.80 > 10.183.3.8.36599: F 1:1(0) ack 2 win 65535 <nop,nop,timestamp 14463 413727>
15:14:32.884596 IP 10.183.3.8.36599 > 10.183.3.113.80: . ack 2 win 46 <nop,nop,timestamp 413727 14463>
15:14:32.884943 IP 10.183.3.8 > 10.183.3.113: ICMP echo request, id 4333, seq 1, length 72
15:14:32.885001 IP 10.183.3.113 > 10.183.3.8: ICMP echo reply, id 4333, seq 1, length 72
15:14:33.994888 IP 10.183.3.241.1122 > 10.183.3.112.104: . 1825:3285(1460) ack 185 win 65351
15:14:33.994901 IP 10.183.3.241.1122 > 10.183.3.112.104: . 1825:3285(1460) ack 185 win 65351
15:14:34.169938 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 3285 win 40000
15:14:34.169944 IP 10.183.3.241.1122 > 10.183.3.112.104: . 3285:6205(2920) ack 185 win 65351
15:14:34.169959 IP 10.183.3.112 > 10.183.3.241: ICMP 10.183.3.112 unreachable - need to frag (mtu 1500), length 556
15:14:40.924150 IP 10.183.3.113.1143 > 10.181.3.12.80: R 3625640816:3625640816(0) ack 1539269318 win 0
15:14:42.335482 IP 10.183.3.113.137 > 10.183.3.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
15:14:42.335746 arp who-has 10.183.3.113 tell 10.183.3.10
15:14:42.335754 arp who-has 10.183.3.113 tell 10.183.3.10
15:14:42.335757 arp who-has 10.183.3.113 tell 10.183.3.10
15:14:42.335759 arp who-has 10.183.3.113 tell 10.183.3.10
15:14:42.335761 arp reply 10.183.3.113 is-at 00:50:56:99:66:4a
15:14:42.335911 IP 10.183.3.10.137 > 10.183.3.113.137: NBT UDP PACKET(137): QUERY; POSITIVE; RESPONSE; UNICAST
15:14:42.336130 IP 10.183.3.10.137 > 10.183.3.113.137: NBT UDP PACKET(137): QUERY; POSITIVE; RESPONSE; UNICAST
15:14:42.963860 IP 10.183.3.241.1122 > 10.183.3.112.104: . 3285:4745(1460) ack 185 win 65351
15:14:42.963874 IP 10.183.3.241.1122 > 10.183.3.112.104: . 3285:4745(1460) ack 185 win 65351
15:14:43.122349 IP 10.183.3.112.104 > 10.183.3.241.1122: . ack 4745 win 40000
15:14:43.122361 IP 10.183.3.241.1122 > 10.183.3.112.104: . 4745:7665(2920) ack 185 win 65351
15:14:43.122373 IP 10.183.3.112 > 10.183.3.241: ICMP 10.183.3.112 unreachable - need to frag (mtu 1500), length 556

Then it just continues with the 'need to frag' messages indefinitely.

I had a bit of a look around on Google and the list archives, but all the postings I could find referred to LVS-TUN, not LVS-DR. Has anyone seen this problem before?

I'm assuming it has something to do with the larger data transfers of the DICOM association needing packets to be fragmented, whereas the smaller HTTP requests do not. But surely that shouldn't be a problem with all hosts on the same VLAN?

Cheers,
Chris

--
Christopher Smith
UNIX Team Leader
Nighthawk Radiology Services
Limmatquai 4, 6th Floor
8001 Zurich, Switzerland
http://www.nighthawkrad.net
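One way to check the fragmentation theory against the capture is to compare each TCP segment's payload size with what fits in a single 1500-byte frame. The following is only a minimal sketch, not part of the original setup: it assumes the classic tcpdump text format shown in the capture, and the function name `oversized_segments` and the 40-byte IPv4 + TCP header overhead (no options) are illustrative assumptions.

```python
import re

# Matches TCP data/SYN/FIN lines in classic tcpdump text output, e.g.
# "15:14:27.150053 IP 10.183.3.241.1122 > 10.183.3.112.104: . 365:3285(2920) ack 185 win 65351"
# ARP, ICMP and pure-ACK lines carry no "first:last(len)" field and are skipped.
SEGMENT_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"(?P<flags>\S+) \d+:\d+\((?P<length>\d+)\)"
)


def oversized_segments(lines, mtu=1500, header_bytes=40):
    """Return (time, src, dst, payload_len) for every segment whose payload
    would not fit in one frame of the given MTU.

    header_bytes=40 assumes a 20-byte IPv4 header plus a 20-byte TCP header
    with no options; this is an assumption, not something read from the trace.
    """
    max_payload = mtu - header_bytes
    hits = []
    for line in lines:
        match = SEGMENT_RE.match(line.strip())
        if match is None:
            continue
        length = int(match.group("length"))
        if length > max_payload:
            hits.append(
                (match.group("time"), match.group("src"), match.group("dst"), length)
            )
    return hits


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python oversized.py < capture.txt
    for time, src, dst, length in oversized_segments(sys.stdin):
        print(f"{time} {src} > {dst}: {length} bytes of payload")
```

Run over the capture in this post, a filter like this would flag only the 2920-byte segments from 10.183.3.241, each of which is immediately followed by a 'need to frag' ICMP message, while the 1460-byte retransmissions are acknowledged normally.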