
sr at swisscenter
Apr 30, 2012, 2:04 AM
Post #8 of 10
Re: XCP 1.5 BETA BUG: Slow performance with 10Gbit network cards
Yeah, I had tried that, but it didn't make any difference at all, and on the same hardware with a different version of Xen it works quite well.

It looks like a software problem, not a hardware one :/

Sébastien

On 30.04.2012 11:02, Uli Stärk wrote:
> Did you try to disable all power saving functions in bios? Power saving has a big impact on interrupt handling :(
>
> ________________________________________
> From: Sébastien Riccio [sr [at] swisscenter]
> Sent: Monday, 30 April 2012 10:38
> To: Uli Stärk; xen-api [at] lists
> Subject: Re: AW: [Xen-API] XCP 1.5 BETA BUG: Slow performance with 10Gbit network cards
>
> Hi,
>
> Yes, I've tried almost everything that is in the network tweaking guide:
>
> http://wiki.xen.org/wiki/Network_Throughput_and_Performance_Guide
>
> But the bottleneck seems to be the xen hypervisor used in xcp and xenserver.
>
> On the same machine but with xcp-xapi (over debian) I get almost normal
> 10gbit/s speeds from dom0.
>
> What is kinda relevant is that even doing an iperf "localhost to
> localhost" it's stuck at that 3gbit in a xcp or xenserver dom0...
>
> XCP:
>
> [root [at] xen-blade1 ~]# iperf -c localhost
> ------------------------------------------------------------
> Client connecting to localhost, TCP port 5001
> ------------------------------------------------------------
> [ 3] local 127.0.0.1 port 43138 connected with 127.0.0.1 port 5001
> [ 3] 0.0-10.0 sec 3.43 GBytes 2.94 Gbits/sec
>
> Kronos (xapi on debian)
> root [at] xen-blade1:~# iperf -c localhost
> ------------------------------------------------------------
> Client connecting to localhost, TCP port 5001
> ------------------------------------------------------------
> [ 3] local 127.0.0.1 port 34438 connected with 127.0.0.1 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-10.0 sec 18.9 GBytes 16.3 Gbits/sec
>
> So I'll never get more than 3gbit/s in one thread on xenserver or xcp...
> and that is kinda ... bad :/
>
> Not sure yet if it's a limitation of the dom0 kernel or the hypervisor
> itself... I tend to think it's the hypervisor.
>
> xcp 1.5
>
> host : xen-blade13
> release : 2.6.32.12-0.7.1.xs1.4.90.530.170661xen
> version : #1 SMP Sat Apr 28 21:26:23 CEST 2012
> machine : i686
> nr_cpus : 16
> nr_nodes : 2
> cores_per_socket : 4
> threads_per_core : 2
> cpu_mhz : 2394
> hw_caps : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
> virt_caps : hvm hvm_directio
> total_memory : 32758
> free_memory : 31516
> free_cpus : 0
> xen_major : 4
> xen_minor : 1
> xen_extra : .1
> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler : credit
> xen_pagesize : 4096
> platform_params : virt_start=0xfdc00000
> xen_changeset : unavailable
> xen_commandline : dom0_mem=752M lowmem_emergency_pool=1M crashkernel=64M [at] 32 console= vga=mode-0x0311 dom0_max_vcpus=8
> cc_compiler : gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
> cc_compile_by : root
> cc_compile_domain : uk.xensource.com
> cc_compile_date : Mon Feb 6 19:01:42 EST 2012
> xend_config_format : 4
>
> xcp-xapi (debian)
> host : xen-blade11
> release : 3.2.0-2-686-pae
> version : #1 SMP Sun Apr 15 17:56:31 UTC 2012
> machine : i686
> nr_cpus : 8
> nr_nodes : 2
> cores_per_socket : 4
> threads_per_core : 1
> cpu_mhz : 2394
> hw_caps : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
> virt_caps : hvm hvm_directio
> total_memory : 32758
> free_memory : 30560
> free_cpus : 0
> xen_major : 4
> xen_minor : 1
> xen_extra : .2
> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler : credit
> xen_pagesize : 4096
> platform_params : virt_start=0xfdc00000
> xen_changeset : unavailable
> xen_commandline : placeholder dom0_mem=1024M
> cc_compiler : gcc version 4.6.2 (Debian 4.6.2-6)
> cc_compile_by : waldi
> cc_compile_domain : debian.org
> cc_compile_date : Sun Dec 11 13:32:25 UTC 2011
> xend_config_format : 4
>
> I don't see much difference, except Xen 4.1.1 for xcp vs. 4.1.2 for
> xcp-xapi, with the same scheduler (credit).
>
> Also I found an old post (2008) from someone who had the same symptoms
> back then:
>
> http://old-list-archives.xen.org/archives/html/xen-users/2008-12/msg00508.html
>
> but didn't find any follow-ups ...
>
> Cheers,
> Sébastien
>
>
> On 30.04.2012 10:13, Uli Stärk wrote:
>> did you enable jumbo frames?
>>
>> ip link set eth2 mtu 9000
>>
>>
>> ________________________________________
>> From: xen-api-bounces [at] lists [xen-api-bounces [at] lists] on behalf of Sébastien Riccio [sr [at] swisscenter]
>> Sent: Sunday, 29 April 2012 19:23
>> To: xen-api [at] lists
>> Subject: Re: [Xen-API] XCP 1.5 BETA BUG: Slow performance with 10Gbit network cards
>>
>> Another discovery:
>>
>> On XCP 1.5
>>
>> iperf on dom0 to localhost (loopback)
>>
>> Client connecting to localhost, TCP port 5001
>> TCP window size: 64.0 KByte (default)
>> ------------------------------------------------------------
>> [ 3] local 127.0.0.1 port 38278 connected with 127.0.0.1 port 5001
>> ^C[ 3] 0.0- 5.2 sec 1.79 GBytes 2.97 Gbits/sec
>>
>> Limited to 3gbit
>>
>> on xapi/debian
>>
>> root [at] krono:~# iperf -c 10.50.50.111
>> ------------------------------------------------------------
>> Client connecting to 10.50.50.111, TCP port 5001
>> TCP window size: 167 KByte (default)
>> ------------------------------------------------------------
>> [ 3] local 127.0.0.1 port 41137 connected with 127.0.0.1 port 5001
>> [ ID] Interval Transfer Bandwidth
>> [ 3] 0.0- 5.6 sec 13.1 GBytes 20.0 Gbits/sec
>>
>> clearly not a nic issue....
>>
>> Cheers,
>> Sébastien
>>
>>
>> On 29.04.2012 17:28, Sébastien Riccio wrote:
>>> Me again (sorry),
>>>
>>> Still trying to sort things out to understand why it's slow under xcp,
>>> I've installed xcp on debian (project kronos), and did an iperf from dom0
>>>
>>> [ 3] 0.0- 1.0 sec 623 MBytes 5.23 Gbits/sec
>>> [ 3] 1.0- 2.0 sec 817 MBytes 6.85 Gbits/sec
>>> [ 3] 2.0- 3.0 sec 818 MBytes 6.86 Gbits/sec
>>> [ 3] 3.0- 4.0 sec 816 MBytes 6.84 Gbits/sec
>>> [ 3] 4.0- 5.0 sec 818 MBytes 6.86 Gbits/sec
>>> [ 3] 5.0- 6.0 sec 818 MBytes 6.86 Gbits/sec
>>> [ 3] 6.0- 7.0 sec 816 MBytes 6.85 Gbits/sec
>>>
>>> It's not getting to 10gbit/s, so it looks like it's related to the
>>> xen hypervisor.
>>> But it's still 2x better than with xcp/xenserver.
>>>
>>> Any ideas how to improve this?
>>>
>>> Cheers,
>>> Sébastien
>>>
>>> On 29.04.2012 14:44, Sébastien Riccio wrote:
>>>> Hi again, again :)
>>>>
>>>> Something strange:
>>>>
>>>> If I run parallel threads with iperf (8 in this example), the
>>>> 10Gbit/s speed is reached:
>>>>
>>>> [ 5] 0.0- 1.0 sec 151 MBytes 1.27 Gbits/sec
>>>> [ 6] 0.0- 1.0 sec 178 MBytes 1.49 Gbits/sec
>>>> [ 7] 0.0- 1.0 sec 136 MBytes 1.14 Gbits/sec
>>>> [ 8] 0.0- 1.0 sec 189 MBytes 1.59 Gbits/sec
>>>> [ 10] 0.0- 1.0 sec 161 MBytes 1.35 Gbits/sec
>>>> [ 4] 0.0- 1.0 sec 124 MBytes 1.04 Gbits/sec
>>>> [ 3] 0.0- 1.0 sec 178 MBytes 1.49 Gbits/sec
>>>> [ 9] 0.0- 1.0 sec 74.8 MBytes 627 Mbits/sec
>>>> [SUM] 0.0- 1.0 sec 1.16 GBytes 10.0 Gbits/sec
>>>>
>>>> So it's surely not a NIC/driver issue but maybe a limitation in dom0?
>>>> It's really not efficient, as we connect the iscsi storage to the
>>>> filer from dom0, so it will always be a single thread, so no
>>>> possibility to get the full 10Gbit bw for our storage.
>>>>
>>>> Is there a way to get rid of that (bug?) limitation?
>>>>
>>>> Cheers,
>>>> Sébastien
>>>>
>>>>
>>>> On 29.04.2012 14:23, Sébastien Riccio wrote:
>>>>> Hi again,
>>>>>
>>>>> So I did my test with a debian 6 on the same hardware that I'm using
>>>>> to try the xcp 1.5.
>>>>>
>>>>> Speed to the filer via the 10gig nic:
>>>>>
>>>>> [ 3] 0.0- 8.0 sec 8.71 GBytes 9.41 Gbits/sec
>>>>>
>>>>> I would tend to conclude that there is something wrong with the
>>>>> XCP/XenServer kernel and the 10gig interfaces (at least broadcom).
>>>>>
>>>>> (Note that our Dell hardware is certified as supported by XenServer,
>>>>> so that seems not so true :/)
>>>>>
>>>>> modinfo for bnx2x on debian 6
>>>>>
>>>>> filename: /lib/modules/2.6.32-5-amd64/kernel/drivers/net/bnx2x.ko
>>>>> firmware: bnx2x-e1h-5.0.21.0.fw
>>>>> firmware: bnx2x-e1-5.0.21.0.fw
>>>>> version: 1.52.1
>>>>> license: GPL
>>>>> description: Broadcom NetXtreme II BCM57710/57711/57711E Driver
>>>>> author: Eliezer Tamir
>>>>> srcversion: 050FF749F80E43CB5873BC9
>>>>> alias: pci:v000014E4d00001650sv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d0000164Fsv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d0000164Esv*sd*bc*sc*i*
>>>>> depends: mdio,libcrc32c
>>>>> vermagic: 2.6.32-5-amd64 SMP mod_unload modversions
>>>>> parm: multi_mode: Multi queue mode (0 Disable; 1 Enable (default)) (int)
>>>>> parm: num_rx_queues: Number of Rx queues for multi_mode=1 (default is half number of CPUs) (int)
>>>>> parm: num_tx_queues: Number of Tx queues for multi_mode=1 (default is half number of CPUs) (int)
>>>>> parm: disable_tpa: Disable the TPA (LRO) feature (int)
>>>>> parm: int_mode: Force interrupt mode (1 INT#x; 2 MSI) (int)
>>>>> parm: dropless_fc: Pause on exhausted host ring (int)
>>>>> parm: poll: Use polling (for debug) (int)
>>>>> parm: mrrs: Force Max Read Req Size (0..3) (for debug) (int)
>>>>> parm: debug: Default debug msglevel (int)
>>>>>
>>>>> modinfo for bnx2x on xcp 1.5
>>>>>
>>>>> [root [at] xen-blade1 ~]# modinfo bnx2x
>>>>> filename: /lib/modules/2.6.32.12-0.7.1.xs1.4.90.530.170661xen/kernel/drivers/net/bnx2x.ko
>>>>> version: 1.62.17
>>>>> license: GPL
>>>>> description: Broadcom NetXtreme II BCM57710/57711/57711E/57712/57712E Driver
>>>>> author: Eliezer Tamir
>>>>> srcversion: 13EA885EBE061121BAEC28D
>>>>> alias: pci:v000014E4d00001663sv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d00001662sv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d00001650sv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d0000164Fsv*sd*bc*sc*i*
>>>>> alias: pci:v000014E4d0000164Esv*sd*bc*sc*i*
>>>>> depends: mdio
>>>>> vermagic: 2.6.32.12-0.7.1.xs1.4.90.530.170661xen SMP mod_unload modversions Xen 686
>>>>> parm: multi_mode: Multi queue mode (0 Disable; 1 Enable (default); 2 VLAN PRI; 3 E1HOV PRI; 4 IP DSCP) (int)
>>>>> parm: pri_map: Priority to HW queue mapping (int)
>>>>> parm: qs_per_cos: Number of queues per HW queue (int)
>>>>> parm: cos_min_rate: Weight for RR between HW queues (int)
>>>>> parm: num_queues: Number of queues for multi_mode=1 (default is as a number of CPUs) (int)
>>>>> parm: disable_iscsi_ooo: Disable iSCSI OOO support (int)
>>>>> parm: disable_tpa: Disable the TPA (LRO) feature (int)
>>>>> parm: int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (int)
>>>>> parm: dropless_fc: Pause on exhausted host ring (int)
>>>>> parm: poll: Use polling (for debug) (int)
>>>>> parm: mrrs: Force Max Read Req Size (0..3) (for debug) (int)
>>>>> parm: debug: Default debug msglevel (int)
>>>>>
>>>>> It seems that xcp 1.5 has a more recent version of the module, so
>>>>> that might not be the problem.
>>>>> Maybe another problem in the kernel?
>>>>>
>>>>> Any ideas how I could debug this?
>>>>>
>>>>> Cheers,
>>>>> Sébastien
>>>>>
>>>>>
>>>>> On 29.04.2012 13:23, Sébastien Riccio wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm currently playing with XCP 1.5 beta on a Dell Blade
>>>>>> infrastructure that is linked to redundant filers with 10gbit NICs.
>>>>>>
>>>>>> The redundant 10gig interfaces are not managed by xapi (I've told
>>>>>> xapi to forget them) and are configured like on any normal linux box.
>>>>>>
>>>>>> They are connected to the same 10gig switch that the filers are
>>>>>> connected to.
>>>>>>
>>>>>> Interfaces on the filers:
>>>>>> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>> 03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>> 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>> 04:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>>
>>>>>> Interfaces on the blades:
>>>>>> 05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>> 05:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57711 10-Gigabit PCIe
>>>>>>
>>>>>> The problem is that after some testing with iperf I came to this
>>>>>> conclusion:
>>>>>>
>>>>>> 1) Between the filers the speed is 9.9Gbit/s.
>>>>>> 2) Between xcp 1.5 and both filers I get around 3Gbit/s, which is
>>>>>> quite poor.
>>>>>> 3) Between an xcp 1.1 and both filers I get around 3Gbit/s, which is
>>>>>> quite poor (so that's not a new 1.5 bug).
>>>>>> 4) Between an xcp 1.5 and another xcp (1.1) I also get around 3Gbit/s.
>>>>>>
>>>>>> Do you have any ideas what could be the problem, knowing that
>>>>>> between the filers the speed is good and it's passing through the
>>>>>> same switch?
>>>>>>
>>>>>> Is there a DDK VM available for XCP 1.5 beta? I would like to try
>>>>>> to build the latest Broadcom drivers for these nics (bnx2x) for the
>>>>>> kernel used on xcp 1.5...
>>>>>>
>>>>>> In the meantime I will try to install a debian on another blade to
>>>>>> check if the speeds are correct with the filers.
>>>>>>
>>>>>> Thanks,
>>>>>> Sébastien
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Xen-api mailing list
>>>>>> Xen-api [at] lists
>>>>>> http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api
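
For anyone following the numbers: the 8-stream test quoted in this thread reaches line rate in aggregate, while no single stream gets past about 1.6 Gbit/s. Here is a minimal Python sketch, assuming iperf 2 style report lines as pasted in this thread, that sums the per-stream figures and compares them with the [SUM] line. The names `parse_iperf`, `summarize` and `SAMPLE` are made up for illustration; the sample data is copied from the 29.04.2012 14:44 message.

```python
import re

# Matches iperf 2 report lines such as
# "[ 5] 0.0- 1.0 sec 151 MBytes 1.27 Gbits/sec" (assumed format).
LINE_RE = re.compile(
    r"\[\s*(?P<id>\d+|SUM)\]\s+"
    r"(?P<start>[\d.]+)-\s*(?P<end>[\d.]+)\s+sec\s+"
    r"(?P<xfer>[\d.]+)\s+[KMG]Bytes\s+"
    r"(?P<bw>[\d.]+)\s+(?P<unit>[KMG])bits/sec"
)

# Factors to convert the reported bandwidth to Gbit/s.
TO_GBIT = {"K": 1e-6, "M": 1e-3, "G": 1.0}


def parse_iperf(text):
    """Return {stream id: bandwidth in Gbit/s}.

    If a stream id appears on several lines (interval reports),
    only the last line for that id is kept.
    """
    streams = {}
    for m in LINE_RE.finditer(text):
        streams[m.group("id")] = float(m.group("bw")) * TO_GBIT[m.group("unit")]
    return streams


def summarize(text):
    """Sum the per-stream bandwidths and pull out iperf's own [SUM] line."""
    streams = parse_iperf(text)
    reported_sum = streams.pop("SUM", None)
    rates = list(streams.values())
    return {
        "streams": len(rates),
        "total": sum(rates),
        "fastest": max(rates, default=0.0),
        "reported_sum": reported_sum,
    }


# Sample copied from the 8-thread run quoted in this thread.
SAMPLE = """
[ 5] 0.0- 1.0 sec 151 MBytes 1.27 Gbits/sec
[ 6] 0.0- 1.0 sec 178 MBytes 1.49 Gbits/sec
[ 7] 0.0- 1.0 sec 136 MBytes 1.14 Gbits/sec
[ 8] 0.0- 1.0 sec 189 MBytes 1.59 Gbits/sec
[ 10] 0.0- 1.0 sec 161 MBytes 1.35 Gbits/sec
[ 4] 0.0- 1.0 sec 124 MBytes 1.04 Gbits/sec
[ 3] 0.0- 1.0 sec 178 MBytes 1.49 Gbits/sec
[ 9] 0.0- 1.0 sec 74.8 MBytes 627 Mbits/sec
[SUM] 0.0- 1.0 sec 1.16 GBytes 10.0 Gbits/sec
"""

result = summarize(SAMPLE)
print(
    "%d streams, %.2f Gbit/s total, fastest single stream %.2f Gbit/s"
    % (result["streams"], result["total"], result["fastest"])
)
```

The same helper can be pointed at a single-stream run to compare the one-thread figure with the per-stream figures of a parallel run; it only reads the numbers iperf already printed and says nothing about where the per-thread limit comes from.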