Mailing List Archive: Netapp: toasters

Poor NFS 10GbE performance on NetApp 6080s

 

 



dburklan at NMDP

May 19, 2012, 10:48 AM

Post #1 of 23 (6428 views)
Poor NFS 10GbE performance on NetApp 6080s

Hi all,

My company just bought some Intel x520 10GbE cards which I recently
installed into our Oracle EBS database servers (IBM 3850 X5s running RHEL
5.8). As the "linux guy" I have been tasked with getting these servers to
communicate with our NetApp 6080s via NFS over the new 10GbE links. I have
got everything working; however, even after tuning the RHEL kernel I am only
getting 160MB/s writes using the "dd if=/dev/zero of=/mnt/testfile bs=1024
count=5242880" command. For you folks that run 10GbE to your toasters,
what write speeds are you seeing from your 10GbE connected servers? Did
you have to do any tuning in order to get the best results possible? If so
what did you change?

Thanks!

Dan



_______________________________________________
Toasters mailing list
Toasters [at] teaparty
http://www.teaparty.net/mailman/listinfo/toasters


rmcdermo at fhcrc

May 19, 2012, 11:08 AM

Post #2 of 23 (6357 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Your block size is only 1K; try increasing the block size and the throughput will increase. 1K IOs would generate a lot of IOPs with very little throughput.

-Robert
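
A minimal sketch of the block-size arithmetic behind this advice. The function name write_calls is hypothetical; it only counts the write() calls dd issues for a given total size and block size. It does not model what goes over the wire: as far as I know, the Linux NFS client gathers buffered writes in the page cache and sends them in wsize-sized RPCs, so the on-wire operation count can differ.

```shell
# write_calls TOTAL_BYTES BLOCK_BYTES
# Prints how many write() calls dd needs to emit TOTAL_BYTES using
# blocks of BLOCK_BYTES (rounded up).
write_calls() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

total=$((5 * 1024 * 1024 * 1024))   # 5368709120 bytes, as in the original test

write_calls "$total" 1024       # bs=1024 -> 5242880
write_calls "$total" 65536      # bs=64k  -> 81920
write_calls "$total" 1048576    # bs=1M   -> 5120
```

Going from bs=1024 to bs=1M cuts the number of system calls for the same 5GB file by a factor of 1024.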



speedtoys.racing at gmail

May 19, 2012, 11:16 AM

Post #3 of 23 (6356 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

What happens if you run _2_ dd sessions?


dd, copy, etc. are not magic I/O applications; they have single-thread
performance limits.

No reason you cant




--
---
Gustatus Similis Pullus


scl at virginia

May 19, 2012, 11:17 AM

Post #4 of 23 (6354 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Hi Dan,

Your test is a single process running a single thread. I suggest
running 10 dd jobs in parallel, writing to different files. And
as another guy suggested, also increase the block size, such as
bs=20480. That ought to drive up the total network throughput!


Steve Losen scl [at] virginia phone: 434-924-0640

University of Virginia ITC Unix Support
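
The parallel test suggested above could be sketched roughly as follows. The function name parallel_dd and the log file names are hypothetical; the target directory, job count, block size and count are passed in so that nothing is written until the function is called.

```shell
# parallel_dd DIR JOBS BS COUNT
# Starts JOBS background dd writers, each writing its own file under DIR,
# and waits for all of them. Each dd's summary lands in DIR/dd.N.log.
parallel_dd() {
    dir=$1 jobs=$2 bs=$3 count=$4
    i=1
    while [ "$i" -le "$jobs" ]; do
        dd if=/dev/zero of="$dir/testfile.$i" bs="$bs" count="$count" \
            2>"$dir/dd.$i.log" &
        i=$((i + 1))
    done
    wait    # block until every background dd has finished
}

# Example against the NFS mount from this thread (ten jobs, bs=20480):
#   parallel_dd /mnt 10 20480 262144
```

The figure of interest is the sum of the per-job rates in the log files, not any single job's rate.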




speedtoys.racing at gmail

May 19, 2012, 11:42 AM

Post #5 of 23 (6353 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Saturating 10GbE on a 6080 is a feat. :)






--
---
Gustatus Similis Pullus


dburklan at NMDP

May 19, 2012, 11:46 AM

Post #6 of 23 (6366 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

I know dd isn't the best tool since it is a single threaded application
and in no way represents the workload that Oracle will impose. However, I
thought it would still give me a decent ballpark figure regarding
throughput. I tried a block size of 64k, 128k, and 1M (just to see) and
got a bit more promising results:

# dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s

If I run two of these dd sessions at once the throughput figure above gets
cut in half (each dd session reports it creates the file at around
100MB/s).
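
To put the dd figure in context, here is a small sketch that recomputes the rate from the byte count and elapsed time, then expresses it as a share of the raw 10GbE line rate (10 Gbit/s = 1250 MB/s, ignoring Ethernet, IP, TCP and NFS overhead). The function names are hypothetical.

```shell
# mbps BYTES SECONDS -> decimal MB/s, the unit dd reports
mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1e6 }'
}

# pct_of_line_rate MBPS LINK_GBITS -> percentage of raw link bandwidth
pct_of_line_rate() {
    awk -v m="$1" -v g="$2" 'BEGIN { printf "%.1f\n", 100 * m / (g * 1000 / 8) }'
}

mbps 5368709120 26.6878        # 201.2
pct_of_line_rate 201.2 10      # 16.1
```

Two sessions at around 100MB/s each leave the aggregate at roughly the same 200 MB/s, so the ceiling here looks like a shared one rather than a per-process one.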

As far as the switch goes, I have not checked it yet; however, I did notice
that flow control is set to full on the 6080 10GbE interfaces. We are also
running jumbo frames on all of the involved equipment.

As far as the RHEL OS tweaks go, here are the settings that I have changed
on the system:

###
/etc/sysctl.conf:

# 10GbE Kernel Parameters
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 262144 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
#

###

###
/etc/modprobe.d/sunrpc.conf:


options sunrpc tcp_slot_table_entries=128

###


###
Mount options for the NetApp test NFS share:

rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys

###

Thanks again for all of your quick and detailed responses!


Dan
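
One way to confirm that settings like the ones above actually took effect is to read them back from /proc/sys, which is where sysctl keys live (dots become slashes). This is a sketch only; the helper names sysctl_path and show_sysctl are hypothetical, and the sunrpc entry is only present once the sunrpc module is loaded.

```shell
# sysctl_path KEY -> the /proc/sys file that backs a sysctl key
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

# show_sysctl KEY... -> print "key = value" for each key that exists
show_sysctl() {
    for key in "$@"; do
        path=$(sysctl_path "$key")
        if [ -r "$path" ]; then
            # multi-value entries such as tcp_rmem are tab-separated
            echo "$key = $(tr '\t' ' ' < "$path")"
        else
            echo "$key: not present (module not loaded?)"
        fi
    done
}

show_sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem \
            net.ipv4.tcp_wmem sunrpc.tcp_slot_table_entries
```

Comparing this output before and after a reboot shows whether /etc/sysctl.conf and the modprobe option were both picked up.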





dalvenjah at DAL

May 19, 2012, 11:50 AM

Post #7 of 23 (6463 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Don't forget that in normal circumstances on Linux, you're funneling this NFS traffic through the single RPC channel and the single TCP connection to the NetApp, even if you use multiple mount points. (I can't wait for pNFS to be finalized and fully implemented.)

I've attached some sysctl tweaks we put on our high-NFS (non-Oracle) linux systems that may work (though please test, as your mileage may vary, and not all of these may be appropriate for your environment). They may not be appropriate for an Oracle box, too, so please use caution. The changes that are probably the most safe are raising the limits and raising the sunrpc table values; you probably don't want to modify the TCP settings without consulting your DBAs or Oracle support.

Note that the last two sysctls have to be done after the sunrpc kernel module loads, but before the nfs module loads, in order to take effect. You might have to throw those two into an init script to get it to occur in the right order.

You also might want to do some sequential throughput tests with iozone (test 0 and test 1 and the -t flag I think) and multiple (4, 8, or more) processes with larger (4k+) block sizes; but even so, what you're seeing may be the upper end for that sort of tool.

These tweaks help a bit, but at least in terms of Oracle, we've found that:

1) Even with normal Linux NFS, Oracle spawns enough threads that in general it will get better iops and throughput than most dd (sequential) or iozone test operations.
2) Oracle seems to have enough of a mini-io-subsystem that it gets better efficiencies than an everyday command like dd operating on a mount point
3) If you can get your DBAs to look into using Oracle DirectNFS, your setup will scream; DirectNFS establishes TCP connections straight between Oracle and the NetApp, uses multiple connections, and has its own caching and I/O subsystem that Oracle knows about and will benefit from. When we tested on trunked 1GbE links (which I know don't add up to n times 1Gb of bandwidth), we ended up saturating interfaces; from what I understand DirectNFS will do even better on a 10GbE network. You can also use dual networks (like dual-fabric SAN) to have Oracle load-balance properly over multiple links, instead of doing normal LACP trunking and having it only be able to push a single 10GbE link's worth of bandwidth. At that point your bottleneck should be the disks behind the NetApp.

Good luck, and remember, test on a dev system first!

-dalvenjah


# NFS tweaks here
# Raise generic socket memory usability, and start 'em big
net.core.rmem_default=524288
net.core.wmem_default=524288
net.core.rmem_max=16777216
net.core.wmem_max=16777216
# Raise tcp memory usability too
net.ipv4.tcp_rmem=4096 524288 16777216
net.ipv4.tcp_wmem=4096 524288 16777216
# raise the amount of memory for the fragmentation reassembly buffer
# (if it goes above high_thresh, kernel starts tossing packets until usage
# goes below low_thresh)
net.ipv4.ipfrag_high_thresh=524288
net.ipv4.ipfrag_low_thresh=393216
# turn off tcp timestamps (extra CPU hit) since this is likely a
# non-public server
net.ipv4.tcp_timestamps=0
# make sure window scaling is on
net.ipv4.tcp_window_scaling=1
# increase the number of option memory buffers
net.core.optmem_max=524287
# raise the max backlog of packets on a net device
net.core.netdev_max_backlog=2500
# max out the number of task request slots in the RPC code
sunrpc.tcp_slot_table_entries=128
sunrpc.udp_slot_table_entries=128
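
The load ordering described in this post (sunrpc first, then the two sunrpc sysctls, then nfs) could be sketched as an init-script fragment like the one below. The function name load_nfs_with_slots and the DRY_RUN switch are hypothetical; by default it only prints the commands, and it has to run as root to do anything real.

```shell
# load_nfs_with_slots [ENTRIES]
# With DRY_RUN=1 (the default here) prints the steps; with DRY_RUN=0 runs
# them: load sunrpc, set both slot tables, then load nfs.
load_nfs_with_slots() {
    entries=${1:-128}
    for cmd in \
        "modprobe sunrpc" \
        "sysctl -w sunrpc.tcp_slot_table_entries=$entries" \
        "sysctl -w sunrpc.udp_slot_table_entries=$entries" \
        "modprobe nfs"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd || return 1    # stop at the first failing step
        fi
    done
}

load_nfs_with_slots 128    # prints the four commands in order
```

Whether the nfs module load or the mount is the real cut-off point is something to verify on the target kernel before relying on this ordering.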



dalvenjah at DAL

May 19, 2012, 12:01 PM

Post #8 of 23 (6373 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

One more thing -- I know this may spawn debate, and I didn't do it terribly scientifically, but a year or two ago we tested jumbo frames with 1GbE links, between a pair of 6070s and a couple of boxes with Intel server NICs in them, and we actually found performance to get worse (though not very much) with jumbo frames than with normal 1500-byte frames. It certainly didn't improve performance.

My theory is that all the ASIC TCP checksum offloading and such is so optimized for 1500-byte packets, that when you get up to the 9000-byte frame size, things have to go back to software, and you don't end up with a speed boost.

I could be wrong, but you might want to test with 1500-byte MTU set on both ends, just to see.

-dalvenjah
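
When comparing 1500-byte and 9000-byte MTUs, it helps to first confirm that the chosen frame size really passes end to end. A common check is a ping with fragmentation prohibited and a payload sized to fill the MTU exactly: MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header. The sketch below only does the arithmetic; the function name icmp_payload and the host name filer1 in the comments are hypothetical, and the ping flags are the ones from Linux iputils.

```shell
# icmp_payload MTU -> largest ping payload that fits one unfragmented IPv4 packet
icmp_payload() {
    echo $(( $1 - 20 - 8 ))   # minus IPv4 header (20) and ICMP header (8)
}

icmp_payload 9000   # 8972
icmp_payload 1500   # 1472

# Example use (hypothetical filer name; bond1 is the interface named later
# in this thread):
#   ping -M do -c 3 -s "$(icmp_payload 9000)" filer1
#   ip link set dev bond1 mtu 1500
```

If the 8972-byte ping fails while the 1472-byte one succeeds, some hop is not passing jumbo frames and the throughput comparison is not measuring what it appears to.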



dburklan at NMDP

May 19, 2012, 12:14 PM

Post #9 of 23 (6449 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Good to know. I spent most of my day testing yesterday and noticed that
jumbo frames helped a little bit but not a whole lot. Here are my results:

1) Equipment & Lab Environment Configuration
* NFS Volume = 1 x 25GB NFS share (aggr1) on 6080 #1 (oplocks disabled)
  * Mounted to test server with the RHEL default mount options which are:
    (rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys)
  * Volume was mounted to "/mnt" on the test server

* Server Hardware = 1 x 3850 X5 with the following specs:
  * HOSTNAME: testserver
  * CPU: 2 x Intel E7540 @ 2.00GHz
  * RAM: 128GB
  * Local HDD: 2x146GB (RAID1)
  * 1GbE NIC: Intel Quad-port 1GbE 82580
  * 10GbE NIC: Intel 10GbE x520 SFP+ PCIe 2.0 card

* Server OS
  * Version: RHEL 5.7+ with the "kernel-2.6.18-308.4.1.el5" (5.8) kernel -
    required by the Intel 10GbE NIC
  * Each NIC had two active ports which were combined to form an
    active/passive bond known as "bond1"


3) testserver to 6080 #1 Network Throughput Test with 5GB file creation
(dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880)
* 1GbE w/o Jumbo Frames (old storage network that talks to 2x1GbE
  interfaces on 6080 #1) = 5368709120 bytes (5.4 GB) copied, 99.0798
  seconds, 54.2 MB/s
* 1GbE w/o Jumbo Frames (new storage network that talks to 2x10GbE
  interfaces on 6080 #1) = 5368709120 bytes (5.4 GB) copied, 70.8844
  seconds, 75.7 MB/s
* 1GbE + Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 67.1208 seconds,
  80.0 MB/s
* 10GbE w/o Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 45.9469
  seconds, 117 MB/s
* 10GbE + Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 38.8961
  seconds, 138 MB/s

4) testserver to 6080 #1 Network Throughput Test + RHEL OS tweaking with
5GB file creation (dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880)
* 10GbE + Jumbo Frames with kernel defaults plus:
  * sunrpc kernel module parameter change = 5368709120 bytes (5.4 GB)
    copied, 39.0274 seconds, 138 MB/s
    * Echoed "options sunrpc tcp_slot_table_entries=128" to
      "/etc/modprobe.d/sunrpc.conf" according to
      https://access.redhat.com/knowledge/solutions/69275 & rebooted
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.core.rmem_default = 262144" = 5368709120 bytes (5.4 GB) copied,
    39.0149 seconds, 138 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.core.rmem_max = 16777216" = 5368709120 bytes (5.4 GB) copied,
    32.7018 seconds, 164 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.core.wmem_default = 262144" = 5368709120 bytes (5.4 GB) copied,
    33.245 seconds, 161 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.core.wmem_max = 16777216" = 5368709120 bytes (5.4 GB) copied,
    33.2526 seconds, 161 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_rmem = 4096 262144 16777216" = 5368709120 bytes (5.4 GB)
    copied, 35.2615 seconds, 152 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_wmem = 4096 262144 16777216" = 5368709120 bytes (5.4 GB)
    copied, 33.0321 seconds, 163 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_window_scaling = 1" = 5368709120 bytes (5.4 GB) copied,
    33.5698 seconds, 160 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_syncookies = 0" = 5368709120 bytes (5.4 GB) copied,
    32.7373 seconds, 164 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_timestamps = 0" = 5368709120 bytes (5.4 GB) copied,
    34.0019 seconds, 158 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "net.ipv4.tcp_sack = 0" = 5368709120 bytes (5.4 GB) copied, 35.3956
    seconds, 152 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "cpuspeed" service stopped = 5368709120 bytes (5.4 GB) copied, 32.8168
    seconds, 164 MB/s
* 10GbE + Jumbo Frames with kernel defaults, previous change(s), plus:
  * "irqbalance" service stopped = 5368709120 bytes (5.4 GB) copied,
    32.6841 seconds, 164 MB/s


Regarding the test server itself, nothing is running on it except for my
test processes (it is a test box that has the exact same hardware
configuration as the Oracle EBS DB servers)

Regards,

Dan Burkland
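
For runs like the ones listed above, the rate can be recomputed straight from saved dd summary lines, which makes it easier to compare results to one decimal place. This is a sketch that assumes the dd output format shown in this thread ("N bytes (...) copied, S seconds, R MB/s"); the function name dd_rate is hypothetical.

```shell
# dd_rate: read dd summary lines on stdin, print recomputed decimal MB/s.
# Field 1 is the byte count; the elapsed seconds sit three fields from the end.
dd_rate() {
    awk '/bytes .* copied/ { printf "%.1f\n", $1 / $(NF - 3) / 1e6 }'
}

dd_rate <<'EOF'
5368709120 bytes (5.4 GB) copied, 45.9469 seconds, 117 MB/s
5368709120 bytes (5.4 GB) copied, 38.8961 seconds, 138 MB/s
5368709120 bytes (5.4 GB) copied, 32.6841 seconds, 164 MB/s
EOF
```

The three sample lines are the 10GbE results without jumbo frames, with jumbo frames, and with all of the OS tweaks applied.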





ag at anexia

May 19, 2012, 12:14 PM

Post #10 of 23 (6363 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

BTDT :-)

980MB/sec write on a full rack of SAS drives



speedtoys.racing at gmail

May 19, 2012, 12:44 PM

Post #11 of 23 (6346 views)
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Easy one.

If it dropped by half, adjust your kernel TCP slot count.





andrey.borzenkov at ts

May 19, 2012, 1:16 PM

Post #12 of 23 (6365 views)
Permalink
RE: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

>
> Note that the last two sysctls have to be done after the sunrpc kernel
> module loads, but before the nfs module loads in order to take effect.
> You might have to throw those two into an init script to get it to
> occur in the right order.
[...]
> # max out the number of task request slots in the RPC code
> sunrpc.tcp_slot_table_entries=128
> sunrpc.udp_slot_table_entries=128
>

Are you sure about that? We have always used it in an NFS-root environment, where the full NFS stack is loaded from within the initrd before /etc/sysctl.conf gets a chance to be processed. AFAIR it must be set before the file system is mounted, as it is a per-mounted-filesystem parameter.

BTW, we also use multiple mount points even though they physically point to the same exported volume, although this is more relevant for high-throughput environments than for a single-threaded app.
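If the value really is picked up at mount time, a check just before mounting is a cheap safeguard. The sketch below reads the live value from /proc/sys/sunrpc/tcp_slot_table_entries, which only exists once the sunrpc module is loaded; the function names and the messages are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# procfs location of the sysctl; present only once the sunrpc module is loaded.
SLOT_PATH = Path("/proc/sys/sunrpc/tcp_slot_table_entries")

def read_slot_entries(path: Path = SLOT_PATH) -> Optional[int]:
    """Return the current slot table size, or None if the sysctl is absent."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None

def check_before_mount(wanted: int = 128, path: Path = SLOT_PATH) -> str:
    """Describe whether the slot table is large enough to go ahead and mount."""
    current = read_slot_entries(path)
    if current is None:
        return "sunrpc not loaded yet; the sysctl cannot be set"
    if current < wanted:
        return f"slot table is {current}, raise it to {wanted} before mounting"
    return f"slot table is {current}, ok to mount"

print(check_before_mount())
```

Running this from the same init script that performs the mount would show whether the setting was applied in the right order.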



dburklan at NMDP

May 19, 2012, 2:36 PM

Post #13 of 23 (6340 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Jeff Mother - Which specific setting are you referring to?

I installed iozone on my test machine and am currently running the
following iozone command on it:

iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv -F
tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18

I'll post the results once it is finished
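One detail in the command above: it asks for 16 processes with -t but lists 19 names after -F, and as far as I know the iozone documentation says the number of names should match the number of processes. A small sketch that builds the same command with a matching file list; the CSV path is a placeholder and not the one used in this thread.

```python
import shlex

def iozone_cmd(threads: int, size: str = "5g", record: str = "1m",
               csv_path: str = "/root/iozone_results.csv") -> list:
    """Build an iozone throughput-mode command with one -F file per process."""
    files = [f"t{i}" for i in range(1, threads + 1)]
    return ["iozone", "-s", size, "-r", record, "-t", str(threads),
            "-R", "-b", csv_path, "-F", *files]

cmd = iozone_cmd(16)
print(shlex.join(cmd))
```

The file names are relative, so the command should be run from a directory on the NFS mount being tested.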

Dan





dburklan at NMDP

May 19, 2012, 2:43 PM

Post #14 of 23 (6370 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Here are the IOZone results:

Run began: Sat May 19 16:22:46 2012

File size set to 5242880 KB
Record Size 1024 KB
Excel chart generation enabled
Command line used: iozone -s 5g -r 1m -t 16 -R -b
/root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
t11 t12 t13 t14 t15 t16 t17 t18
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 processes
Each process writes a 5242880 Kbyte file in 1024 Kbyte records

Children see throughput for 16 initial writers = 371306.91 KB/sec
Parent sees throughput for 16 initial writers = 167971.82 KB/sec
Min throughput per process = 21901.84 KB/sec
Max throughput per process = 25333.62 KB/sec
Avg throughput per process = 23206.68 KB/sec
Min xfer = 4533248.00 KB

Children see throughput for 16 rewriters = 350486.11 KB/sec
Parent sees throughput for 16 rewriters = 176947.47 KB/sec
Min throughput per process = 21154.26 KB/sec
Max throughput per process = 23011.69 KB/sec
Avg throughput per process = 21905.38 KB/sec
Min xfer = 4819968.00 KB

362 MB/s looks quite a bit higher; however, can somebody validate that I am
reading these results correctly? Should I also run "iozone" with the -a
(auto) option for good measure?
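As a cross-check on that reading, the "Children see throughput" line should be the sum of the per-process rates, so 16 times the per-process average ought to land on the same number. The sketch assumes iozone's Kbytes are 1024 bytes. The lower "Parent sees" figure is presumably the parent's wall-clock view, which would also include startup and waiting for the slowest child, but that is my interpretation and not something confirmed in this thread.

```python
KIB_PER_MIB = 1024

children_total = 371306.91   # "Children see throughput", 16 initial writers
avg_per_process = 23206.68   # "Avg throughput per process"
processes = 16

# The children figure should equal the sum of the per-process rates.
reconstructed = avg_per_process * processes
mib_s = children_total / KIB_PER_MIB

print(f"{mib_s:.1f} MiB/s aggregate; {processes} x avg = {reconstructed:.2f} KB/s")
```

The two numbers agree to within rounding, so about 362 MB/s aggregate across the 16 writers is a fair reading of the initial-write result.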

Thanks again for all of your responses, I greatly appreciate it!


Dan




speedtoys.racing at gmail

May 19, 2012, 2:45 PM

Post #15 of 23 (6349 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Check Andrey's email in this thread.

Sent from my iPhone



speedtoys.racing at gmail

May 19, 2012, 2:48 PM

Post #16 of 23 (6336 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

At that rate you're approaching write saturation for your box.

Pull reads now.



Sent from my iPhone



dburklan at NMDP

May 19, 2012, 3:32 PM

Post #17 of 23 (6348 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

I unmounted the NFS share and rebooted the box before running the same
"iozone" command again. This time I let "iozone" run through all of its
tests (including the read-based ones).


Run began: Sat May 19 16:46:27 2012

File size set to 5242880 KB
Record Size 1024 KB
Excel chart generation enabled
Command line used: iozone -s 5g -r 1m -t 16 -R -b
/root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
t11 t12 t13 t14 t15 t16 t17 t18
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 processes
Each process writes a 5242880 Kbyte file in 1024 Kbyte records

Children see throughput for 16 initial writers = 349500.55 KB/sec
Parent sees throughput for 16 initial writers = 173837.26 KB/sec
Min throughput per process = 21147.24 KB/sec
Max throughput per process = 22701.06 KB/sec
Avg throughput per process = 21843.78 KB/sec
Min xfer = 4884480.00 KB

Children see throughput for 16 rewriters = 372333.90 KB/sec
Parent sees throughput for 16 rewriters = 179256.38 KB/sec
Min throughput per process = 22495.20 KB/sec
Max throughput per process = 24418.89 KB/sec
Avg throughput per process = 23270.87 KB/sec
Min xfer = 4830208.00 KB

Children see throughput for 16 readers = 440115.98 KB/sec
Parent sees throughput for 16 readers = 439993.44 KB/sec
Min throughput per process = 26406.17 KB/sec
Max throughput per process = 28724.05 KB/sec
Avg throughput per process = 27507.25 KB/sec
Min xfer = 4819968.00 KB

Children see throughput for 16 re-readers = 8953522.06 KB/sec
Parent sees throughput for 16 re-readers = 8930475.33 KB/sec
Min throughput per process = 408033.34 KB/sec
Max throughput per process = 671821.62 KB/sec
Avg throughput per process = 559595.13 KB/sec
Min xfer = 3186688.00 KB

Children see throughput for 16 reverse readers = 5543829.37 KB/sec
Parent sees throughput for 16 reverse readers = 5425986.47 KB/sec
Min throughput per process = 15684.29 KB/sec
Max throughput per process = 2261884.25 KB/sec
Avg throughput per process = 346489.34 KB/sec
Min xfer = 36864.00 KB

Children see throughput for 16 stride readers = 16532117.19 KB/sec
Parent sees throughput for 16 stride readers = 16272131.55 KB/sec
Min throughput per process = 257097.92 KB/sec
Max throughput per process = 2256125.75 KB/sec
Avg throughput per process = 1033257.32 KB/sec
Min xfer = 602112.00 KB

Children see throughput for 16 random readers = 17297437.81 KB/sec
Parent sees throughput for 16 random readers = 16871312.92 KB/sec
Min throughput per process = 320909.25 KB/sec
Max throughput per process = 2083737.75 KB/sec
Avg throughput per process = 1081089.86 KB/sec
Min xfer = 826368.00 KB

Children see throughput for 16 mixed workload = 10747970.97 KB/sec
Parent sees throughput for 16 mixed workload = 112898.07 KB/sec
Min throughput per process = 54960.62 KB/sec
Max throughput per process = 1991637.38 KB/sec
Avg throughput per process = 671748.19 KB/sec
Min xfer = 145408.00 KB

Children see throughput for 16 random writers = 358103.29 KB/sec
Parent sees throughput for 16 random writers = 166805.09 KB/sec
Min throughput per process = 21263.60 KB/sec
Max throughput per process = 22942.70 KB/sec
Avg throughput per process = 22381.46 KB/sec
Min xfer = 4859904.00 KB

Children see throughput for 16 pwrite writers = 325666.64 KB/sec
Parent sees throughput for 16 pwrite writers = 177771.50 KB/sec
Min throughput per process = 19902.90 KB/sec
Max throughput per process = 20863.29 KB/sec
Avg throughput per process = 20354.17 KB/sec
Min xfer = 5008384.00 KB

Children see throughput for 16 pread readers = 445021.47 KB/sec
Parent sees throughput for 16 pread readers = 444618.25 KB/sec
Min throughput per process = 26932.47 KB/sec
Max throughput per process = 28361.61 KB/sec
Avg throughput per process = 27813.84 KB/sec
Min xfer = 4981760.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 1024 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 349500.55

" Rewrite " 372333.90

" Read " 440115.98

" Re-read " 8953522.06

" Reverse Read " 5543829.37

" Stride read " 16532117.19

" Random read " 17297437.81

" Mixed workload " 10747970.97

" Random write " 358103.29

" Pwrite " 325666.64

" Pread " 445021.47
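Several of the read figures in the summary above are higher than a 10GbE link can carry, so they were presumably served from the client's page cache and not from the filer. A short sketch that flags them, assuming a single 10 Gbit/s link and 1 KB = 1024 bytes:

```python
# 10 Gbit/s expressed in iozone's unit (KB/s, taking 1 KB = 1024 bytes).
LINE_RATE_KB_S = 10e9 / 8 / 1024

results_kb_s = {   # copied from the throughput report above
    "Initial write": 349500.55,
    "Rewrite": 372333.90,
    "Read": 440115.98,
    "Re-read": 8953522.06,
    "Reverse Read": 5543829.37,
    "Stride read": 16532117.19,
    "Random read": 17297437.81,
    "Mixed workload": 10747970.97,
    "Random write": 358103.29,
    "Pwrite": 325666.64,
    "Pread": 445021.47,
}

def exceeds_wire(kb_s: float, limit: float = LINE_RATE_KB_S) -> bool:
    """True when a result is faster than the link could possibly deliver."""
    return kb_s > limit

suspect = [name for name, rate in results_kb_s.items() if exceeds_wire(rate)]
print("Faster than the wire:", ", ".join(suspect))
```

Only the write, read, and pread rows stay under the wire limit, so those are the ones that say something about the network and the filer; the rest mostly measure client memory.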


Regards,

Dan


On 5/19/12 4:48 PM, "Jeff Mother" <speedtoys.racing [at] gmail> wrote:

>You're now approaching storage write saturation for your box on writes at
>that rate.
>
>Pull reads now.
>
>
>
>Sent from my iPhone
>
>On May 19, 2012, at 2:43 PM, Dan Burkland <dburklan [at] NMDP> wrote:
>
>> Here are the IOZone results:
>>
>> Run began: Sat May 19 16:22:46 2012
>>
>> File size set to 5242880 KB
>> Record Size 1024 KB
>> Excel chart generation enabled
>> Command line used: iozone -s 5g -r 1m -t 16 -R -b
>> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9
>>t10
>> t11 t12 t13 t14 t15 t16 t17 t18
>> Output is in Kbytes/sec
>> Time Resolution = 0.000001 seconds.
>> Processor cache size set to 1024 Kbytes.
>> Processor cache line size set to 32 bytes.
>> File stride size set to 17 * record size.
>> Throughput test with 16 processes
>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>>
>> Children see throughput for 16 initial writers = 371306.91 KB/sec
>> Parent sees throughput for 16 initial writers = 167971.82 KB/sec
>> Min throughput per process = 21901.84 KB/sec
>> Max throughput per process = 25333.62 KB/sec
>> Avg throughput per process = 23206.68 KB/sec
>> Min xfer = 4533248.00 KB
>>
>> Children see throughput for 16 rewriters = 350486.11 KB/sec
>> Parent sees throughput for 16 rewriters = 176947.47 KB/sec
>> Min throughput per process = 21154.26 KB/sec
>> Max throughput per process = 23011.69 KB/sec
>> Avg throughput per process = 21905.38 KB/sec
>> Min xfer = 4819968.00 KB
>>
>> 362MB/s looks quite a bit higher however can somebody validate that I am
>> reading these results correctly? Should I also run "iozone" with the -a
>> (auto) option for good measure?
>>
>> Thanks again for all of your responses, I greatly appreciate it!
>>
>>
>> Dan
>>
>>
>> On 5/19/12 4:36 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>>
>>> Jeff Mother - Which specific setting are you referring to?
>>>
>>> I installed iozone on my test machine and am currently running the
>>> following iozone command on it:
>>>
>>> iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv -F
>>> tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
>>>
>>> I'll post the results once it is finished
>>>
>>> Dan
>>>
>>>
>>>
>>> On 5/19/12 2:44 PM, "Jeff Mother" <speedtoys.racing [at] gmail> wrote:
>>>
>>>> Easy one.
>>>>
>>>> If it went down in half, adjust your kernel tcp slot count.
>>>>
>>>>
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On May 19, 2012, at 11:46 AM, Dan Burkland <dburklan [at] NMDP> wrote:
>>>>
>>>>> I know dd isn't the best tool since it is a single threaded
>>>>>application
>>>>> and in no way represents the workload that Oracle will impose.
>>>>>However,
>>>>> I
>>>>> thought it would still give me a decent ballpark figure regarding
>>>>> throughput. I tried a block size of 64k, 128k, and 1M (just to see)
>>>>>and
>>>>> got a bit more promising results:
>>>>>
>>>>> # dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
>>>>> 5120+0 records in
>>>>> 5120+0 records out
>>>>> 5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s
>>>>>
>>>>> If I run two of these dd sessions at once the throughput figure above
>>>>> gets
>>>>> cut in half (each dd session reports it creates the file at around
>>>>> 100MB/s).
>>>>>
>>>>> As far as the switch goes, I have not checked it yet however I did
>>>>> notice
>>>>> that flow control is set to full on the 6080 10GbE interfaces. We are
>>>>> also
>>>>> running Jumbo Frames on all of the involved equipment.
>>>>>
>>>>> As far as the RHEL OS tweaks go, here are the settings that I have
>>>>> changed
>>>>> on the system:
>>>>>
>>>>> ###
>>>>> /etc/sysctl.conf:
>>>>>
>>>>> # 10GbE Kernel Parameters
>>>>> net.core.rmem_default = 262144
>>>>> net.core.rmem_max = 16777216
>>>>> net.core.wmem_default = 262144
>>>>> net.core.wmem_max = 16777216
>>>>> net.ipv4.tcp_rmem = 4096 262144 16777216
>>>>> net.ipv4.tcp_wmem = 4096 262144 16777216
>>>>> net.ipv4.tcp_window_scaling = 1
>>>>> net.ipv4.tcp_syncookies = 0
>>>>> net.ipv4.tcp_timestamps = 0
>>>>> net.ipv4.tcp_sack = 0
>>>>> #
>>>>>
>>>>> ###
>>>>>
>>>>> ###
>>>>> /etc/modprobe.d/sunrpc.conf:
>>>>>
>>>>>
>>>>> options sunrpc tcp_slot_table_entries=128
>>>>>
>>>>> ###
>>>>>
>>>>>
>>>>> ###
>>>>> Mount options for the NetApp test NFS share:
>>>>>
>>>>>
>>>>>
>>>>> rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys
>>>>>
>>>>> ###
>>>>>
>>>>> Thanks again for all of your quick and detailed responses!
>>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>
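
As a sanity check on the dd figure quoted above, the reported rate can be recomputed from the byte count and elapsed time and set against the nominal 10GbE line rate. This is only a sketch: the 1250 MB/s ceiling is the raw 10 Gbit/s figure in decimal megabytes and ignores Ethernet, IP, TCP, and NFS overhead, and the helper names are made up for illustration.

```python
# Recompute the dd throughput from the numbers dd printed.
# dd reports decimal megabytes (1 MB = 1,000,000 bytes).

# Nominal 10 Gbit/s link expressed in MB/s, with no protocol overhead (assumption).
LINE_RATE_MB_S = 10e9 / 8 / 1e6


def dd_rate_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal MB/s, the unit dd prints."""
    return total_bytes / seconds / 1e6


def link_utilisation(rate_mb_s: float) -> float:
    """Fraction of the nominal line rate (hypothetical helper)."""
    return rate_mb_s / LINE_RATE_MB_S


# Figures taken from the dd output quoted in the thread.
rate = dd_rate_mb_s(5368709120, 26.6878)
print(f"{rate:.0f} MB/s, {link_utilisation(rate):.0%} of nominal 10GbE")
```

A single dd stream at this rate uses well under a quarter of the link, which fits the advice elsewhere in the thread to test with a multi-threaded tool.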


Peter.Learmonth at netapp

May 20, 2012, 10:10 AM

Post #18 of 23 (6369 views)
Permalink
RE: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

If you want a simple, highly tunable IO generation tool, check out NetApp's own SIO in the NOW toolchest:
http://support.netapp.com/eservice/toolchest?toolid=418
http://www.netapp.com/go/techontap/tot-march2006/0306tot_monthlytoolSIO.html


If you're suspicious of using a vendor provided tool, the source is included in the download. ;-)

Share and enjoy!

Peter



dburklan at NMDP

May 20, 2012, 2:12 PM

Post #19 of 23 (6354 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Regarding the latest "iozone" results, are these more in the ballpark
of what I should be seeing? Also, why is the re-read throughput
roughly 20x the initial read speed? Would this be caching on the
NFS client side or some sort of caching done by the PAM card on the 6080?
(Should I be running these tests with the "-I", or "Direct IO", argument to
bypass any possible local caching mechanisms?)

Thanks again!

Dan
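
One way to reason about the re-read question is to compare the aggregate figures with what the wire can carry. The sketch below assumes a single 10GbE path between client and filer (the thread does not say how many links are in use) and treats iozone's "Kbytes" as 1024-byte units; the function name is hypothetical. Data served by the filer, whether from disk or from the PAM card, still has to cross the network, so any figure above line rate must include reads satisfied from the client's own cache.

```python
# Compare iozone aggregate throughput with the capacity of one 10GbE link.

# Nominal 10 Gbit/s in bytes/sec, ignoring protocol overhead (assumption).
LINK_BYTES_PER_SEC = 10e9 / 8


def exceeds_link(kbytes_per_sec: float) -> bool:
    """True if the rate is higher than a single 10GbE link could deliver."""
    return kbytes_per_sec * 1024 > LINK_BYTES_PER_SEC


# "Children see throughput" values from the buffered iozone run in this thread.
read_kb_s = 440115.98
reread_kb_s = 8953522.06

print(f"re-read / read = {reread_kb_s / read_kb_s:.1f}x")
print("read exceeds link:   ", exceeds_link(read_kb_s))
print("re-read exceeds link:", exceeds_link(reread_kb_s))
```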


On 5/19/12 5:32 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:

>I unmounted the NFS share and rebooted the box before running the same
>"iozone" command again. This time I let "iozone" run through all of its
>tests (including the read-based ones).
>
>
>Run began: Sat May 19 16:46:27 2012
>
> File size set to 5242880 KB
> Record Size 1024 KB
> Excel chart generation enabled
> Command line used: iozone -s 5g -r 1m -t 16 -R -b /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
> Output is in Kbytes/sec
> Time Resolution = 0.000001 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
> Throughput test with 16 processes
> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>
> Children see throughput for 16 initial writers = 349500.55 KB/sec
> Parent sees throughput for 16 initial writers = 173837.26 KB/sec
> Min throughput per process = 21147.24 KB/sec
> Max throughput per process = 22701.06 KB/sec
> Avg throughput per process = 21843.78 KB/sec
> Min xfer = 4884480.00 KB
>
> Children see throughput for 16 rewriters = 372333.90 KB/sec
> Parent sees throughput for 16 rewriters = 179256.38 KB/sec
> Min throughput per process = 22495.20 KB/sec
> Max throughput per process = 24418.89 KB/sec
> Avg throughput per process = 23270.87 KB/sec
> Min xfer = 4830208.00 KB
>
> Children see throughput for 16 readers = 440115.98 KB/sec
> Parent sees throughput for 16 readers = 439993.44 KB/sec
> Min throughput per process = 26406.17 KB/sec
> Max throughput per process = 28724.05 KB/sec
> Avg throughput per process = 27507.25 KB/sec
> Min xfer = 4819968.00 KB
>
> Children see throughput for 16 re-readers = 8953522.06 KB/sec
> Parent sees throughput for 16 re-readers = 8930475.33 KB/sec
> Min throughput per process = 408033.34 KB/sec
> Max throughput per process = 671821.62 KB/sec
> Avg throughput per process = 559595.13 KB/sec
> Min xfer = 3186688.00 KB
>
> Children see throughput for 16 reverse readers = 5543829.37 KB/sec
> Parent sees throughput for 16 reverse readers = 5425986.47 KB/sec
> Min throughput per process = 15684.29 KB/sec
> Max throughput per process = 2261884.25 KB/sec
> Avg throughput per process = 346489.34 KB/sec
> Min xfer = 36864.00 KB
>
> Children see throughput for 16 stride readers = 16532117.19 KB/sec
> Parent sees throughput for 16 stride readers = 16272131.55 KB/sec
> Min throughput per process = 257097.92 KB/sec
> Max throughput per process = 2256125.75 KB/sec
> Avg throughput per process = 1033257.32 KB/sec
> Min xfer = 602112.00 KB
>
> Children see throughput for 16 random readers = 17297437.81 KB/sec
> Parent sees throughput for 16 random readers = 16871312.92 KB/sec
> Min throughput per process = 320909.25 KB/sec
> Max throughput per process = 2083737.75 KB/sec
> Avg throughput per process = 1081089.86 KB/sec
> Min xfer = 826368.00 KB
>
> Children see throughput for 16 mixed workload = 10747970.97 KB/sec
> Parent sees throughput for 16 mixed workload = 112898.07 KB/sec
> Min throughput per process = 54960.62 KB/sec
> Max throughput per process = 1991637.38 KB/sec
> Avg throughput per process = 671748.19 KB/sec
> Min xfer = 145408.00 KB
>
> Children see throughput for 16 random writers = 358103.29 KB/sec
> Parent sees throughput for 16 random writers = 166805.09 KB/sec
> Min throughput per process = 21263.60 KB/sec
> Max throughput per process = 22942.70 KB/sec
> Avg throughput per process = 22381.46 KB/sec
> Min xfer = 4859904.00 KB
>
> Children see throughput for 16 pwrite writers = 325666.64 KB/sec
> Parent sees throughput for 16 pwrite writers = 177771.50 KB/sec
> Min throughput per process = 19902.90 KB/sec
> Max throughput per process = 20863.29 KB/sec
> Avg throughput per process = 20354.17 KB/sec
> Min xfer = 5008384.00 KB
>
> Children see throughput for 16 pread readers = 445021.47 KB/sec
> Parent sees throughput for 16 pread readers = 444618.25 KB/sec
> Min throughput per process = 26932.47 KB/sec
> Max throughput per process = 28361.61 KB/sec
> Avg throughput per process = 27813.84 KB/sec
> Min xfer = 4981760.00 KB
>
>
>
>"Throughput report Y-axis is type of test X-axis is number of processes"
>"Record size = 1024 Kbytes "
>"Output is in Kbytes/sec"
>
>" Initial write " 349500.55
>
>" Rewrite " 372333.90
>
>" Read " 440115.98
>
>" Re-read " 8953522.06
>
>" Reverse Read " 5543829.37
>
>" Stride read " 16532117.19
>
>" Random read " 17297437.81
>
>" Mixed workload " 10747970.97
>
>" Random write " 358103.29
>
>" Pwrite " 325666.64
>
>" Pread " 445021.47
>
>
>Regards,
>
>Dan
>
>
>On 5/19/12 4:48 PM, "Jeff Mother" <speedtoys.racing [at] gmail> wrote:
>
>>You're now approaching storage write saturation for your box on writes at
>>that rate.
>>
>>Pull reads now.
>>
>>
>>
>>Sent from my iPhone
>>
>>On May 19, 2012, at 2:43 PM, Dan Burkland <dburklan [at] NMDP> wrote:
>>
>>> Here are the IOZone results:
>>>
>>> Run began: Sat May 19 16:22:46 2012
>>>
>>> File size set to 5242880 KB
>>> Record Size 1024 KB
>>> Excel chart generation enabled
>>> Command line used: iozone -s 5g -r 1m -t 16 -R -b /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
>>> Output is in Kbytes/sec
>>> Time Resolution = 0.000001 seconds.
>>> Processor cache size set to 1024 Kbytes.
>>> Processor cache line size set to 32 bytes.
>>> File stride size set to 17 * record size.
>>> Throughput test with 16 processes
>>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>>>
>>> Children see throughput for 16 initial writers = 371306.91 KB/sec
>>> Parent sees throughput for 16 initial writers = 167971.82 KB/sec
>>> Min throughput per process = 21901.84 KB/sec
>>> Max throughput per process = 25333.62 KB/sec
>>> Avg throughput per process = 23206.68 KB/sec
>>> Min xfer = 4533248.00 KB
>>>
>>> Children see throughput for 16 rewriters = 350486.11 KB/sec
>>> Parent sees throughput for 16 rewriters = 176947.47 KB/sec
>>> Min throughput per process = 21154.26 KB/sec
>>> Max throughput per process = 23011.69 KB/sec
>>> Avg throughput per process = 21905.38 KB/sec
>>> Min xfer = 4819968.00 KB
>>>
>>> 362MB/s looks quite a bit higher; however, can somebody validate that I am
>>> reading these results correctly? Should I also run "iozone" with the -a
>>> (auto) option for good measure?
>>>
>>> Thanks again for all of your responses, I greatly appreciate it!
>>>
>>>
>>> Dan
>>>
>>>
>>> On 5/19/12 4:36 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>>>
>>>> Jeff Mother - Which specific setting are you referring to?
>>>>
>>>> I installed iozone on my test machine and am currently running the
>>>> following iozone command on it:
>>>>
>>>> iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
>>>>
>>>> I'll post the results once it is finished
>>>>
>>>> Dan
>>>>
>>>>
>>>>


speedtoys.racing at gmail

May 20, 2012, 2:17 PM

Post #20 of 23 (6344 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Re-read is served from:

Host system cache
NetApp system cache (or PAM)

Direct I/O will bypass host caching, yup.






--
---
Gustatus Similis Pullus


dburklan at NMDP

May 20, 2012, 3:53 PM

Post #21 of 23 (6344 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Here are the results with Direct I/O enabled:

Run began: Sun May 20 16:21:12 2012

File size set to 5242880 KB
Record Size 1024 KB
Excel chart generation enabled
O_DIRECT feature enabled
Command line used: iozone -s 5g -r 1m -t 16 -R -b /root/iozone_mn4s31063_2012-05-d.csv -I -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 16 processes
Each process writes a 5242880 Kbyte file in 1024 Kbyte records

Children see throughput for 16 initial writers = 262467.29 KB/sec
Parent sees throughput for 16 initial writers = 260324.76 KB/sec
Min throughput per process = 16309.72 KB/sec
Max throughput per process = 16546.15 KB/sec
Avg throughput per process = 16404.21 KB/sec
Min xfer = 5168128.00 KB

Children see throughput for 16 rewriters = 251104.65 KB/sec
Parent sees throughput for 16 rewriters = 251090.95 KB/sec
Min throughput per process = 15546.73 KB/sec
Max throughput per process = 15832.99 KB/sec
Avg throughput per process = 15694.04 KB/sec
Min xfer = 5148672.00 KB

Children see throughput for 16 readers = 619751.30 KB/sec
Parent sees throughput for 16 readers = 619581.97 KB/sec
Min throughput per process = 36595.70 KB/sec
Max throughput per process = 39467.45 KB/sec
Avg throughput per process = 38734.46 KB/sec
Min xfer = 4861952.00 KB

Children see throughput for 16 re-readers = 626421.73 KB/sec
Parent sees throughput for 16 re-readers = 626354.38 KB/sec
Min throughput per process = 37853.47 KB/sec
Max throughput per process = 40021.52 KB/sec
Avg throughput per process = 39151.36 KB/sec
Min xfer = 4959232.00 KB

Children see throughput for 16 reverse readers = 462712.64 KB/sec
Parent sees throughput for 16 reverse readers = 462649.29 KB/sec
Min throughput per process = 27713.84 KB/sec
Max throughput per process = 29794.67 KB/sec
Avg throughput per process = 28919.54 KB/sec
Min xfer = 4877312.00 KB

Children see throughput for 16 stride readers = 520482.83 KB/sec
Parent sees throughput for 16 stride readers = 520448.31 KB/sec
Min throughput per process = 31892.69 KB/sec
Max throughput per process = 33016.53 KB/sec
Avg throughput per process = 32530.18 KB/sec
Min xfer = 5064704.00 KB

Children see throughput for 16 random readers = 544089.98 KB/sec
Parent sees throughput for 16 random readers = 544055.32 KB/sec
Min throughput per process = 33799.79 KB/sec
Max throughput per process = 34304.76 KB/sec
Avg throughput per process = 34005.62 KB/sec
Min xfer = 5166080.00 KB

Children see throughput for 16 mixed workload = 365865.06 KB/sec
Parent sees throughput for 16 mixed workload = 352394.93 KB/sec
Min throughput per process = 22250.01 KB/sec
Max throughput per process = 23576.78 KB/sec
Avg throughput per process = 22866.57 KB/sec
Min xfer = 4947968.00 KB

Children see throughput for 16 random writers = 230192.41 KB/sec
Parent sees throughput for 16 random writers = 229237.34 KB/sec
Min throughput per process = 14307.92 KB/sec
Max throughput per process = 14463.50 KB/sec
Avg throughput per process = 14387.03 KB/sec
Min xfer = 5186560.00 KB

Children see throughput for 16 pwrite writers = 197020.59 KB/sec
Parent sees throughput for 16 pwrite writers = 195973.16 KB/sec
Min throughput per process = 12265.62 KB/sec
Max throughput per process = 12394.86 KB/sec
Avg throughput per process = 12313.79 KB/sec
Min xfer = 5188608.00 KB

Children see throughput for 16 pread readers = 578525.04 KB/sec
Parent sees throughput for 16 pread readers = 578418.73 KB/sec
Min throughput per process = 33046.61 KB/sec
Max throughput per process = 38253.89 KB/sec
Avg throughput per process = 36157.82 KB/sec
Min xfer = 4530176.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 1024 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 262467.29

" Rewrite " 251104.65

" Read " 619751.30

" Re-read " 626421.73

" Reverse Read " 462712.64

" Stride read " 520482.83

" Random read " 544089.98

" Mixed workload " 365865.06

" Random write " 230192.41

" Pwrite " 197020.59

" Pread " 578525.04

The read results definitely look more believable now. Are these results
more in line with what I should be seeing? Tomorrow I am going to try and
rule the switches out of the equation by running "netperf" between my two
10GbE test systems.

Dan
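
To put the Direct I/O numbers above in context, here is a minimal sketch that converts the iozone aggregate figures into MiB/s and Gbit/s and compares them with the nominal 10GbE line rate. It assumes a single 10GbE link and ignores Ethernet/IP/TCP/NFS overhead, so the percentages are rough upper-bound comparisons rather than a measure of efficiency. The function names are made up for illustration.

```python
# iozone reports throughput in Kbytes/sec, with 1 Kbyte = 1024 bytes
# (the run above maps "-s 5g" to 5242880 KB, which is 5 * 1024 * 1024).
BYTES_PER_KB = 1024
NOMINAL_10GBE_BITS_PER_SEC = 10e9  # assumption: one link, raw line rate


def to_mib_per_sec(kb_per_sec):
    """Convert iozone KB/sec to MiB/sec."""
    return kb_per_sec * BYTES_PER_KB / 2**20


def to_gbit_per_sec(kb_per_sec):
    """Convert iozone KB/sec to decimal gigabits per second."""
    return kb_per_sec * BYTES_PER_KB * 8 / 1e9


def line_rate_fraction(kb_per_sec):
    """Fraction of the nominal 10GbE line rate used by this throughput."""
    return kb_per_sec * BYTES_PER_KB * 8 / NOMINAL_10GBE_BITS_PER_SEC


# "Children see throughput" values copied from the Direct I/O run above.
direct_io_results = {
    "Initial write": 262467.29,
    "Random write": 230192.41,
    "Read": 619751.30,
    "Random read": 544089.98,
}

for name, kb in direct_io_results.items():
    print(
        f"{name:14s} {to_mib_per_sec(kb):7.1f} MiB/s "
        f"{to_gbit_per_sec(kb):5.2f} Gbit/s "
        f"{line_rate_fraction(kb):6.1%} of nominal 10GbE"
    )
```

On these figures the sequential read is roughly half of the nominal line rate and the sequential write roughly a fifth, which is consistent with the link itself not being the limiting factor.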

From: Jeff Mohler <speedtoys.racing [at] gmail>
Date: Sun, 20 May 2012 16:17:41 -0500
To: Dan Burkland <dburklan [at] nmdp>
Cc: "toasters [at] teaparty" <toasters [at] teaparty>
Subject: Re: Poor NFS 10GbE performance on NetApp 6080s


Re-read is from:

Host system cache
Netapp system cache (or pam)

Direct will bypass host caching..yup.



On Sun, May 20, 2012 at 2:12 PM, Dan Burkland <dburklan [at] nmdp> wrote:

With regard to the latest "iozone" results, are these more in the ball park
of what I should be seeing? Also, why is the re-read throughput value
roughly 20x that of the initial read speed? Would this be caching on the
NFS client side or some sort of caching done by the PAM card on the 6080?
(Should I be running these tests with the "-I" or "Direct IO" argument to
bypass any possible local caching mechanisms?)

Thanks again!

Dan
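
One quick way to tell client-side caching from filer-side caching is to check whether a reported rate could have crossed the wire at all. The sketch below does that for the buffered run quoted below, assuming a single 10GbE link between client and filer (the thread does not say whether links are bonded). Anything above the nominal line rate cannot have come over NFS and must have been served from the host's page cache.

```python
BYTES_PER_KB = 1024                      # iozone's Kbyte
NOMINAL_10GBE_BYTES_PER_SEC = 10e9 / 8   # 1.25e9 bytes/sec per link


def exceeds_line_rate(kb_per_sec, links=1):
    """True if the rate is higher than `links` x 10GbE could carry."""
    return kb_per_sec * BYTES_PER_KB > NOMINAL_10GBE_BYTES_PER_SEC * links


# "Children see throughput" values from the buffered (no -I) run.
initial_read = 440115.98
re_read = 8953522.06

ratio = re_read / initial_read
print(f"re-read / read ratio: {ratio:.1f}x")
print("initial read exceeds line rate:", exceeds_line_rate(initial_read))
print("re-read exceeds line rate:     ", exceeds_line_rate(re_read))
```

The ratio comes out at about 20x, matching the observation above, and only the re-read figure is over the line rate, which points at the NFS client cache rather than the PAM card.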


On 5/19/12 5:32 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:

>I unmounted the NFS share and rebooted the box before running the same
>"iozone" command again. This time I let "iozone" run through all of its
>tests (including the read-based ones).
>
>
>Run began: Sat May 19 16:46:27 2012
>
> File size set to 5242880 KB
> Record Size 1024 KB
> Excel chart generation enabled
> Command line used: iozone -s 5g -r 1m -t 16 -R -b
> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
> t11 t12 t13 t14 t15 t16 t17 t18
> Output is in Kbytes/sec
> Time Resolution = 0.000001 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
> Throughput test with 16 processes
> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>
> Children see throughput for 16 initial writers = 349500.55 KB/sec
> Parent sees throughput for 16 initial writers = 173837.26 KB/sec
> Min throughput per process = 21147.24 KB/sec
> Max throughput per process = 22701.06 KB/sec
> Avg throughput per process = 21843.78 KB/sec
> Min xfer = 4884480.00 KB
>
> Children see throughput for 16 rewriters = 372333.90 KB/sec
> Parent sees throughput for 16 rewriters = 179256.38 KB/sec
> Min throughput per process = 22495.20 KB/sec
> Max throughput per process = 24418.89 KB/sec
> Avg throughput per process = 23270.87 KB/sec
> Min xfer = 4830208.00 KB
>
> Children see throughput for 16 readers = 440115.98 KB/sec
> Parent sees throughput for 16 readers = 439993.44 KB/sec
> Min throughput per process = 26406.17 KB/sec
> Max throughput per process = 28724.05 KB/sec
> Avg throughput per process = 27507.25 KB/sec
> Min xfer = 4819968.00 KB
>
> Children see throughput for 16 re-readers = 8953522.06 KB/sec
> Parent sees throughput for 16 re-readers = 8930475.33 KB/sec
> Min throughput per process = 408033.34 KB/sec
> Max throughput per process = 671821.62 KB/sec
> Avg throughput per process = 559595.13 KB/sec
> Min xfer = 3186688.00 KB
>
> Children see throughput for 16 reverse readers = 5543829.37 KB/sec
> Parent sees throughput for 16 reverse readers = 5425986.47 KB/sec
> Min throughput per process = 15684.29 KB/sec
> Max throughput per process = 2261884.25 KB/sec
> Avg throughput per process = 346489.34 KB/sec
> Min xfer = 36864.00 KB
>
> Children see throughput for 16 stride readers = 16532117.19 KB/sec
> Parent sees throughput for 16 stride readers = 16272131.55 KB/sec
> Min throughput per process = 257097.92 KB/sec
> Max throughput per process = 2256125.75 KB/sec
> Avg throughput per process = 1033257.32 KB/sec
> Min xfer = 602112.00 KB
>
> Children see throughput for 16 random readers = 17297437.81 KB/sec
> Parent sees throughput for 16 random readers = 16871312.92 KB/sec
> Min throughput per process = 320909.25 KB/sec
> Max throughput per process = 2083737.75 KB/sec
> Avg throughput per process = 1081089.86 KB/sec
> Min xfer = 826368.00 KB
>
> Children see throughput for 16 mixed workload = 10747970.97 KB/sec
> Parent sees throughput for 16 mixed workload = 112898.07 KB/sec
> Min throughput per process = 54960.62 KB/sec
> Max throughput per process = 1991637.38 KB/sec
> Avg throughput per process = 671748.19 KB/sec
> Min xfer = 145408.00 KB
>
> Children see throughput for 16 random writers = 358103.29 KB/sec
> Parent sees throughput for 16 random writers = 166805.09 KB/sec
> Min throughput per process = 21263.60 KB/sec
> Max throughput per process = 22942.70 KB/sec
> Avg throughput per process = 22381.46 KB/sec
> Min xfer = 4859904.00 KB
>
> Children see throughput for 16 pwrite writers = 325666.64 KB/sec
> Parent sees throughput for 16 pwrite writers = 177771.50 KB/sec
> Min throughput per process = 19902.90 KB/sec
> Max throughput per process = 20863.29 KB/sec
> Avg throughput per process = 20354.17 KB/sec
> Min xfer = 5008384.00 KB
>
> Children see throughput for 16 pread readers = 445021.47 KB/sec
> Parent sees throughput for 16 pread readers = 444618.25 KB/sec
> Min throughput per process = 26932.47 KB/sec
> Max throughput per process = 28361.61 KB/sec
> Avg throughput per process = 27813.84 KB/sec
> Min xfer = 4981760.00 KB
>
>
>
>"Throughput report Y-axis is type of test X-axis is number of processes"
>"Record size = 1024 Kbytes "
>"Output is in Kbytes/sec"
>
>" Initial write " 349500.55
>
>" Rewrite " 372333.90
>
>" Read " 440115.98
>
>" Re-read " 8953522.06
>
>" Reverse Read " 5543829.37
>
>" Stride read " 16532117.19
>
>" Random read " 17297437.81
>
>" Mixed workload " 10747970.97
>
>" Random write " 358103.29
>
>" Pwrite " 325666.64
>
>" Pread " 445021.47
>
>
>Regards,
>
>Dan
>
>
>On 5/19/12 4:48 PM, "Jeff Mohler" <speedtoys.racing [at] gmail> wrote:
>
>>You're now approaching storage write saturation for your box on writes at
>>that rate.
>>
>>Pull reads now.
>>
>>
>>
>>Sent from my iPhone
>>
>>On May 19, 2012, at 2:43 PM, Dan Burkland <dburklan [at] NMDP> wrote:
>>
>>> Here are the IOZone results:
>>>
>>> Run began: Sat May 19 16:22:46 2012
>>>
>>> File size set to 5242880 KB
>>> Record Size 1024 KB
>>> Excel chart generation enabled
>>> Command line used: iozone -s 5g -r 1m -t 16 -R -b
>>> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
>>> t11 t12 t13 t14 t15 t16 t17 t18
>>> Output is in Kbytes/sec
>>> Time Resolution = 0.000001 seconds.
>>> Processor cache size set to 1024 Kbytes.
>>> Processor cache line size set to 32 bytes.
>>> File stride size set to 17 * record size.
>>> Throughput test with 16 processes
>>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>>>
>>> Children see throughput for 16 initial writers = 371306.91 KB/sec
>>> Parent sees throughput for 16 initial writers = 167971.82 KB/sec
>>> Min throughput per process = 21901.84 KB/sec
>>> Max throughput per process = 25333.62 KB/sec
>>> Avg throughput per process = 23206.68 KB/sec
>>> Min xfer = 4533248.00 KB
>>>
>>> Children see throughput for 16 rewriters = 350486.11 KB/sec
>>> Parent sees throughput for 16 rewriters = 176947.47 KB/sec
>>> Min throughput per process = 21154.26 KB/sec
>>> Max throughput per process = 23011.69 KB/sec
>>> Avg throughput per process = 21905.38 KB/sec
>>> Min xfer = 4819968.00 KB
>>>
>>> 362MB/s looks quite a bit higher; however, can somebody validate that I am
>>> reading these results correctly? Should I also run "iozone" with the -a
>>> (auto) option for good measure?
>>>
>>> Thanks again for all of your responses, I greatly appreciate it!
>>>
>>>
>>> Dan
>>>
>>>
>>> On 5/19/12 4:36 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>>>
>>>> Jeff Mohler - Which specific setting are you referring to?
>>>>
>>>> I installed iozone on my test machine and am currently running the
>>>> following iozone command on it:
>>>>
>>>> iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv -F
>>>> tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
>>>>
>>>> I'll post the results once it is finished
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 5/19/12 2:44 PM, "Jeff Mohler" <speedtoys.racing [at] gmail> wrote:
>>>>
>>>>> Easy one.
>>>>>
>>>>> If it went down in half, adjust your kernel tcp slot count.
>>>>>
>>>>>
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On May 19, 2012, at 11:46 AM, Dan Burkland <dburklan [at] NMDP> wrote:
>>>>>
>>>>>> I know dd isn't the best tool since it is a single-threaded application
>>>>>> and in no way represents the workload that Oracle will impose. However, I
>>>>>> thought it would still give me a decent ballpark figure regarding
>>>>>> throughput. I tried block sizes of 64k, 128k, and 1M (just to see) and
>>>>>> got somewhat more promising results:
>>>>>>
>>>>>> # dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
>>>>>> 5120+0 records in
>>>>>> 5120+0 records out
>>>>>> 5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s
>>>>>>
>>>>>> If I run two of these dd sessions at once, the throughput figure above
>>>>>> gets cut in half (each dd session reports it creates the file at around
>>>>>> 100MB/s).
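
As a cross-check on the dd figures, the sketch below recomputes the reported rate from the byte count and elapsed time. GNU dd prints decimal megabytes (1 MB = 1,000,000 bytes), so its "201 MB/s" is a slightly larger number than the same rate in MiB/s, which is the unit iozone's KB/sec figures divide into. The two-session case uses the approximate 100 MB/s per session quoted above.

```python
def dd_rate_bytes_per_sec(bytes_copied, seconds):
    """Throughput of one dd run, in bytes per second."""
    return bytes_copied / seconds


# Values from the dd output above: bs=1M count=5120.
bytes_copied = 5368709120
elapsed = 26.6878

rate = dd_rate_bytes_per_sec(bytes_copied, elapsed)
print(f"single dd: {rate / 1e6:.0f} MB/s (decimal), {rate / 2**20:.1f} MiB/s")

# Two concurrent sessions at roughly 100 MB/s each: the aggregate is
# about the same as one session, i.e. no scaling with a second writer.
aggregate_two_sessions = 2 * 100e6
print(f"two dd sessions, aggregate: {aggregate_two_sessions / 1e6:.0f} MB/s")
print(f"scaling vs. one session: {aggregate_two_sessions / rate:.2f}x")
```

An aggregate that stays flat when a second writer is added suggests a shared limit somewhere in the path, which is what prompted the slot-table advice in the reply.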
>>>>>>
>>>>>> As far as the switch goes, I have not checked it yet; however, I did
>>>>>> notice that flow control is set to full on the 6080 10GbE interfaces. We
>>>>>> are also running Jumbo Frames on all of the involved equipment.
>>>>>>
>>>>>> As far as the RHEL OS tweaks go, here are the settings that I have
>>>>>> changed on the system:
>>>>>>
>>>>>> ###
>>>>>> /etc/sysctl.conf:
>>>>>>
>>>>>> # 10GbE Kernel Parameters
>>>>>> net.core.rmem_default = 262144
>>>>>> net.core.rmem_max = 16777216
>>>>>> net.core.wmem_default = 262144
>>>>>> net.core.wmem_max = 16777216
>>>>>> net.ipv4.tcp_rmem = 4096 262144 16777216
>>>>>> net.ipv4.tcp_wmem = 4096 262144 16777216
>>>>>> net.ipv4.tcp_window_scaling = 1
>>>>>> net.ipv4.tcp_syncookies = 0
>>>>>> net.ipv4.tcp_timestamps = 0
>>>>>> net.ipv4.tcp_sack = 0
>>>>>> #
>>>>>>
>>>>>> ###
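
For a rough sense of whether these socket buffer sizes could limit a single TCP connection, the sketch below applies the bandwidth-delay product: a connection needs about bandwidth x round-trip time of buffer to keep the pipe full. The round-trip times used are hypothetical, since the thread does not report a measured RTT, and real throughput depends on more than buffer size.

```python
LINK_BITS_PER_SEC = 10e9  # nominal 10GbE


def bdp_bytes(bits_per_sec, rtt_seconds):
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return bits_per_sec * rtt_seconds / 8


def max_rtt_for_buffer(buffer_bytes, bits_per_sec):
    """Largest RTT (seconds) at which this buffer can still fill the link."""
    return buffer_bytes * 8 / bits_per_sec


# Default and maximum values from the tcp_rmem / tcp_wmem lines above.
default_buf = 262144
max_buf = 16777216

# Hypothetical LAN round-trip times, for illustration only.
for rtt_ms in (0.1, 0.5, 1.0):
    need = bdp_bytes(LINK_BITS_PER_SEC, rtt_ms / 1000)
    print(f"RTT {rtt_ms} ms -> about {need / 1024:.0f} KiB in flight needed")

print(f"default buffer fills the link up to "
      f"{max_rtt_for_buffer(default_buf, LINK_BITS_PER_SEC) * 1000:.2f} ms RTT")
print(f"max buffer fills the link up to "
      f"{max_rtt_for_buffer(max_buf, LINK_BITS_PER_SEC) * 1000:.2f} ms RTT")
```

Under these assumptions the 16 MiB maximum is far larger than a low-latency LAN needs, so the buffer ceilings are unlikely to be the constraint here.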
>>>>>>
>>>>>> ###
>>>>>> /etc/modprobe.d/sunrpc.conf:
>>>>>>
>>>>>>
>>>>>> options sunrpc tcp_slot_table_entries=128
>>>>>>
>>>>>> ###
>>>>>>
>>>>>>
>>>>>> ###
>>>>>> Mount options for the NetApp test NFS share:
>>>>>>
>>>>>>
>>>>>>
>>>>>> rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys
>>>>>>
>>>>>> ###
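
The slot table and the mount's wsize together bound how much write data the client can have outstanding on one transport: at most slots x wsize bytes. The sketch below works that out for the settings above and turns it into a throughput ceiling via Little's law (in-flight bytes divided by per-RPC latency). The latency values are hypothetical, and the 16-slot comparison is an assumption included only to show the effect of a smaller table.

```python
def max_in_flight_bytes(slots, io_size):
    """Upper bound on outstanding RPC payload on one transport."""
    return slots * io_size


def throughput_ceiling(slots, io_size, rpc_latency_seconds):
    """Little's law: bytes/sec cannot exceed in-flight bytes / latency."""
    return max_in_flight_bytes(slots, io_size) / rpc_latency_seconds


wsize = 65536            # from the mount options above
configured_slots = 128   # tcp_slot_table_entries from sunrpc.conf above
small_table = 16         # assumption: a smaller table, for comparison only

print(f"in flight with {configured_slots} slots: "
      f"{max_in_flight_bytes(configured_slots, wsize) / 2**20:.0f} MiB")

# Hypothetical per-RPC write latencies, for illustration only.
for latency_ms in (1, 5, 10):
    for slots in (small_table, configured_slots):
        ceiling = throughput_ceiling(slots, wsize, latency_ms / 1000)
        print(f"{slots:3d} slots @ {latency_ms:2d} ms: "
              f"{ceiling / 2**20:7.0f} MiB/s ceiling")
```

If the module option did not take effect (for example because sunrpc was already loaded), the effective ceiling would be the smaller table's, so it is worth confirming the live value under /proc/sys/sunrpc.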
>>>>>>
>>>>>> Thanks again for all of your quick and detailed responses!
>>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/19/12 1:08 PM, "Robert McDermott" <rmcdermo [at] fhcrc> wrote:
>>>>>>
>>>>>>> Your block size is only 1K; try increasing the block size and the
>>>>>>> throughput will increase. 1K IOs would generate a lot of IOPs with very
>>>>>>> little throughput.
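
To illustrate the block-size point, the sketch below counts how many write() calls dd has to issue to move the same 5 GiB at each block size tried in this thread. It counts system calls only; with buffered I/O the NFS client may coalesce small writes into larger wsize-sized requests, so this is not a count of operations on the wire.

```python
def write_calls(total_bytes, block_size):
    """Number of write() calls needed to move total_bytes (ceiling division)."""
    return -(-total_bytes // block_size)


total_bytes = 5 * 2**30  # 5 GiB, as in bs=1024 count=5242880

for label, bs in (("1K", 1024), ("64K", 65536), ("128K", 131072), ("1M", 2**20)):
    print(f"bs={label:>4}: {write_calls(total_bytes, bs):>9,d} write() calls")

# At the 160 MB/s originally reported with bs=1024, that is this many
# calls per second from a single-threaded process:
calls_per_sec = 160e6 / 1024
print(f"about {calls_per_sec:,.0f} write() calls/sec at bs=1K")
```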
>>>>>>>
>>>>>>> -Robert
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On May 19, 2012, at 10:48, Dan Burkland <dburklan [at] NMDP> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> My company just bought some Intel x520 10GbE cards which I recently
>>>>>>>> installed into our Oracle EBS database servers (IBM 3850 X5s running RHEL
>>>>>>>> 5.8). As the "linux guy" I have been tasked with getting these servers to
>>>>>>>> communicate with our NetApp 6080s via NFS over the new 10GbE links. I have
>>>>>>>> got everything working; however, even after tuning the RHEL kernel I am only
>>>>>>>> getting 160MB/s writes using the "dd if=/dev/zero of=/mnt/testfile bs=1024
>>>>>>>> count=5242880" command. For you folks that run 10GbE to your toasters,
>>>>>>>> what write speeds are you seeing from your 10GbE connected servers? Did
>>>>>>>> you have to do any tuning in order to get the best results possible? If so
>>>>>>>> what did you change?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Dan

--
---
Gustatus Similis Pullus




speedtoys.racing at gmail

May 20, 2012, 4:26 PM

Post #22 of 23 (6334 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Would have to see perfstat (statit and a few other things) to know..but you
are much more in line with reality...you're square in the ballpark.



On Sun, May 20, 2012 at 3:53 PM, Dan Burkland <dburklan [at] nmdp> wrote:

> The read results definitely look more believable now. Are these results
> more in line with what I should be seeing? Tomorrow I am going to try and
> rule the switches out of the equation by running "netperf" between my two
> 10GbE test systems.
>
> Dan
>
> From: Jeff Mohler <speedtoys.racing [at] gmail>
> Date: Sun, 20 May 2012 16:17:41 -0500
> To: Dan Burkland <dburklan [at] nmdp>
> Cc: "toasters [at] teaparty" <toasters [at] teaparty>
> Subject: Re: Poor NFS 10GbE performance on NetApp 6080s
>
>
> Re-read is from:
>
> Host system cache
> Netapp system cache (or pam)
>
> Direct will bypass host caching..yup.
>
>
>
> On Sun, May 20, 2012 at 2:12 PM, Dan Burkland <dburklan [at] nmdp> wrote:
>
> In regards to the latest "iozone" results, are these more in the ball park
> of what I should be seeing? Also why is the re-read throughput value
> roughly 20x that of the initial read speed? Would this be caching on the
> NFS client side or some sort of caching done by the PAM card on the 6080?
> (Should I be running these tests with the "-I" or "Direct IO" argument to
> bypass any possible local caching mechanisms?"
>
> Thanks again!
>
> Dan
>
>
> On 5/19/12 5:32 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>
> >I unmounted the NFS share and rebooted the box before running the same
> >"iozone" command again. This time I let "iozone" run through all of its
> >test (including the read-based ones)
> >
> >
> >Run began: Sat May 19 16:46:27 2012
> >
> > File size set to 5242880 KB
> > Record Size 1024 KB
> > Excel chart generation enabled
> > Command line used: iozone -s 5g -r 1m -t 16 -R -b
> >/root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
> >t11 t12 t13 t14 t15 t16 t17 t18
> > Output is in Kbytes/sec
> > Time Resolution = 0.000001 seconds.
> > Processor cache size set to 1024 Kbytes.
> > Processor cache line size set to 32 bytes.
> > File stride size set to 17 * record size.
> > Throughput test with 16 processes
> > Each process writes a 5242880 Kbyte file in 1024 Kbyte records
> >
> > Children see throughput for 16 initial writers = 349500.55
> >KB/sec
> > Parent sees throughput for 16 initial writers = 173837.26
> >KB/sec
> > Min throughput per process = 21147.24
> >KB/sec
> > Max throughput per process = 22701.06
> >KB/sec
> > Avg throughput per process = 21843.78
> >KB/sec
> > Min xfer = 4884480.00 KB
> >
> > Children see throughput for 16 rewriters = 372333.90
> >KB/sec
> > Parent sees throughput for 16 rewriters = 179256.38
> >KB/sec
> > Min throughput per process = 22495.20
> >KB/sec
> > Max throughput per process = 24418.89
> >KB/sec
> > Avg throughput per process = 23270.87
> >KB/sec
> > Min xfer = 4830208.00 KB
> >
> > Children see throughput for 16 readers = 440115.98
> >KB/sec
> > Parent sees throughput for 16 readers = 439993.44
> >KB/sec
> > Min throughput per process = 26406.17
> >KB/sec
> > Max throughput per process = 28724.05
> >KB/sec
> > Avg throughput per process = 27507.25
> >KB/sec
> > Min xfer = 4819968.00 KB
> >
> > Children see throughput for 16 re-readers = 8953522.06
> >KB/sec
> > Parent sees throughput for 16 re-readers = 8930475.33
> >KB/sec
> > Min throughput per process = 408033.34
> >KB/sec
> > Max throughput per process = 671821.62
> >KB/sec
> > Avg throughput per process = 559595.13
> >KB/sec
> > Min xfer = 3186688.00 KB
> >
> > Children see throughput for 16 reverse readers = 5543829.37
> >KB/sec
> > Parent sees throughput for 16 reverse readers = 5425986.47
> >KB/sec
> > Min throughput per process = 15684.29
> >KB/sec
> > Max throughput per process = 2261884.25
> >KB/sec
> > Avg throughput per process = 346489.34
> >KB/sec
> > Min xfer = 36864.00 KB
> >
> > Children see throughput for 16 stride readers = 16532117.19
> >KB/sec
> > Parent sees throughput for 16 stride readers = 16272131.55
> >KB/sec
> > Min throughput per process = 257097.92
> >KB/sec
> > Max throughput per process = 2256125.75
> >KB/sec
> > Avg throughput per process = 1033257.32
> >KB/sec
> > Min xfer = 602112.00 KB
> >
> > Children see throughput for 16 random readers = 17297437.81
> >KB/sec
> > Parent sees throughput for 16 random readers = 16871312.92
> >KB/sec
> > Min throughput per process = 320909.25
> >KB/sec
> > Max throughput per process = 2083737.75
> >KB/sec
> > Avg throughput per process = 1081089.86
> >KB/sec
> > Min xfer = 826368.00 KB
> >
> > Children see throughput for 16 mixed workload = 10747970.97
> >KB/sec
> > Parent sees throughput for 16 mixed workload = 112898.07
> >KB/sec
> > Min throughput per process = 54960.62
> >KB/sec
> > Max throughput per process = 1991637.38
> >KB/sec
> > Avg throughput per process = 671748.19
> >KB/sec
> > Min xfer = 145408.00 KB
> >
> > Children see throughput for 16 random writers = 358103.29
> >KB/sec
> > Parent sees throughput for 16 random writers = 166805.09
> >KB/sec
> > Min throughput per process = 21263.60
> >KB/sec
> > Max throughput per process = 22942.70
> >KB/sec
> > Avg throughput per process = 22381.46
> >KB/sec
> > Min xfer = 4859904.00 KB
> >
> > Children see throughput for 16 pwrite writers = 325666.64
> >KB/sec
> > Parent sees throughput for 16 pwrite writers = 177771.50
> >KB/sec
> > Min throughput per process = 19902.90
> >KB/sec
> > Max throughput per process = 20863.29
> >KB/sec
> > Avg throughput per process = 20354.17
> >KB/sec
> > Min xfer = 5008384.00 KB
> >
> > Children see throughput for 16 pread readers = 445021.47
> >KB/sec
> > Parent sees throughput for 16 pread readers = 444618.25
> >KB/sec
> > Min throughput per process = 26932.47
> >KB/sec
> > Max throughput per process = 28361.61
> >KB/sec
> > Avg throughput per process = 27813.84
> >KB/sec
> > Min xfer = 4981760.00 KB
> >
> >
> >
> >"Throughput report Y-axis is type of test X-axis is number of processes"
> >"Record size = 1024 Kbytes "
> >"Output is in Kbytes/sec"
> >
> >" Initial write " 349500.55
> >
> >" Rewrite " 372333.90
> >
> >" Read " 440115.98
> >
> >" Re-read " 8953522.06
> >
> >" Reverse Read " 5543829.37
> >
> >" Stride read " 16532117.19
> >
> >" Random read " 17297437.81
> >
> >" Mixed workload " 10747970.97
> >
> >" Random write " 358103.29
> >
> >" Pwrite " 325666.64
> >
> >" Pread " 445021.47
> >
> >
> >Regards,
> >
> >Dan
> >
> >
> >On 5/19/12 4:48 PM, "Jeff Mother" <speedtoys.racing [at] gmail> wrote:
> >
> >>You're now approaching storage write saturation for your box on writes at
> >>that rate.
> >>
> >>Pull reads now.
> >>
> >>
> >>
> >>Sent from my iPhone
> >>
> >>On May 19, 2012, at 2:43 PM, Dan Burkland <dburklan [at] NMDP> wrote:
> >>
> >>> Here are the IOZone results:
> >>>
> >>> Run began: Sat May 19 16:22:46 2012
> >>>
> >>> File size set to 5242880 KB
> >>> Record Size 1024 KB
> >>> Excel chart generation enabled
> >>> Command line used: iozone -s 5g -r 1m -t 16 -R -b
> >>> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9
> >>>t10
> >>> t11 t12 t13 t14 t15 t16 t17 t18
> >>> Output is in Kbytes/sec
> >>> Time Resolution = 0.000001 seconds.
> >>> Processor cache size set to 1024 Kbytes.
> >>> Processor cache line size set to 32 bytes.
> >>> File stride size set to 17 * record size.
> >>> Throughput test with 16 processes
> >>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
> >>>
> >>> Children see throughput for 16 initial writers = 371306.91
> >>>KB/sec
> >>> Parent sees throughput for 16 initial writers = 167971.82 KB/sec
> >>> Min throughput per process = 21901.84 KB/sec
> >>> Max throughput per process = 25333.62 KB/sec
> >>> Avg throughput per process = 23206.68 KB/sec
> >>> Min xfer = 4533248.00 KB
> >>>
> >>> Children see throughput for 16 rewriters = 350486.11 KB/sec
> >>> Parent sees throughput for 16 rewriters = 176947.47 KB/sec
> >>> Min throughput per process = 21154.26 KB/sec
> >>> Max throughput per process = 23011.69 KB/sec
> >>> Avg throughput per process = 21905.38 KB/sec
> >>> Min xfer = 4819968.00 KB
> >>>
> >>> 362MB/s looks quite a bit higher however can somebody validate that I
> >>>am
> >>> reading these results correctly? Should I also run "iozone" with the -a
> >>> (auto) option for good measure?
> >>>
> >>> Thanks again for all of your responses, I greatly appreciate it!
> >>>
> >>>
> >>> Dan
> >>>
> >>>
> >>> On 5/19/12 4:36 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
> >>>
> >>>> Jeff Mother - Which specific setting are you referring to?
> >>>>
> >>>> I installed iozone on my test machine and am currently running the
> >>>> following iozone command on it:
> >>>>
> >>>> iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv
> >>>>-F
> >>>> tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
> >>>>
> >>>> I'll post the results once it is finished
> >>>>
> >>>> Dan
> >>>>
> >>>>
> >>>>
> >>>> On 5/19/12 2:44 PM, "Jeff Mother" <speedtoys.racing [at] gmail> wrote:
> >>>>
> >>>>> Easy one.
> >>>>>
> >>>>> If it went down in half, adjust your kernel tcp slot count.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>> On May 19, 2012, at 11:46 AM, Dan Burkland <dburklan [at] NMDP>
> wrote:
> >>>>>
> >>>>>> I know dd isn't the best tool since it is a single threaded
> >>>>>>application
> >>>>>> and in no way represents the workload that Oracle will impose.
> >>>>>>However,
> >>>>>> I
> >>>>>> thought it would still give me a decent ballpark figure regarding
> >>>>>> throughput. I tried a block size of 64k, 128k, and 1M (just to see)
> >>>>>>and
> >>>>>> got a bit more promising results:
> >>>>>>
> >>>>>> # dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
> >>>>>> 5120+0 records in
> >>>>>> 5120+0 records out
> >>>>>> 5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s
> >>>>>>
> >>>>>> If I run two of these dd sessions at once the throughput figure
> >>>>>>above
> >>>>>> gets
> >>>>>> cut in half (each dd session reports it creates the file at around
> >>>>>> 100MB/s).
> >>>>>>
> >>>>>> As far as the switch goes, I have not checked it yet however I did
> >>>>>> notice
> >>>>>> that flow control is set to full on the 6080 10GbE interfaces. We
> >>>>>>are
> >>>>>> also
> >>>>>> running Jumbo Frames on all of the involved equipment.
> >>>>>>
> >>>>>> As far as the RHEL OS tweaks go, here are the settings that I have
> >>>>>> changed
> >>>>>> on the system:
> >>>>>>
> >>>>>> ###
> >>>>>> /etc/sysctl.conf:
> >>>>>>
> >>>>>> # 10GbE Kernel Parameters
> >>>>>> net.core.rmem_default = 262144
> >>>>>> net.core.rmem_max = 16777216
> >>>>>> net.core.wmem_default = 262144
> >>>>>> net.core.wmem_max = 16777216
> >>>>>> net.ipv4.tcp_rmem = 4096 262144 <tel:4096%20262144> 16777216
> >>>>>> net.ipv4.tcp_wmem = 4096 262144 <tel:4096%20262144> 16777216
> >>>>>> net.ipv4.tcp_window_scaling = 1
> >>>>>> net.ipv4.tcp_syncookies = 0
> >>>>>> net.ipv4.tcp_timestamps = 0
> >>>>>> net.ipv4.tcp_sack = 0
> >>>>>> #
> >>>>>>
> >>>>>> ###
> >>>>>>
> >>>>>> ###
> >>>>>> /etc/modprobe.d/sunrpc.conf:
> >>>>>>
> >>>>>>
> >>>>>> options sunrpc tcp_slot_table_entries=128
> >>>>>>
> >>>>>> ###
> >>>>>>
> >>>>>>
> >>>>>> ###
> >>>>>> Mount options for the NetApp test NFS share:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,
> >>>>>>s
> >>>>>>ec
> >>>>>> =
> >>>>>> sy
> >>>>>> s
> >>>>>>
> >>>>>> ###
> >>>>>>
> >>>>>> Thanks again for all of your quick and detailed responses!
> >>>>>>
> >>>>>>
> >>>>>> Dan
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 5/19/12 1:08 PM, "Robert McDermott" <rmcdermo [at] fhcrc> wrote:
> >>>>>>
> >>>>>>> Your block size is only 1K; try increasing the block size and the
> >>>>>>> throughput will increase. 1K IOs would generate a lot of IOPs with
> >>>>>>> very
> >>>>>>> little throughput.
> >>>>>>>
> >>>>>>> -Robert
> >>>>>>>
> >>>>>>> Sent from my iPhone
> >>>>>>>
> >>>>>>> On May 19, 2012, at 10:48, Dan Burkland <dburklan [at] NMDP> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> My company just bought some Intel x520 10GbE cards which I recently
> >>>>>>>> installed into our Oracle EBS database servers (IBM 3850 X5s running
> >>>>>>>> RHEL 5.8). As the "linux guy" I have been tasked with getting these
> >>>>>>>> servers to communicate with our NetApp 6080s via NFS over the new
> >>>>>>>> 10GbE links. I have got everything working; however, even after
> >>>>>>>> tuning the RHEL kernel I am only getting 160MB/s writes using the
> >>>>>>>> "dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880" command.
> >>>>>>>> For you folks that run 10GbE to your toasters, what write speeds are
> >>>>>>>> you seeing from your 10GbE connected servers? Did you have to do any
> >>>>>>>> tuning in order to get the best results possible? If so, what did
> >>>>>>>> you change?
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> Dan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>



--
---
Gustatus Similis Pullus


kgraham at industrial-marshmallow

May 29, 2012, 1:03 PM

Post #23 of 23 (6221 views)
Permalink
Re: Poor NFS 10GbE performance on NetApp 6080s [In reply to]

Check your client-side CPU usage. Dalvenjah's earlier mail mentioned it, but at those rates you're largely just testing single-stream TCP throughput and I'd suspect you're choked on interrupt handlers on the client.
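
One way to check the interrupt theory on the client is to total the NIC's interrupt counts per CPU from /proc/interrupts. What follows is only a rough Python sketch, not something posted in this thread: the interface name "eth2" and the sample text are made-up assumptions, and the parser only understands the usual column layout of that file.

```python
# Hypothetical excerpt in the usual /proc/interrupts layout (values invented).
SAMPLE = """
           CPU0       CPU1       CPU2       CPU3
 16:        100          0          0          0   IO-APIC-fasteoi   ehci_hcd
 64:     900000         10          0          5   PCI-MSI-edge      eth2-TxRx-0
 65:     100000         20          0          5   PCI-MSI-edge      eth2-TxRx-1
"""


def nic_irqs_per_cpu(interrupts_text, nic_name):
    """Sum interrupt counts per CPU for every IRQ line that mentions nic_name."""
    lines = interrupts_text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = [0] * ncpu
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < ncpu + 2:        # summary rows such as ERR: are shorter
            continue
        counts = fields[1:1 + ncpu]
        description = " ".join(fields[1 + ncpu:])
        if nic_name not in description:
            continue
        if not all(c.isdigit() for c in counts):
            continue
        for cpu, count in enumerate(counts):
            totals[cpu] += int(count)
    return totals


print(nic_irqs_per_cpu(SAMPLE, "eth2"))   # [1000000, 30, 0, 10]
```

On a real client you would pass open("/proc/interrupts").read() instead of SAMPLE, once before and once after a test run. If nearly all of the increase lands on one CPU, that would be consistent with a single flow being serviced by a single interrupt vector.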

If you want to test this, add a second IP to your filer and mount via that, with a workload generator going against each -- the multiple transports should move your numbers up.
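
A minimal sketch of that experiment, assuming the same export is mounted twice through two different filer addresses. The mount point names in the usage note are hypothetical, and this simple writer is a stand-in for a real workload generator such as iozone.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def write_file(path, total_mib, block_mib=1):
    """Write total_mib MiB of zeros to path in block_mib-sized writes, then fsync."""
    block = b"\0" * (block_mib * 1024 * 1024)
    written = 0
    with open(path, "wb") as f:
        for _ in range(total_mib // block_mib):
            written += f.write(block)
        f.flush()
        os.fsync(f.fileno())
    return written


def parallel_write(mount_points, total_mib, block_mib=1):
    """Run one writer per mount point at the same time.

    Returns (total_bytes_written, elapsed_seconds).
    """
    paths = [os.path.join(m, "ddtest_%d.bin" % i)
             for i, m in enumerate(mount_points)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        sizes = list(pool.map(lambda p: write_file(p, total_mib, block_mib),
                              paths))
    elapsed = time.perf_counter() - start
    return sum(sizes), elapsed
```

For example, parallel_write(["/mnt/filer_ip1", "/mnt/filer_ip2"], 5120) would write 5 GiB through each mount. Comparing the aggregate rate against a run with both writers on one mount should show whether a second transport helps.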

This is one place where Oracle's DirectNFS will really help -- by opening an RPC transport per process, you not only avoid static slot allocation (while you're waiting on RHEL 6.3), but also get a healthy number of flows to feed all the interrupt vectors on an MSI-X capable NIC.

[sent from my mobile]

On May 20, 2012, at 3:53 PM, Dan Burkland <dburklan [at] NMDP> wrote:

> Here are the results with Direct I/O enabled:
>
> Run began: Sun May 20 16:21:12 2012
>
> File size set to 5242880 KB
> Record Size 1024 KB
> Excel chart generation enabled
> O_DIRECT feature enabled
> Command line used: iozone -s 5g -r 1m -t 16 -R -b
> /root/iozone_mn4s31063_2012-05-d.csv -I -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9
> t10 t11 t12 t13 t14 t15 t16 t17 t18
> Output is in Kbytes/sec
> Time Resolution = 0.000001 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
> Throughput test with 16 processes
> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>
> Children see throughput for 16 initial writers = 262467.29 KB/sec
> Parent sees throughput for 16 initial writers = 260324.76 KB/sec
> Min throughput per process = 16309.72 KB/sec
> Max throughput per process = 16546.15 KB/sec
> Avg throughput per process = 16404.21 KB/sec
> Min xfer = 5168128.00 KB
>
> Children see throughput for 16 rewriters = 251104.65 KB/sec
> Parent sees throughput for 16 rewriters = 251090.95 KB/sec
> Min throughput per process = 15546.73 KB/sec
> Max throughput per process = 15832.99 KB/sec
> Avg throughput per process = 15694.04 KB/sec
> Min xfer = 5148672.00 KB
>
> Children see throughput for 16 readers = 619751.30 KB/sec
> Parent sees throughput for 16 readers = 619581.97 KB/sec
> Min throughput per process = 36595.70 KB/sec
> Max throughput per process = 39467.45 KB/sec
> Avg throughput per process = 38734.46 KB/sec
> Min xfer = 4861952.00 KB
>
> Children see throughput for 16 re-readers = 626421.73 KB/sec
> Parent sees throughput for 16 re-readers = 626354.38 KB/sec
> Min throughput per process = 37853.47 KB/sec
> Max throughput per process = 40021.52 KB/sec
> Avg throughput per process = 39151.36 KB/sec
> Min xfer = 4959232.00 KB
>
> Children see throughput for 16 reverse readers = 462712.64 KB/sec
> Parent sees throughput for 16 reverse readers = 462649.29 KB/sec
> Min throughput per process = 27713.84 KB/sec
> Max throughput per process = 29794.67 KB/sec
> Avg throughput per process = 28919.54 KB/sec
> Min xfer = 4877312.00 KB
>
> Children see throughput for 16 stride readers = 520482.83 KB/sec
> Parent sees throughput for 16 stride readers = 520448.31 KB/sec
> Min throughput per process = 31892.69 KB/sec
> Max throughput per process = 33016.53 KB/sec
> Avg throughput per process = 32530.18 KB/sec
> Min xfer = 5064704.00 KB
>
> Children see throughput for 16 random readers = 544089.98 KB/sec
> Parent sees throughput for 16 random readers = 544055.32 KB/sec
> Min throughput per process = 33799.79 KB/sec
> Max throughput per process = 34304.76 KB/sec
> Avg throughput per process = 34005.62 KB/sec
> Min xfer = 5166080.00 KB
>
> Children see throughput for 16 mixed workload = 365865.06 KB/sec
> Parent sees throughput for 16 mixed workload = 352394.93 KB/sec
> Min throughput per process = 22250.01 KB/sec
> Max throughput per process = 23576.78 KB/sec
> Avg throughput per process = 22866.57 KB/sec
> Min xfer = 4947968.00 KB
>
> Children see throughput for 16 random writers = 230192.41 KB/sec
> Parent sees throughput for 16 random writers = 229237.34 KB/sec
> Min throughput per process = 14307.92 KB/sec
> Max throughput per process = 14463.50 KB/sec
> Avg throughput per process = 14387.03 KB/sec
> Min xfer = 5186560.00 KB
>
> Children see throughput for 16 pwrite writers = 197020.59 KB/sec
> Parent sees throughput for 16 pwrite writers = 195973.16 KB/sec
> Min throughput per process = 12265.62 KB/sec
> Max throughput per process = 12394.86 KB/sec
> Avg throughput per process = 12313.79 KB/sec
> Min xfer = 5188608.00 KB
>
> Children see throughput for 16 pread readers = 578525.04 KB/sec
> Parent sees throughput for 16 pread readers = 578418.73 KB/sec
> Min throughput per process = 33046.61 KB/sec
> Max throughput per process = 38253.89 KB/sec
> Avg throughput per process = 36157.82 KB/sec
> Min xfer = 4530176.00 KB
>
>
>
> "Throughput report Y-axis is type of test X-axis is number of processes"
> "Record size = 1024 Kbytes "
> "Output is in Kbytes/sec"
>
> " Initial write " 262467.29
>
> " Rewrite " 251104.65
>
> " Read " 619751.30
>
> " Re-read " 626421.73
>
> " Reverse Read " 462712.64
>
> " Stride read " 520482.83
>
> " Random read " 544089.98
>
> " Mixed workload " 365865.06
>
> " Random write " 230192.41
>
> " Pwrite " 197020.59
>
> " Pread " 578525.04
>
> The read results definitely look more believable now. Are these results
> more in line with what I should be seeing? Tomorrow I am going to try and
> rule the switches out of the equation by running "netperf" between my two
> 10GbE test systems.
>
> Dan
>
> From: Jeff Mohler <speedtoys.racing [at] gmail>
> Date: Sun, 20 May 2012 16:17:41 -0500
> To: Dan Burkland <dburklan [at] nmdp>
> Cc: "toasters [at] teaparty" <toasters [at] teaparty>
> Subject: Re: Poor NFS 10GbE performance on NetApp 6080s
>
>
> Re-read is from:
>
> Host system cache
> Netapp system cache (or pam)
>
> Direct will bypass host caching..yup.
>
>
>
> On Sun, May 20, 2012 at 2:12 PM, Dan Burkland <dburklan [at] nmdp> wrote:
>
> With regard to the latest "iozone" results, are these more in the ballpark
> of what I should be seeing? Also, why is the re-read throughput value
> roughly 20x that of the initial read speed? Would this be caching on the
> NFS client side or some sort of caching done by the PAM card on the 6080?
> (Should I be running these tests with the "-I" or "Direct IO" argument to
> bypass any possible local caching mechanisms?)
>
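
A quick sanity check on the caching question: any figure above what a 10GbE link can carry cannot have crossed the wire, so it must have come from the client's own cache. This is a small Python sketch using the read and re-read numbers from the buffered run; it assumes iozone's "KB" means 1024 bytes and ignores protocol overhead.

```python
LINK_GBIT = 10.0  # nominal 10GbE line rate, ignoring framing overhead


def kb_per_sec_to_gbit(kb_per_sec):
    """Convert a KB/sec figure (assuming 1 KB = 1024 bytes) to Gbit/s."""
    return kb_per_sec * 1024 * 8 / 1e9


def exceeds_link(kb_per_sec, link_gbit=LINK_GBIT):
    """True if the reported rate is higher than the link could deliver."""
    return kb_per_sec_to_gbit(kb_per_sec) > link_gbit


read_kb = 440115.98      # "16 readers" from the buffered run
reread_kb = 8953522.06   # "16 re-readers" from the same run

print(round(reread_kb / read_kb, 1))                                    # 20.3
print(round(kb_per_sec_to_gbit(read_kb), 1), exceeds_link(read_kb))     # 3.6 False
print(round(kb_per_sec_to_gbit(reread_kb), 1), exceeds_link(reread_kb)) # 73.3 True
```

A re-read rate of roughly 73 Gbit/s over a 10 Gbit/s link points at the NFS client's page cache rather than the PAM card, since data served by the filer would still have to cross the network.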
> Thanks again!
>
> Dan
>
>
> On 5/19/12 5:32 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>
>> I unmounted the NFS share and rebooted the box before running the same
>> "iozone" command again. This time I let "iozone" run through all of its
>> tests (including the read-based ones).
>>
>>
>> Run began: Sat May 19 16:46:27 2012
>>
>> File size set to 5242880 KB
>> Record Size 1024 KB
>> Excel chart generation enabled
>> Command line used: iozone -s 5g -r 1m -t 16 -R -b
>> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
>> t11 t12 t13 t14 t15 t16 t17 t18
>> Output is in Kbytes/sec
>> Time Resolution = 0.000001 seconds.
>> Processor cache size set to 1024 Kbytes.
>> Processor cache line size set to 32 bytes.
>> File stride size set to 17 * record size.
>> Throughput test with 16 processes
>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>>
>> Children see throughput for 16 initial writers = 349500.55 KB/sec
>> Parent sees throughput for 16 initial writers = 173837.26 KB/sec
>> Min throughput per process = 21147.24 KB/sec
>> Max throughput per process = 22701.06 KB/sec
>> Avg throughput per process = 21843.78 KB/sec
>> Min xfer = 4884480.00 KB
>>
>> Children see throughput for 16 rewriters = 372333.90 KB/sec
>> Parent sees throughput for 16 rewriters = 179256.38 KB/sec
>> Min throughput per process = 22495.20 KB/sec
>> Max throughput per process = 24418.89 KB/sec
>> Avg throughput per process = 23270.87 KB/sec
>> Min xfer = 4830208.00 KB
>>
>> Children see throughput for 16 readers = 440115.98 KB/sec
>> Parent sees throughput for 16 readers = 439993.44 KB/sec
>> Min throughput per process = 26406.17 KB/sec
>> Max throughput per process = 28724.05 KB/sec
>> Avg throughput per process = 27507.25 KB/sec
>> Min xfer = 4819968.00 KB
>>
>> Children see throughput for 16 re-readers = 8953522.06 KB/sec
>> Parent sees throughput for 16 re-readers = 8930475.33 KB/sec
>> Min throughput per process = 408033.34 KB/sec
>> Max throughput per process = 671821.62 KB/sec
>> Avg throughput per process = 559595.13 KB/sec
>> Min xfer = 3186688.00 KB
>>
>> Children see throughput for 16 reverse readers = 5543829.37 KB/sec
>> Parent sees throughput for 16 reverse readers = 5425986.47 KB/sec
>> Min throughput per process = 15684.29 KB/sec
>> Max throughput per process = 2261884.25 KB/sec
>> Avg throughput per process = 346489.34 KB/sec
>> Min xfer = 36864.00 KB
>>
>> Children see throughput for 16 stride readers = 16532117.19 KB/sec
>> Parent sees throughput for 16 stride readers = 16272131.55 KB/sec
>> Min throughput per process = 257097.92 KB/sec
>> Max throughput per process = 2256125.75 KB/sec
>> Avg throughput per process = 1033257.32 KB/sec
>> Min xfer = 602112.00 KB
>>
>> Children see throughput for 16 random readers = 17297437.81 KB/sec
>> Parent sees throughput for 16 random readers = 16871312.92 KB/sec
>> Min throughput per process = 320909.25 KB/sec
>> Max throughput per process = 2083737.75 KB/sec
>> Avg throughput per process = 1081089.86 KB/sec
>> Min xfer = 826368.00 KB
>>
>> Children see throughput for 16 mixed workload = 10747970.97 KB/sec
>> Parent sees throughput for 16 mixed workload = 112898.07 KB/sec
>> Min throughput per process = 54960.62 KB/sec
>> Max throughput per process = 1991637.38 KB/sec
>> Avg throughput per process = 671748.19 KB/sec
>> Min xfer = 145408.00 KB
>>
>> Children see throughput for 16 random writers = 358103.29 KB/sec
>> Parent sees throughput for 16 random writers = 166805.09 KB/sec
>> Min throughput per process = 21263.60 KB/sec
>> Max throughput per process = 22942.70 KB/sec
>> Avg throughput per process = 22381.46 KB/sec
>> Min xfer = 4859904.00 KB
>>
>> Children see throughput for 16 pwrite writers = 325666.64 KB/sec
>> Parent sees throughput for 16 pwrite writers = 177771.50 KB/sec
>> Min throughput per process = 19902.90 KB/sec
>> Max throughput per process = 20863.29 KB/sec
>> Avg throughput per process = 20354.17 KB/sec
>> Min xfer = 5008384.00 KB
>>
>> Children see throughput for 16 pread readers = 445021.47 KB/sec
>> Parent sees throughput for 16 pread readers = 444618.25 KB/sec
>> Min throughput per process = 26932.47 KB/sec
>> Max throughput per process = 28361.61 KB/sec
>> Avg throughput per process = 27813.84 KB/sec
>> Min xfer = 4981760.00 KB
>>
>>
>>
>> "Throughput report Y-axis is type of test X-axis is number of processes"
>> "Record size = 1024 Kbytes "
>> "Output is in Kbytes/sec"
>>
>> " Initial write " 349500.55
>>
>> " Rewrite " 372333.90
>>
>> " Read " 440115.98
>>
>> " Re-read " 8953522.06
>>
>> " Reverse Read " 5543829.37
>>
>> " Stride read " 16532117.19
>>
>> " Random read " 17297437.81
>>
>> " Mixed workload " 10747970.97
>>
>> " Random write " 358103.29
>>
>> " Pwrite " 325666.64
>>
>> " Pread " 445021.47
>>
>>
>> Regards,
>>
>> Dan
>>
>>
>> On 5/19/12 4:48 PM, "Jeff Mohler" <speedtoys.racing [at] gmail> wrote:
>>
>>> You're now approaching storage write saturation for your box on writes at
>>> that rate.
>>>
>>> Pull reads now.
>>>
>>>
>>>
>>> Sent from my iPhone
>>>
>>> On May 19, 2012, at 2:43 PM, Dan Burkland <dburklan [at] NMDP> wrote:
>>>
>>>> Here are the IOZone results:
>>>>
>>>> Run began: Sat May 19 16:22:46 2012
>>>>
>>>> File size set to 5242880 KB
>>>> Record Size 1024 KB
>>>> Excel chart generation enabled
>>>> Command line used: iozone -s 5g -r 1m -t 16 -R -b
>>>> /root/iozone_mn4s31063_2012-05-d.csv -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9
>>>> t10 t11 t12 t13 t14 t15 t16 t17 t18
>>>> Output is in Kbytes/sec
>>>> Time Resolution = 0.000001 seconds.
>>>> Processor cache size set to 1024 Kbytes.
>>>> Processor cache line size set to 32 bytes.
>>>> File stride size set to 17 * record size.
>>>> Throughput test with 16 processes
>>>> Each process writes a 5242880 Kbyte file in 1024 Kbyte records
>>>>
>>>> Children see throughput for 16 initial writers = 371306.91 KB/sec
>>>> Parent sees throughput for 16 initial writers = 167971.82 KB/sec
>>>> Min throughput per process = 21901.84 KB/sec
>>>> Max throughput per process = 25333.62 KB/sec
>>>> Avg throughput per process = 23206.68 KB/sec
>>>> Min xfer = 4533248.00 KB
>>>>
>>>> Children see throughput for 16 rewriters = 350486.11 KB/sec
>>>> Parent sees throughput for 16 rewriters = 176947.47 KB/sec
>>>> Min throughput per process = 21154.26 KB/sec
>>>> Max throughput per process = 23011.69 KB/sec
>>>> Avg throughput per process = 21905.38 KB/sec
>>>> Min xfer = 4819968.00 KB
>>>>
>>>> 362MB/s looks quite a bit higher; however, can somebody validate that I
>>>> am reading these results correctly? Should I also run "iozone" with the
>>>> -a (auto) option for good measure?
>>>>
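
For anyone wanting to double-check that reading, here is a small Python sketch that pulls the figure out of an iozone summary line and converts it. It assumes iozone's KB is 1024 bytes, so "MB/s" here really means MiB/s; the regular expression is my own and only covers the "Children see" and "Parent sees" lines shown in this thread.

```python
import re

SUMMARY_RE = re.compile(
    r"(Children see|Parent sees) throughput for (\d+) (.+?)\s*=\s*([\d.]+)\s*KB/sec"
)


def parse_summary(line):
    """Return (who, processes, test_name, kb_per_sec) or None if no match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    who, nproc, test, rate = m.groups()
    return who, int(nproc), test, float(rate)


def kb_to_mib_per_sec(kb_per_sec):
    """Convert KB/sec (1 KB = 1024 bytes assumed) to MiB/s."""
    return kb_per_sec / 1024.0


children = parse_summary(
    "Children see throughput for 16 initial writers = 371306.91 KB/sec")
parent = parse_summary(
    "Parent sees throughput for 16 initial writers = 167971.82 KB/sec")

print(round(kb_to_mib_per_sec(children[3]), 1))  # 362.6
print(round(kb_to_mib_per_sec(parent[3]), 1))    # 164.0
```

So 362MB/s matches the "Children see" line. The "Parent sees" line for the same test works out to about 164MB/s, and which of the two better reflects what reached the filer is not settled by this arithmetic.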
>>>> Thanks again for all of your responses, I greatly appreciate it!
>>>>
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 5/19/12 4:36 PM, "Dan Burkland" <dburklan [at] NMDP> wrote:
>>>>
>>>>> Jeff Mohler - Which specific setting are you referring to?
>>>>>
>>>>> I installed iozone on my test machine and am currently running the
>>>>> following iozone command on it:
>>>>>
>>>>> iozone -s 5g -r 1m -t 16 -R -b /root/iozone_testserver_2012-05-d.csv
>>>>> -F tf1 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18
>>>>>
>>>>> I'll post the results once it is finished
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>
>>>>>> On 5/19/12 2:44 PM, "Jeff Mohler" <speedtoys.racing [at] gmail> wrote:
>>>>>
>>>>>> Easy one.
>>>>>>
>>>>>> If it went down in half, adjust your kernel tcp slot count.
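
Why the slot count matters can be sketched with simple arithmetic: the slot table caps how many RPCs one TCP transport has outstanding, so the data in flight is at most slots times wsize, and throughput cannot exceed that divided by the round-trip time. The 1 ms round trip below is purely an assumed number for illustration, and the comparison value of 16 slots is what I believe older kernels default to, so treat both as assumptions.

```python
def rpc_throughput_ceiling(slots, io_size_bytes, rtt_seconds):
    """Upper bound in bytes/sec for one transport: in-flight data per round trip."""
    return slots * io_size_bytes / rtt_seconds


WSIZE = 65536          # wsize from the mount options in this thread
ASSUMED_RTT = 0.001    # 1 ms per write RPC, an assumption, not a measurement

for slots in (16, 128):
    ceiling = rpc_throughput_ceiling(slots, WSIZE, ASSUMED_RTT)
    print(slots, "slots ->", round(ceiling / 1e6), "MB/s ceiling")
```

With these assumed numbers, 16 slots would cap a single mount at about 1049 MB/s and 128 slots at about 8389 MB/s. The real ceiling depends on the measured per-RPC latency, which is higher under write load.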
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On May 19, 2012, at 11:46 AM, Dan Burkland <dburklan [at] NMDP> wrote:
>>>>>>
>>>>>>> I know dd isn't the best tool since it is a single threaded
>>>>>>> application and in no way represents the workload that Oracle will
>>>>>>> impose. However, I thought it would still give me a decent ballpark
>>>>>>> figure regarding throughput. I tried a block size of 64k, 128k, and 1M
>>>>>>> (just to see) and got a bit more promising results:
>>>>>>>
>>>>>>> # dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
>>>>>>> 5120+0 records in
>>>>>>> 5120+0 records out
>>>>>>> 5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s
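
The dd figure can be checked by hand, and the same arithmetic shows why the block size matters. This is a short Python sketch using only the numbers from the two dd commands in this thread; dd reports MB as 10^6 bytes, which is how 5368709120 bytes becomes "5.4 GB".

```python
def throughput_mb_s(total_bytes, seconds):
    """Throughput in decimal MB/s, the unit GNU dd prints."""
    return total_bytes / seconds / 1e6


def write_calls(total_bytes, block_size):
    """Number of write() calls dd issues for a given bs."""
    return total_bytes // block_size


TOTAL = 5368709120  # 5 GiB, the size written by both dd commands

print(round(throughput_mb_s(TOTAL, 26.6878)))  # 201
print(write_calls(TOTAL, 1024))                # 5242880 writes at bs=1024
print(write_calls(TOTAL, 1024 * 1024))         # 5120 writes at bs=1M
```

The same 5 GiB takes 1024 times as many write() calls at bs=1024 as at bs=1M, which fits the earlier advice about small block sizes.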
>>>>>>>
>>>>>>> If I run two of these dd sessions at once the throughput figure above
>>>>>>> gets cut in half (each dd session reports it creates the file at
>>>>>>> around 100MB/s).
>>>>>>>
>>>>>>> As far as the switch goes, I have not checked it yet; however, I did
>>>>>>> notice that flow control is set to full on the 6080 10GbE interfaces.
>>>>>>> We are also running Jumbo Frames on all of the involved equipment.
>>>>>>>
>>>>>>> As far as the RHEL OS tweaks go, here are the settings that I have
>>>>>>> changed on the system:
>>>>>>>
>>>>>>> ###
>>>>>>> /etc/sysctl.conf:
>>>>>>>
>>>>>>> # 10GbE Kernel Parameters
>>>>>>> net.core.rmem_default = 262144
>>>>>>> net.core.rmem_max = 16777216
>>>>>>> net.core.wmem_default = 262144
>>>>>>> net.core.wmem_max = 16777216
>>>>>>> net.ipv4.tcp_rmem = 4096 262144 16777216
>>>>>>> net.ipv4.tcp_wmem = 4096 262144 16777216
>>>>>>> net.ipv4.tcp_window_scaling = 1
>>>>>>> net.ipv4.tcp_syncookies = 0
>>>>>>> net.ipv4.tcp_timestamps = 0
>>>>>>> net.ipv4.tcp_sack = 0
>>>>>>> #
>>>>>>>
>>>>>>> ###
>>>>>>>
>>>>>>> ###
>>>>>>> /etc/modprobe.d/sunrpc.conf:
>>>>>>>
>>>>>>>
>>>>>>> options sunrpc tcp_slot_table_entries=128
>>>>>>>
>>>>>>> ###
>>>>>>>
>>>>>>>
>>>>>>> ###
>>>>>>> Mount options for the NetApp test NFS share:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys
>>>>>>>
>>>>>>> ###
>>>>>>>
>>>>>>> Thanks again for all of your quick and detailed responses!
>>>>>>>
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 5/19/12 1:08 PM, "Robert McDermott" <rmcdermo [at] fhcrc> wrote:
>>>>>>>
>>>>>>>> Your block size is only 1K; try increasing the block size and the
>>>>>>>> throughput will increase. 1K IOs would generate a lot of IOPs with
>>>>>>>> very little throughput.
>>>>>>>>
>>>>>>>> -Robert
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On May 19, 2012, at 10:48, Dan Burkland <dburklan [at] NMDP> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> My company just bought some Intel x520 10GbE cards which I recently
>>>>>>>>> installed into our Oracle EBS database servers (IBM 3850 X5s running
>>>>>>>>> RHEL 5.8). As the "linux guy" I have been tasked with getting these
>>>>>>>>> servers to communicate with our NetApp 6080s via NFS over the new
>>>>>>>>> 10GbE links. I have got everything working; however, even after
>>>>>>>>> tuning the RHEL kernel I am only getting 160MB/s writes using the
>>>>>>>>> "dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880" command.
>>>>>>>>> For you folks that run 10GbE to your toasters, what write speeds are
>>>>>>>>> you seeing from your 10GbE connected servers? Did you have to do any
>>>>>>>>> tuning in order to get the best results possible? If so, what did
>>>>>>>>> you change?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>

_______________________________________________
Toasters mailing list
Toasters [at] teaparty
http://www.teaparty.net/mailman/listinfo/toasters
