Mailing List Archive: Netapp: toasters

How to balance volume priority (some NFS vs. CIFS)

 

 

Netapp toasters RSS feed   Index | Next | Previous | View Threaded


mcdouga9 at egr

Feb 8, 2009, 6:19 PM

Post #1 of 6 (3116 views)
Permalink
How to balance volume priority (some NFS vs. CIFS)

I have not opened a case with Netapp yet but probably will if no one has
any good ideas; I just like to pick people's brains before going
official. Thanks for any input.

A few months ago we moved a file share off a Windows server onto our
FAS3040 Netapp running 7.2.4 and shared it out via CIFS. It contains
software install files and scripts, and depending on scheduled jobs it
can get hit pretty hard and push out approximately 1 Gbit/sec. That
has been drastically affecting the service times for our other shares on
that filer; the response-sensitive NFS shares we care about most, such
as mail and web files, are the ones affected the most. It doesn't
really seem to be a disk bottleneck because the disk read/sec in sysstat
is usually only half of what the filer is pushing out to the network, so
I assume it's reading some data from cache.

The CIFS software install share can get hit by anywhere from 1 to 60+
CIFS clients, each reading files on and off for hours at a time, or
sometimes by hundreds of clients at once for a smaller set of files
(such as to update one software package across a large set of PCs).
I've been able to reproduce the slowdown with just 4 CIFS clients on
gigabit downloading a large file from the share. Sometimes it only
causes a modest slowdown in the NFS response time, but sometimes email
messages being moved between folders will stall for 8 seconds or much
more, which is pretty much unacceptable.

I don't think it's a bottleneck in my core network because I've done
tests where the slow NFS client is on the same switch as the filer,
which is connected via two gig links using LACP. Also, in the normal
situation where the slowdown is encountered, mail (NFS) traffic is
flowing through a different gig uplink than the hungry CIFS clients.

Goal: reduce the impact of greedy clients (primarily known ones, but
hopefully unexpected ones too) on the response time of the rest of the
filer's clients. I don't care if the CIFS software share must accept
slower data rates. I'd rather not run away from the problem; I want to
learn what I can do to prevent my filer from being held hostage by
greedy clients. I do have another 3040 I could move the share to, but
that filer also has volumes that would be affected negatively in the
same way, and I'd rather not concede defeat and go back to hosting the
share on a dedicated Windows server. I can try different code versions
in a test environment if I need to, but I'd like to think this kind of
situation has come up before and that a solution is already at hand.

I've played around with na_priority trying to set the mail and website
volumes to high or veryhigh priority and the software share to low or
verylow but that isn't making a measurable impact. I'm not really sure
what to tweak or check next.

Here is an example from sysstat when I am simulating the slowdown
condition with 4 CIFS clients on gigabit fetching the same file.

 CPU   NFS  CIFS  HTTP   Net kB/s     Disk kB/s     Tape kB/s  Cache
                           in    out   read  write  read write   age
  6%  2058   167     0    751   1543   2196      0     0     0    11
  6%  2590   164     0    699   2238   2904     32     0     0    11
 10%  2183   223     0   1241   4471   5072  17872     0     0    11
 11%  3299   799     0   1577  22194   4935   1183     0     0    11
 22%  3298  3072     0   3005 107869   9128     24     0     0    11
 18%  2532  1986     0   2270  87651   2078      0     0     0    11
 18%  2198  2200     0   1696 105941   8032      8     0     0    11
 16%  3597  1650     0   1890  84691   3528     24     0     0    11
 23%  4946  2216     0   2604 112741  14664      0     0     0    11
 22%  4075  2041     0   2324 100380  21568      0     0     0    11
 CPU   NFS  CIFS  HTTP   Net kB/s     Disk kB/s     Tape kB/s  Cache
                           in    out   read  write  read write   age
 21%  3272  2246     0   2862 115380   4688     24     0     0    11
 21%  4117  2092     0   2686 109165   3864      8     0     0    11
 26%  4188  2136     0   3436 115081  21900      0     0     0    11
......(skip)
 30%  7487  1773     0   4261  93385  10156   3328     0     0     6
 25%  4566  1900     0   3339  96655  13764   9808     0     0     7
 24%  2965  2202     0   2477 111493  11772   5475     0     0     8
 23%  5256  1986     0   3093 102409  10508     24     0     0     8
 19%  2979  2068     0   1810 102282   9926      0     0     0     8
 20%  3164  2323     0   2301 111209   1560      8     0     0     8
 23%  7082  2165     0   2322 103816   2292     24     0     0     8
 22% 11780  1158     0   2763  55501   1760      0     0     0     8
 20% 12032   675     0   3820  36504   2452      0     0     0     8
 CPU   NFS  CIFS  HTTP   Net kB/s     Disk kB/s     Tape kB/s  Cache
                           in    out   read  write  read write   age
 23% 16269  1122     0   3914  54034   4460     24     0     0     6
 18%  8991  1030     0   2739  48400   4568      8     0     0     6
 10%  3903   237     0   1346   4494   3828      0     0     0     6
 11%  3912   219     0   1623   4301   3808   6508     0     0     6
  8%  2402   224     0    868   2027   2744   8712     0     0     6
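
For anyone who wants to check the "mostly served from cache" reading against the numbers, here is a minimal Python sketch that compares disk reads to network output for a few of the intervals in the sysstat sample. The tuple layout, the helper names and the 50,000 kB/s "busy" threshold are my own assumptions for illustration; they are not anything sysstat provides.

```python
# A few rows copied from the sysstat sample, one tuple per interval:
# (cpu_pct, nfs_ops, cifs_ops, net_in_kb, net_out_kb, disk_read_kb)
SAMPLES = [
    (6, 2058, 167, 751, 1543, 2196),        # before the CIFS test starts
    (22, 3298, 3072, 3005, 107869, 9128),   # CIFS clients pulling the file
    (23, 4946, 2216, 2604, 112741, 14664),
    (26, 4188, 2136, 3436, 115081, 21900),
    (8, 2402, 224, 868, 2027, 2744),        # after the test ends
]


def disk_to_net_ratio(sample):
    """Disk read kB/s divided by network out kB/s for one interval."""
    net_out, disk_read = sample[4], sample[5]
    return disk_read / net_out


def busy_samples(samples, net_out_threshold_kb=50_000):
    """Keep only intervals where network output exceeds the (assumed) threshold."""
    return [s for s in samples if s[4] >= net_out_threshold_kb]


for s in busy_samples(SAMPLES):
    print(f"net out {s[4]:>7} kB/s, disk read {s[5]:>6} kB/s, "
          f"ratio {disk_to_net_ratio(s):.2f}")
```

In the busy intervals picked here, disk reads come out well under half of what goes out to the network, which would be consistent with most of the file being served from cache.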


pat.breen at netapp

Feb 8, 2009, 6:47 PM

Post #2 of 6 (2994 views)
Permalink
Re: How to balance volume priority (some NFS vs. CIFS) [In reply to]

Adam McDougall wrote:
>
> Goal: reduce the impact of greedy clients (primarily known ones, but
> hopefully unexpected ones too) on the response time of the rest of the
> filer's clients. I don't care if the CIFS software share must accept

Adam -

I'd suggest taking a look at FlexShare (available since 7.2.x at
no additional cost) which has been developed for exactly this
problem.

It ONLY kicks in when there is contention of resources (eg. CPU,
memory)

Prioritise the NFS workloads to high, and either leave the
CIFS workload as is or set to low.

Regards,


Pat


Greck.Cannon at netapp

Feb 8, 2009, 7:39 PM

Post #3 of 6 (3000 views)
Permalink
Re: How to balance volume priority (some NFS vs. CIFS) [In reply to]

Exactly what FlexShare is for... just keep in mind that disk iops
*are* restricted based on the prioritization regardless of the load
(it's a non-work-conserving queue), so monitor the appropriate
statistics (I'm research inhibited at the moment, but
prioqueue:usr_wait_msecs is close) to make sure things aren't waiting
unnecessarily if there is still bandwidth to disk available.

--greck



mcdouga9 at egr

Feb 8, 2009, 7:39 PM

Post #4 of 6 (2997 views)
Permalink
Re: How to balance volume priority (some NFS vs. CIFS) [In reply to]

Pat Breen wrote:
> Adam McDougall wrote:
>>
>> Goal: reduce the impact of greedy clients (primarily known ones, but
>> hopefully unexpected ones too) on the response time of the rest of the
>> filer's clients. I don't care if the CIFS software share must accept
>
> Adam -
>
> I'd suggest taking a look at FlexShare (available since 7.2.x at
> no additional cost) which has been developed for exactly this
> problem.
>
> It ONLY kicks in when there is contention of resources (eg. CPU,
> memory)
>
> Prioritise the NFS workloads to high, and either leave the
> CIFS workload as is or set to low.
>
> Regards,
>
>
> Pat
>
As I understand it, FlexShare is the same thing as na_priority which I
already tried with no obvious results. I wondered if I might need to
restart CIFS or anything else to activate the changes; I "enabled"
priority and set some priorities. "win" is the software share I spoke of.

> priority show volume
Volume      Priority   Relative   Sys Priority
            Service    Priority   (vs User)
home        on         High       Low
mail        on         VeryHigh   Low
scratch     on         VeryLow    Low
sites       on         High       Low
win         on         VeryLow    Medium
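
To make the settings in this table easy to reproduce or compare on another filer, here is a rough Python sketch that turns it into command lines. The `priority set volume <name> level=... system=...` form is my assumption based on the na_priority man page as I remember it; please check it against the man page for your Data ONTAP release before pasting anything into a console. The dictionary and function names are hypothetical.

```python
# Settings as shown in the `priority show volume` output:
# volume -> (relative priority level, system-vs-user priority)
SETTINGS = {
    "home": ("High", "Low"),
    "mail": ("VeryHigh", "Low"),
    "scratch": ("VeryLow", "Low"),
    "sites": ("High", "Low"),
    "win": ("VeryLow", "Medium"),
}


def priority_commands(settings):
    """Build the command lines; the option syntax is assumed, not verified."""
    cmds = ["priority on"]  # enable the priority service first
    for vol, (level, system) in sorted(settings.items()):
        cmds.append(f"priority set volume {vol} level={level} system={system}")
    cmds.append("priority show volume")  # confirm the result afterwards
    return cmds


print("\n".join(priority_commands(SETTINGS)))
```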


pascal.dukers at asml

Feb 8, 2009, 10:54 PM

Post #5 of 6 (2991 views)
Permalink
Re: How to balance volume priority (some NFS vs. CIFS) [In reply to]

Adam McDougall wrote:
>
> Here is an example from sysstat when I am simulating the slowdown
> condition with 4 CIFS clients on gigabit fetching the same file.
>

Are the nfs and cifs clients on the same 1 Gbit link?

I see that you are pushing over 100 MB/s over the network and if that is
only 1 link then that seems to me to be reason for the slow response times.
My advice would be to put the nfs and cifs traffic on different 1 Gbit
links.
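
As a back-of-the-envelope check on that, here is a small Python sketch converting the peak "Net kB/s out" value from the sysstat sample (115380) into a fraction of one 1 Gbit link. Whether sysstat's kB means 1000 or 1024 bytes is an assumption either way, so the sketch prints both; the function name is hypothetical.

```python
LINK_GBIT = 1.0  # a single gigabit link


def link_utilization(kb_per_sec, bytes_per_kb=1024, link_gbit=LINK_GBIT):
    """Fraction of the link's raw bit rate used by the given kB/s figure."""
    bits_per_sec = kb_per_sec * bytes_per_kb * 8
    return bits_per_sec / (link_gbit * 1e9)


peak_out_kb = 115380  # highest "Net kB/s out" value in the sysstat sample
for unit in (1000, 1024):
    share = link_utilization(peak_out_kb, bytes_per_kb=unit)
    print(f"kB = {unit} bytes: {share:.1%} of a 1 Gbit link")
```

Under either reading of kB the peak is above 90% of a single gigabit link, before any protocol overhead is counted.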

--
View this message in context: http://www.nabble.com/How-to-balance-volume-priority-%28some-NFS-vs.-CIFS%29-tp21906282p21907807.html
Sent from the Network Appliance - Toasters mailing list archive at Nabble.com.


Errol.Fouquet at netapp

Feb 9, 2009, 6:31 AM

Post #6 of 6 (2979 views)
Permalink
RE: How to balance volume priority (some NFS vs. CIFS) [In reply to]

As long as the CIFS share and the NFS exports are not in the same volume
... FlexShare may be perfect for your needs.

With respect to what Greck was talking about ... in order to determine
if disk iops are being limited by FlexShare, you'll want to use "stats"
and observe the priorityqueue object, and pay attention to:
priorityqueue:(default):usr_read_limit_hit:0 <-- user disk iops
priorityqueue:(default):sys_read_limit_hit:0 <-- system disk iops

If those values become non-zero, you'll want to increase the global
"io_concurrency":

-- defaults to 8, max is 1024.


BTW, you need to be in advanced priv for this object:

filer*> stats start -I foo priorityqueue
(wait 30 seconds or so)
filer*> stats stop -I foo
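
If you capture that output to a file, a short Python sketch like the one below can flag the counters in question. It assumes the output consists of `object:instance:counter:value` lines in the form shown above; the sample text (including the non-zero value) is made up for illustration, and the function names are hypothetical.

```python
import re

# Illustrative input only: the non-zero value is invented for the example.
SAMPLE_OUTPUT = """\
priorityqueue:(default):usr_read_limit_hit:0
priorityqueue:(default):sys_read_limit_hit:3
"""

# object:instance:counter:value, with a numeric value at the end of the line
LINE_RE = re.compile(
    r"^(?P<obj>[^:]+):(?P<inst>[^:]+):(?P<counter>[^:]+):(?P<value>\d+)\s*$"
)


def limit_hits(text):
    """Collect the *_read_limit_hit counters, keyed by (instance, counter)."""
    hits = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("counter").endswith("_read_limit_hit"):
            hits[(m.group("inst"), m.group("counter"))] = int(m.group("value"))
    return hits


def throttled(hits):
    """Return only the counters that are non-zero."""
    return {key: value for key, value in hits.items() if value > 0}


print(throttled(limit_hits(SAMPLE_OUTPUT)))
```

Any entry that shows up in the result is a counter that went non-zero during the sampling window, which is the condition described above for raising "io_concurrency".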



