
Mailing List Archive: Netapp: toasters

Distribute aggregate across shelves or limit to one shelf?


rvandolson at esri

Apr 7, 2011, 12:58 PM

Post #1 of 5
Distribute aggregate across shelves or limit to one shelf?

Best practice (based on my reading of the archives) seems to be to
distribute disk membership in an aggregate across disk shelves.

This would appear to be for performance reasons primarily (less chance
of saturating a shelf's "uplink" to the controller), but how does it
affect reliability?

If I limit myself to one aggregate per shelf, if I lose that shelf I
lose only the one aggregate. If aggregates are distributed I could
lose all of them.

My thought is that the chance of the shelf failing is actually pretty
slim as its hardware isn't all that sophisticated.

And obviously there are performance penalties for limiting to one
aggregate per shelf (disk count maximums).

Thanks,
Ray
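
The trade-off in the question can be made concrete with a small sketch. It only counts which aggregates own at least one disk on a failed shelf; whether an affected aggregate is actually lost depends on its RAID protection, which this sketch does not model. The layouts, aggregate names and shelf IDs below are hypothetical, not taken from any real system.

```python
def aggregates_hit_by_shelf_loss(layout, failed_shelf):
    """Return the aggregates that own at least one disk on failed_shelf.

    layout maps an aggregate name to a list of shelf IDs,
    one entry per member disk (hypothetical representation).
    """
    return {aggr for aggr, shelves in layout.items() if failed_shelf in shelves}


# Hypothetical layout 1: each aggregate confined to its own shelf.
per_shelf = {
    "aggr0": [0, 0, 0, 0],
    "aggr1": [1, 1, 1, 1],
    "aggr2": [2, 2, 2, 2],
}

# Hypothetical layout 2: every aggregate takes disks from every shelf.
distributed = {
    "aggr0": [0, 1, 2, 0],
    "aggr1": [1, 2, 0, 1],
    "aggr2": [2, 0, 1, 2],
}

print(sorted(aggregates_hit_by_shelf_loss(per_shelf, 1)))    # ['aggr1']
print(sorted(aggregates_hit_by_shelf_loss(distributed, 1)))  # ['aggr0', 'aggr1', 'aggr2']
```

In the distributed layout every aggregate is touched by the shelf loss, but each one loses only a fraction of its disks, which is where RAID protection comes into play.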


jkennedy at qualcomm

Apr 7, 2011, 2:16 PM

Post #2 of 5
RE: Distribute aggregate across shelves or limit to one shelf?

Back in the R100 days it was *required* to spread the volumes, in a very specific order, vertically across shelves (volumes at that time were the aggregates of today). The logic is still sound from a performance view, but keep in mind that if your spares are spread randomly, your perfect drive configuration degrades every time you lose a drive.

However, look at your current IO requirements and decide. Shelves today have 24 15k drives in them with 3Gb or more on the loop. You need an awful lot of IO to saturate that. Certainly it can be done, but if your filer is generally hovering around 20k ops or a couple hundred MB throughput, it's unlikely you're going to saturate any single new shelf, let alone if that IO is spread over multiple shelves.

If you're talking about DS14's, the numbers drop notably of course, but the same logic applies. Maybe with those the numbers drop to 8k ops or 50MB throughput. Not sure.
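
As a rough back-of-the-envelope check of the saturation argument, the sketch below estimates how much of one shelf link a given throughput would use. It is a simplification under stated assumptions: a single 3 Gbit/s link per shelf, 8b/10b line encoding (80% efficiency, about 300 MB/s of payload), an even spread of the workload across shelves, and no modelling of daisy-chained stacks sharing a path or of wide ports. The function names are made up for illustration.

```python
def usable_mb_per_s(line_rate_gbit, encoding_efficiency=0.8):
    """Approximate payload bandwidth of a serial link in MB/s.

    encoding_efficiency=0.8 assumes 8b/10b encoding (an assumption here).
    """
    return line_rate_gbit * 1000 * encoding_efficiency / 8


def shelf_link_utilization(workload_mb_s, shelves, line_rate_gbit=3.0):
    """Fraction of one shelf's link used if the workload spreads evenly."""
    if shelves < 1:
        raise ValueError("need at least one shelf")
    per_shelf_mb_s = workload_mb_s / shelves
    return per_shelf_mb_s / usable_mb_per_s(line_rate_gbit)


# "A couple hundred MB" of throughput, taken here as 200 MB/s.
print(f"{shelf_link_utilization(200, 1):.0%}")  # about 67% of one link
print(f"{shelf_link_utilization(200, 4):.0%}")  # about 17% per link over 4 shelves
```

Under these assumptions a 200 MB/s workload stays below the capacity of even a single link, and spreading it over several shelves leaves considerably more headroom.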

With multi-pathing the odds of losing a loop completely are very, very low. Personally I just let WAFL decide where to grab drives from; haven't cared about that level of control for a long time now...

Jeff Kennedy
Qualcomm, Incorporated
QCT Engineering Compute
858-651-6592




bundy at usage

Apr 7, 2011, 11:17 PM

Post #3 of 5
Re: Distribute aggregate across shelves or limit to one shelf?

Whatever you do, there is always a tradeoff between reliability,
performance and efficiency. I think all concerns are well answered in
NetApp's storage resiliency paper: do RAID-DP, do backups, do HA, do
multi-pathing, do disk auto-assign, have spare parts (etc.) and you'll
have 99.999% availability and maximum performance, as long as you can
afford it.
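
For a sense of scale, an availability percentage can be converted into allowed downtime per year. This is plain arithmetic, not a statement about what any particular configuration achieves; the function name is made up for illustration and a year is taken as 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability_percent):
    """Downtime per year, in minutes, implied by an availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


print(f"{downtime_minutes_per_year(99.999):.2f}")  # about 5.26 minutes per year
print(f"{downtime_minutes_per_year(99.9):.2f}")    # about 525.96 minutes per year
```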

And btw, limiting yourself and your system to 1 aggregate per shelf
produces a lot of work reassigning disks/spares over time as your
system grows. In 10+ years of operating NetApp systems, I have always had
my aggregates (volumes earlier) spread across all shelves and never had
any trouble with it.

-SF





HollandWL at state

Apr 8, 2011, 2:59 AM

Post #4 of 5
RE: Distribute aggregate across shelves or limit to one shelf?

Definitely spread across multiple shelves if possible. Not only does
this give you better performance, since the workload is distributed
across the shelves, but it also gives you better protection in the
event of a shelf loss. We had a shelf failure on a system that had
10 aggregates distributed across 12 shelves. The system panicked and
rebooted, and performance was horrible as it saw several aggregates with
double drive failures and tried to rebuild them (running out of online
spares in the process), but we lost no data at all. NetApp's support was
great about getting additional spare drives and a replacement shelf
quickly. It took a couple of days for performance to return to 100% as
it continued rebuilding the aggregates, but we lost no data and kept the
system online during the rebuild after replacing the shelf.
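
The reasoning behind "double drive failures but no data loss" can be sketched as a simple check: RAID-DP tolerates two failed disks per RAID group, so a RAID group should survive the loss of a whole shelf as long as no single shelf holds more than two of its disks. The layouts below are hypothetical, and the sketch ignores spares, rebuild time and any further failures during the rebuild.

```python
from collections import Counter

RAID_DP_TOLERATED_FAILURES = 2  # RAID-DP tolerates two failed disks per RAID group


def survives_shelf_loss(raid_group_shelves, failed_shelf,
                        tolerated=RAID_DP_TOLERATED_FAILURES):
    """True if the RAID group loses no more disks than it can tolerate.

    raid_group_shelves lists the shelf ID of each disk in one RAID group.
    """
    lost = Counter(raid_group_shelves)[failed_shelf]
    return lost <= tolerated


def survives_any_single_shelf_loss(raid_group_shelves,
                                   tolerated=RAID_DP_TOLERATED_FAILURES):
    """True if no single shelf holds more disks than the group can lose."""
    return max(Counter(raid_group_shelves).values()) <= tolerated


# Hypothetical 16-disk RAID group with 2 disks on each of 8 shelves.
spread = [shelf for shelf in range(8) for _ in range(2)]
# Hypothetical 16-disk RAID group confined to shelf 0.
confined = [0] * 16

print(survives_any_single_shelf_loss(spread))    # True
print(survives_any_single_shelf_loss(confined))  # False
```

This would be consistent with the incident described above: with disks spread thinly enough, a shelf loss shows up as at most a double failure in each RAID group, which is degraded but still recoverable.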



jeremy.page at gilbarco

Apr 8, 2011, 6:34 AM

Post #5 of 5
RE: Distribute aggregate across shelves or limit to one shelf?

I'd second Stefan Funke's comment. User error is generally the cause of
downtime, not hardware outages (assuming a reasonably architected system
from a redundancy standpoint). Complicating things generally makes you
more prone to outages, not less.

