
Mailing List Archive: Netapp: toasters

FAS2050C questions (clustering)

 

 



rvandolson at esri

Jun 2, 2009, 8:43 AM

Post #1 of 11 (2137 views)
FAS2050C questions (clustering)

I'm the proud new owner of an IBM N3600 A20 (rebranded FAS2050C) with
20x300GB SAS disks.

I'm trying to determine the best way to get this thing set up, and
realized I have only a bit of a fuzzy understanding as to how the
clustering or failover filer head should work.

My initial thoughts were to aim for the following setup:

- Set up all 20 disks in a RAID-DP aggregate with one spare (17 data,
2 parity and one spare, or maybe two spares).
- Bond a NIC from the first controller with a NIC from the second
controller to give us a 2Gbps connection to our "storage network".
- Third and fourth NICs would go to our regular network.

My hope was that I could lose one filer head and the other would take
over seamlessly. We'd lose half of our network bandwidth but still be
up and running.

However, it sounds like my understanding of how the clustering works
might have been a bit flawed and that I actually need to treat the
filer heads as two separate filers. So I may be forced to do something
like the following:

- Split my disks up between the two filers (7 data, 2 parity, one
spare -- or maybe I can have one spare available to both heads).
- Probably can't team NICs from multiple filer heads, meaning if I
team the two NICs on the filer I can no longer connect to my
management network. I probably need to order more NICs :(
- If I lose one head, I lose one aggregate unless manual intervention
is taken.
- Each filer has a different hostname/IP for network access.

This maybe gives me better performance, but at the expense of total
disk space and flexibility if my understanding is correct.

Maybe someone could help clear this up. It doesn't appear IBM has a
RedBook on clustering... I'm searching around in NOW and have come
across the Data ONTAP 7.3 Active/Active Configuration Guide which I am
now reading.

Is there something similar for Active/Passive setups (which seems to be
more what I am after) or other documents that would be recommended
reading? Any advice or best practices?

This filer will be serving NFS to a pair of ESX servers. We plan to
add a second shelf of disks later this year.

Thanks in advance. No sales inquiries please.

Ray


Adam.Fox at netapp

Jun 2, 2009, 8:54 AM

Post #2 of 11 (2049 views)
RE: FAS2050C questions (clustering) [In reply to]

You are correct that clustering is treated as two separate controllers
which can take over for each other. You cannot vif across NICs on
different controllers.

The closest thing to active/passive would be to allocate at least 2
(possibly 3 if you want a spare) disks to the "passive" controller and
the rest to the active one. I'd set up a raid4 trad vol or aggregate for
it: since you are only going to use 2 disks, you don't need raid_dp.
Definitely use raid_dp on the active controller.

Under this scenario, you can lose either controller head and still be
running.
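
As a rough sketch of the disk arithmetic behind the two layouts discussed so far, the following counts data disks per controller. The class and variable names are hypothetical, and the model only counts whole disks: it ignores RAID group size limits, disk right-sizing, and filesystem reserves, so treat it as back-of-the-envelope only.

```python
from dataclasses import dataclass


@dataclass
class Aggregate:
    """Disks owned by one controller (hypothetical model, counts only)."""
    total_disks: int  # all disks assigned to this controller
    parity: int       # 1 for raid4, 2 for raid_dp
    spares: int       # hot spares held back from the aggregate

    @property
    def data_disks(self) -> int:
        return self.total_disks - self.parity - self.spares


# Even active/active split from the thread: 7 data, 2 parity, 1 spare per head.
even_split = {
    "head_a": Aggregate(total_disks=10, parity=2, spares=1),
    "head_b": Aggregate(total_disks=10, parity=2, spares=1),
}

# "Active/passive"-style split: 3 disks (raid4 plus a spare) on the passive head.
active_passive = {
    "active": Aggregate(total_disks=17, parity=2, spares=1),
    "passive": Aggregate(total_disks=3, parity=1, spares=1),
}

for name, layout in (("even split", even_split), ("active/passive", active_passive)):
    per_head = {head: agg.data_disks for head, agg in layout.items()}
    print(name, per_head, "total data disks:", sum(per_head.values()))
```

Under these assumptions both layouts leave 14 data disks for production use; in the active/passive case they all sit behind one controller, and the passive head's single data disk only holds its root volume.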

-- Adam Fox
Systems Engineer
adamfox[at]netapp.com



rvandolson at esri

Jun 2, 2009, 9:00 AM

Post #3 of 11 (2052 views)
Re: FAS2050C questions (clustering) [In reply to]

On Tue, Jun 02, 2009 at 08:54:35AM -0700, Fox, Adam wrote:
> You are correct that clustering is treated as two separate controllers
> which can take over for each other. You cannot vif across NICs on
> different controllers.
>
> The closest thing to active/passive would be to allocate at least 2
> (possibly 3 if you want a spare) disks to the "passive" controller and
> the rest to the active one. I'd set up a raid4 trad vol or aggregate for
> it: since you are only going to use 2 disks, you don't need raid_dp.
> Definitely use raid_dp on the active controller.
>
> Under this scenario, you can lose either controller head and still be
> running.

Ah, so we need to have disks assigned to the "passive" controller in an
aggregate configuration? What if I just split the disks up evenly,
would the aggregate on "active" controller shift down to be controlled
by the "passive" controller automatically?

Maybe this would be preferable to having 2 or 3 disks doing "nothing"
on the second head.

Thanks for the response.

>
> -- Adam Fox
> Systems Engineer
> adamfox[at]netapp.com

Ray


Adam.Fox at netapp

Jun 2, 2009, 9:03 AM

Post #4 of 11 (2046 views)
RE: FAS2050C questions (clustering) [In reply to]

You can split them if you like. I only suggested giving 2 disks to the
"passive" side if you wanted an active/passive config. If you want to
go active/active, then split them up. Just be aware that, depending on
your load, you may get spindle-bound at some point with so few disks
in your aggregate, but you may be fine until you get your new disks
later.

-- Adam Fox
Systems Engineer
adamfox[at]netapp.com




peta.spies at gmail

Jun 2, 2009, 9:04 AM

Post #5 of 11 (2050 views)
Re: FAS2050C questions (clustering) [In reply to]

You need more NICs (we hit this issue all the time).

Basically, set both filers up on both networks as if they were separate
filers. Give them separate IP addresses. If one head dies, the other head
will take over all connections and assume the "personality" of the dead
filer. So the IPs and the disks of the dead filer will all be visible on
the live one.

The heads have an internal interconnect that makes this possible.

And everything Adam said about the disks.
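
The takeover behaviour described above can be pictured with a small toy model. This is only an illustration of the idea, not how Data ONTAP implements it; the `Head` class, the `takeover` function, the aggregate names, and the 192.0.2.x addresses are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Head:
    """One controller in the pair, with its own addresses and disks."""
    name: str
    ips: set[str]
    aggregates: set[str]
    alive: bool = True
    # Identity assumed on behalf of a failed partner (empty in normal operation).
    partner_ips: set[str] = field(default_factory=set)
    partner_aggregates: set[str] = field(default_factory=set)

    def served_ips(self) -> set[str]:
        return (self.ips | self.partner_ips) if self.alive else set()

    def served_aggregates(self) -> set[str]:
        return (self.aggregates | self.partner_aggregates) if self.alive else set()


def takeover(survivor: Head, failed: Head) -> None:
    """The survivor assumes the failed head's "personality" (IPs and disks)."""
    if not survivor.alive:
        raise RuntimeError("survivor must be running to take over")
    failed.alive = False
    survivor.partner_ips |= failed.ips
    survivor.partner_aggregates |= failed.aggregates


head_a = Head("head_a", {"192.0.2.10"}, {"aggr_a"})
head_b = Head("head_b", {"192.0.2.11"}, {"aggr_b"})

takeover(survivor=head_a, failed=head_b)
print(sorted(head_a.served_ips()))
print(sorted(head_a.served_aggregates()))
```

The point of the model is that clients keep using the same addresses before and after a failure, which is why each head needs its own interfaces on every network the other head serves.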

Peta



jeremy.page at gilbarco

Jun 2, 2009, 9:11 AM

Post #6 of 11 (2046 views)
RE: FAS2050C questions (clustering) [In reply to]

Depending on your workload you could get a decent boost by splitting it
up since that means you'll have twice as much cache servicing your
requests. Keep in mind it's not just about the disks :)







rvandolson at esri

Jun 2, 2009, 9:58 AM

Post #7 of 11 (2045 views)
Re: FAS2050C questions (clustering) [In reply to]

On Tue, Jun 02, 2009 at 09:26:17AM -0700, Steve Francis wrote:
> You can get a performance boost by splitting it up.
> Which is why I don't like to do that, in general, for performance
> sensitive workloads. :-)
>
> Otherwise, in the event of a head failure, the surviving head may not
> have the CPU/cache to deal with the extra work - even if both heads
> are normally under 50% load.
> Most workloads grow linearly - until they hit an elbow and don't grow linearly.
> The only true way to be sure you have failover capacity is to run that
> way all the time.
>
> Your mileage/budget/workload/performance requirements may vary.
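
To make the "elbow" point concrete, here is a toy calculation. It assumes a simple queueing-style curve, 1 / (1 - utilization), purely for illustration; this is not a model of how a filer head actually behaves, and the 45% figures are invented.

```python
def relative_latency(utilization: float) -> float:
    """Toy response-time curve 1 / (1 - u); an assumption, not a filer model."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1); at 1.0 the head is saturated")
    return 1.0 / (1.0 - utilization)


def utilization_after_takeover(load_a: float, load_b: float) -> float:
    """One surviving head carries both workloads."""
    return load_a + load_b


# Hypothetical example: both heads normally run at 45% utilization.
normal = relative_latency(0.45)
failed_over = relative_latency(utilization_after_takeover(0.45, 0.45))

print(f"normal: {normal:.2f}x, after takeover: {failed_over:.2f}x")
```

In this toy model, doubling the load from 45% to 90% raises the relative latency by far more than a factor of two, which is the non-linear growth the quoted message warns about.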

Thanks all. Great advice. So, my question is: if the second head can
take over the personality of the failed head, why do I need to allocate
any disks at all to the second head to begin with? Just a design
thing?

I'll probably do the even split thing.... and look into ordering
additional NICs.

Can I have a "spare" that is available to either aggregate on either
filer? This way I could do two RAID-DP's on each head with one common
"spare" disk and rely on 4hr support to get me replacement disks
quickly.

Ray


Adam.Fox at netapp

Jun 2, 2009, 10:57 AM

Post #8 of 11 (2045 views)
RE: FAS2050C questions (clustering) [In reply to]

The reason is that the cluster is built as an active/active cluster and
so each controller must have a root volume, thus each needs a trad vol
or 1 aggregate. So the "active/passive" config isn't truly
active/passive, it's just functionally so.

There is no way currently to automatically float a spare. You could
manually change ownership of a spare disk, but it doesn't do this
automatically. ONTAP will whine in the messages file if there are no
spares, and you will probably want to up the raid.timeout option to give
you more time to move the spare over.

-- Adam Fox
Systems Engineer
adamfox[at]netapp.com



rvandolson at esri

Jun 2, 2009, 11:27 AM

Post #9 of 11 (2049 views)
Re: FAS2050C questions (clustering) [In reply to]

On Tue, Jun 02, 2009 at 10:57:07AM -0700, Fox, Adam wrote:
> The reason is that the cluster is built as an active/active cluster and
> so each controller must have a root volume, thus each needs a trad vol
> or 1 aggregate. So the "active/passive" config isn't truly
> active/passive, it's just functionally so.

Ah-ha... the light has turned on in my head. Makes sense.

>
> There is no way currently to automatically float a spare. You could
> manually change ownership of a spare disk, but it doesn't do this
> automatically. ONTAP will whine in the messages file if there are no
> spares, and you will probably want to up the raid.timeout option to give
> you more time to move the spare over.
>

Gotcha. I'll just have to think about the RAID-4 vs RAID-6 thing then.
Extra 300GB of space might be nice...

Thanks,
Ray


jack1729 at gmail

Jun 3, 2009, 3:15 AM

Post #10 of 11 (2006 views)
Re: FAS2050C questions (clustering) [In reply to]

Be careful about running without a spare on each head.
If the filer panics, you will not get the core dump unless you enable
the 'wait until core dump writes to the file system before failing over'
option, which then makes failover take a long time and will probably
cause other problems.

I am not sure about the 2050 series but assume this is the same as the
3000 series.

Jack



johnpc at xs4all

Jun 3, 2009, 7:25 AM

Post #11 of 11 (1995 views)
Re: FAS2050C questions (clustering) [In reply to]

On Tue, Jun 02, 2009 at 05:04:29PM +0100, Peta Spies wrote:
> You need more NICs (we hit this issue all the time).

Not necessarily. If you now have 2 NICs on both filers, simply VIF those
NICs, then configure a vlan trunk over that etherchannel, and create
vlan interfaces. Make sure to create interfaces for all vlans on both
filers, so they can fail over to each other.

This assumes that your management network is actually just a vlan on the
same infrastructure. If it's separate hardware, you do need more NICs.

Ray Van Dolson wrote:
> Gotcha. I'll just have to think about the RAID-4 vs RAID-6 thing then.
> Extra 300GB of space might be nice...

Don't. The extra headache you get when your performance goes down the
drain whenever you have a high priority RAID rebuild, or the chance of
data loss because there's a second failure during rebuild, REALLY
outweighs the cost of a single extra disk. RAID-DP uses low priority
rebuilds which take longer, but don't have such a big impact on your
production data (unless you get into double degraded mode - which would
have meant data loss on RAID-4).

Also, RAID-DP raid groups can be larger than RAID-4 raid groups, so you
can more easily extend your RAID-DP based volumes (or aggregates), and
if you do, RAID-DP is just as efficient as RAID-4, just vastly more
reliable. It does add a little more overhead, but if you want raw
performance, just use RAID-0, and recover from disk crashes using
"newfs" (we do that - but not on netapp hardware obviously - and you'll
have to add redundancy at another level).
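
The efficiency argument is simple arithmetic on parity overhead, sketched below. The group sizes of 8 and 16 are illustrative assumptions rather than figures from this thread; the 9-disk case corresponds to a 10-disk head with one spare held back.

```python
from fractions import Fraction


def parity_fraction(group_size: int, parity_disks: int) -> Fraction:
    """Share of a RAID group's disks that hold parity instead of data."""
    if group_size <= parity_disks:
        raise ValueError("a RAID group needs at least one data disk")
    return Fraction(parity_disks, group_size)


# Illustrative group sizes (assumptions): a small RAID-4 group against a
# RAID-DP group twice as large.
raid4_small = parity_fraction(group_size=8, parity_disks=1)     # 7 data + 1 parity
raid_dp_large = parity_fraction(group_size=16, parity_disks=2)  # 14 data + 2 parity

# With only 9 disks in the group, RAID-DP costs one extra disk compared to RAID-4.
raid4_nine = parity_fraction(group_size=9, parity_disks=1)      # 8 data + 1 parity
raid_dp_nine = parity_fraction(group_size=9, parity_disks=2)    # 7 data + 2 parity

print(raid4_small, raid_dp_large, raid4_nine, raid_dp_nine)
```

With a group twice the size, the second parity disk costs the same fraction of capacity as RAID-4's single parity disk; on a small group it costs exactly one extra disk, which is the space being weighed earlier in the thread.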

--
Jan-Pieter Cornet <johnpc[at]xs4all.nl>
!! Disclamer: The addressee of this email is not the intended recipient. !!
!! This is only a test of the echelon and data retention systems. Please !!
!! archive this message indefinitely to allow verification of the logs. !!
