
Mailing List Archive: Netapp: toasters

Physical reallocate for a deduped volume

 

 



matt.kilham at strattonfinance

Mar 19, 2012, 10:12 PM

Post #1 of 6 (1520 views)
Physical reallocate for a deduped volume

Hi Toasters,



I've posed the following question to NetApp support but wasn't too confident in the answer I received, so I was hoping the knowledgeable folks on this list could assist? Also, before I begin, please excuse any ignorance or NetApp faux pas on my behalf, I'm far from an expert at administering NetApp filers.

---

We have a FAS2050C running DOT 7.3.4. We recently added an external disk shelf to the filer and performed a bit of reorganisation, which has left us with two unused disks in the internal shelf. We want to add these two disks to an existing 16-disk aggregate by expanding its RAID group size from 16 to 18. The aggregate is currently ~80% full and contains 5 flexvols, most of which have de-dupe and snapshots enabled.

I understand that after expanding the RG / aggregate we need to run a physical reallocate or else we will end up with a significant "hot spot" on the newly-added disks. I also understand that we need to ensure DOT only does a physical redistribution of the blocks across all disks in the aggregate and does not try to otherwise rearrange them, otherwise we risk "un-deduping" our deduplicated data and/or greatly increasing the space consumed by our snapshots.
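To put a rough number on the "hot spot" concern, here is a back-of-the-envelope sketch in Python. It assumes equal-sized disks, ignores parity disks, and assumes the existing disks are uniformly ~80% full. The function name is made up for illustration, and the model is a simplification rather than a description of how WAFL actually places writes.

```python
def free_space_share_on_new_disks(old_disks, new_disks, fill_fraction):
    """Fraction of the aggregate's free space that sits on the newly added disks.

    Simplified model (an assumption): all disks are the same size, existing
    disks are uniformly `fill_fraction` full, new disks are empty, and
    parity disks are ignored.
    """
    free_on_old = old_disks * (1.0 - fill_fraction)
    free_on_new = new_disks * 1.0
    return free_on_new / (free_on_old + free_on_new)


share = free_space_share_on_new_disks(old_disks=16, new_disks=2, fill_fraction=0.80)
disk_share = 2 / 18
print(f"new disks: {disk_share:.0%} of spindles, {share:.0%} of free space")
```

Under these assumptions, two of the eighteen disks (about 11% of the spindles) would hold roughly 38% of the free space, which is why new writes would tend to concentrate there until existing blocks are redistributed.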

First question: I believe the command we need to run to perform the above style of reallocation is "reallocate start -f -p /vol/<volname>" - is that correct?

I noted, however, that there is a warning with regards to the "reallocate -p" command in the release notes of 7.3.x. Unfortunately I can't find a link anymore, but it reads:

"A file reallocation scan using reallocate start or reallocate start -p does not rearrange blocks that are shared between files by deduplication on deduplicated volumes. Since a file reallocation scan does not predictably improve read performance when used on deduplicated volumes, performing file reallocation on deduplicated volumes is not recommended. Instead, for files to benefit from the reallocation scan, they should be stored on volumes that are not enabled for deduplication."

I then queried with NetApp how we could perform a physical reallocate on the aggregate to avoid a hot spot, given the above statement that "reallocate start -p does not rearrange blocks that are shared between files by deduplication on deduplicated volumes", and almost all of our data is deduped.

After a bit of to-ing and fro-ing with NetApp, the final suggestion was as follows:

"I’ve talked to two senior resources so far about this. The suggested steps are as follows>>
#1: add your disks to the aggregate, and then grow the volume.
#2: Turn off Deduplication, ( no, this will not revert to the 2.5 gigs from the one gig ratio it is now and suddenly cause you to be out of space) this will only stop deduplication from this point forward until it is re-enabled later. Everything that is currently deduped will stay the ratio it is.
#3: Perform the reallocate –p command as specified in the KB’s sent earlier.
#4: After all above steps are performed and reallocate is done, re-enable deduplication on said volumes."
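Steps #2 to #4 of that suggestion can be written out as an ordered command list. The Python sketch below only builds the strings; it does not talk to a filer. The `reallocate start -f -p` form is the one from the first question above. Reading "turn off deduplication" as `sis off` / `sis on` is an assumption, since the support engineer's text does not name the commands, and the helper name and volume names are hypothetical.

```python
def reallocate_plan(volumes):
    """Return the 7-Mode console commands for steps #2-#4, in order.

    Step #1 (adding disks, growing volumes) is left out because the exact
    disk and size arguments depend on the system.
    """
    commands = []
    # Step 2: stop deduplicating new data on each volume
    # (assumed to map to `sis off`).
    commands += [f"sis off /vol/{v}" for v in volumes]
    # Step 3: forced physical reallocation, one scan per volume.
    commands += [f"reallocate start -f -p /vol/{v}" for v in volumes]
    # Step 4: re-enable deduplication. In practice this happens only after
    # every scan has finished, which this sketch does not check.
    commands += [f"sis on /vol/{v}" for v in volumes]
    return commands


for line in reallocate_plan(["vol1", "vol2"]):  # hypothetical volume names
    print(line)
```

The sketch says nothing about whether the scan will actually move already-deduplicated blocks on 7.3.x, which is the open question in this thread.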


Second question: Does this sound kosher - will this actually result in a physical reallocate being performed, or will the deduped blocks still be skipped because they're already de-duplicated, regardless of whether dedupe is enabled for new data on the volume or not?

Third question: If yes to the previous question, is there any risk that our deduped data get un-deduped by performing the suggested steps above?

Fourth and final question: Is there any way to measure how data is spread across disks in a Raid Group / aggregate, so that I can check that we get our expected results from the reallocate?
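As a rough illustration of what such a check could look like, the Python sketch below compares the busiest disk with the average disk. It assumes per-disk utilisation figures can be collected somehow (for example, the per-disk utilisation that `statit` reports in advanced privilege mode; the thread itself does not name a tool). The sample numbers are invented purely to show the calculation.

```python
from statistics import mean


def imbalance(per_disk_busy_pct):
    """Ratio of the busiest disk to the average disk; 1.0 means perfectly even."""
    avg = mean(per_disk_busy_pct)
    if avg == 0:
        return 1.0
    return max(per_disk_busy_pct) / avg


# Made-up sample figures: 16 older disks around 30% busy, 2 new disks at 75%.
before = [30.0] * 16 + [75.0] * 2
# Made-up figures for an evenly spread aggregate.
after = [35.0] * 18
print(round(imbalance(before), 2), round(imbalance(after), 2))
```

A ratio well above 1.0 that persists after the reallocate has finished would suggest the new disks are still carrying a disproportionate share of the load.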


Thanks everyone, really appreciate any feedback / assistance.


Cheers,
Matt


speedtoys.racing at gmail

Mar 19, 2012, 10:45 PM

Post #2 of 6 (1482 views)
Re: Physical reallocate for a deduped volume [In reply to]

Yup, it's the best you can do at this time.

DO NOT do a reallocate -A.

DO NOT.





--
---
Gustatus Similis Pullus


matt.kilham at strattonfinance

Mar 19, 2012, 11:58 PM

Post #3 of 6 (1489 views)
re[2]: Physical reallocate for a deduped volume [In reply to]

Hi Jeff & Toasters,


> Yup, its the best you can do at this time.

Thanks.

Do you specifically know that this will work as intended though? e.g., as per my original e-mail: "will this actually result in a physical reallocate being performed, or will the deduped blocks still be skipped because they're already de-duplicated, regardless of whether dedupe is enabled for new data on the volume or not?"

I'm re-querying because I struggle to see why de-dupe being on/off for new blocks would affect the ability to reallocate, whereas I can imagine scenarios where blocks /already/ being deduplicated could affect the ability to reallocate.


> DO NOT do a reallocate -A.

> DO NOT.

Thanks for that Jeff - we'll be very careful to avoid the -A option :-)


Cheers,
Matt


_______________________________________________
Toasters mailing list
Toasters [at] teaparty
http://www.teaparty.net/mailman/listinfo/toasters


fcocquyt at stanford

Mar 20, 2012, 7:00 AM

Post #4 of 6 (1483 views)
Re: Physical reallocate for a deduped volume [In reply to]

I had a case open with Netapp and the engineer came back with this:

Data ONTAP® 7.3 System Administration Guide:
https://now.netapp.com/NOW/knowledge/docs/ontap/rel7351/pdfs/ontap/sysadmin.pdf

What a reallocation scan is, Page 315

“A file reallocation scan using reallocate start or reallocate start -p does not rearrange blocks that are shared between files by deduplication on deduplicated volumes. Because a file reallocation scan does not predictably improve read performance when used on deduplicated volumes, it is best not to perform file reallocation on deduplicated volumes. If you want your files to benefit from a reallocation scan, store them on volumes that are not enabled for deduplication.”

--

We ended up reallocating (in effect) when we upgraded to 8.1, using Data Motion to migrate to fresh aggregates (between 2 clusters).
But I'd be interested to know if you get a different answer (and a net benefit from reallocation on a dedup'ed volume).

thanks






speedtoys.racing at gmail

Mar 20, 2012, 11:06 AM

Post #5 of 6 (1472 views)
Re: Physical reallocate for a deduped volume [In reply to]

Correct, but we're not using reallocate here for its _intended_ purpose.

We're using it for what it happens to do as a benefit to WAFL free space
management.





--
---
Gustatus Similis Pullus


matt.kilham at strattonfinance

Mar 20, 2012, 8:20 PM

Post #6 of 6 (1481 views)
re[4]: Physical reallocate for a deduped volume [In reply to]

Hi Mike, Fletcher & the toasters list,

> This should explain everything about reallocation, have fun.

Thanks for the document. I have read it before, but it's been updated since the last read, so I had a look over it again. The latest version now says:

"4.4 DEDUPLICATION AND COMPRESSION
Starting in Data ONTAP 8.1 deduplicated data can be reallocated using physical reallocation or read_realloc space_optimized. Although data may be shared by multiple files when deduplicated, reallocate uses an intelligent algorithm to only reallocate the data the first time a shared block is encountered. Prior versions of Data ONTAP do not support reallocation of deduplicated data and will skip any deduplicated data encountered."

Bearing in mind we're on 7.3.4 this just serves to reinforce my suspicion that the criteria for reallocate to ignore deduplicated data is whether each individual block is already deduplicated, not whether dedupe is currently enabled for the volume or not. Hence, if I'm correct, the suggested procedure from NetApp won't help us as any existing deduped data will simply be ignored (meaning basically none of our data will be reallocated).

Does anyone have any information to the contrary - any information suggesting that I'm wrong?

The big issue for us is that we can't test this - by the time we find out if the reallocate does what we want it to do we will already have grown the aggregate, and from what I understand there's no "going back" at this point.

> I had a case open with Netapp and the engineer came back with this:
>
> Data ONTAP® 7.3 System Administration Guide:
> https://now.netapp.com/NOW/knowledge/docs/ontap/rel7351/pdfs/ontap/sysadmin.pdf
>
> What a reallocation scan is, Page 315
>
> “A file reallocation scan using reallocate start or reallocate start -p does
> not rearrange blocks that are shared between files by deduplication on
> deduplicated volumes. Because a file reallocation scan does not predictably
> improve read performance when used on deduplicated volumes, it is best not to
> perform file reallocation on deduplicated volumes. If you want your files to
> benefit from a reallocation scan, store them on volumes that are not enabled
> for deduplication.”

Thanks. More contradictory information from NetApp though unfortunately :-( The start of that paragraph again sounds to me like the criteria is whether a block is currently deduped or not, but then the last sentence specifically says "...on volumes that are not ENABLED for deduplication". Any of the NetApp engineers on this list care to throw in their two cents?

> We ended up reallocating (in effect) when we upgraded to 8.1 using data motion to migrate to fresh aggregates (btw 2 clusters)

I'd love to be able to do it this way, but we don't have the spare disks to be able to create a new aggregate and then move our data to it. As a fallback, we could /maybe/ shuffle data back and forth between a second existing aggregate and the first aggregate (the one we want to expand). I assume this would have something like the effect of physically reallocating the blocks as the data is moved back to the primary aggregate? Would be a giant PITA though.

> But I'd be interested to know if you get a different answer (and net benefit from reallocation on a dedup'ed volume)

I'll certainly let the list know what the final outcome is, if any.

Cheers,
Matt
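
The difference described in the quoted section 4.4 (pre-8.1 skips deduplicated data; 8.1 relocates a shared block the first time it is encountered) can be pictured with a toy model in Python. This is only an illustration of the wording in that passage under simplifying assumptions; it is not ONTAP's actual algorithm, and the function, file names, and block IDs are invented.

```python
def blocks_moved(files, skip_shared):
    """Toy model of a physical reallocation pass over deduplicated files.

    `files` maps file name -> list of block IDs; a block ID that appears
    more than once across all files is treated as "shared" (deduplicated).
    Returns the ordered list of block IDs the scan would relocate.
    """
    refcount = {}
    for blocks in files.values():
        for b in blocks:
            refcount[b] = refcount.get(b, 0) + 1

    moved, seen = [], set()
    for blocks in files.values():
        for b in blocks:
            if b in seen:
                continue  # never move the same block twice
            if skip_shared and refcount[b] > 1:
                continue  # pre-8.1 wording: deduplicated blocks are skipped
            seen.add(b)
            moved.append(b)
    return moved


files = {"a.vmdk": [1, 2, 3], "b.vmdk": [2, 3, 4]}  # blocks 2 and 3 are shared
print(blocks_moved(files, skip_shared=True))   # [1, 4]
print(blocks_moved(files, skip_shared=False))  # [1, 2, 3, 4]
```

In this model the skip decision depends on whether a block is already shared, not on whether deduplication is currently switched on for the volume, which matches the suspicion raised above; whether 7.3.4 really behaves that way is not settled in this thread.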
