
klises at sutterhealth
May 17, 2012, 10:33 AM
Post #3 of 3
RE: Reallocate redux: Reallocation and de-dupe
Not to beat a dead horse, but here is a follow-up question. I have mostly Windows VMs, and most were not initially aligned (I still have ~60 VMs to go out of a few hundred). I read in the DOT 8 7-Mode System Administration Guide (pg. 322) that you should set up a reallocate job immediately after creating the LUN. We run VSC and take snapshots daily. I am going to run the following command on each volume (I have one volume with one LUN inside it):

reallocate -f -p /vol/vmware_vmware_01_sata

Do most folks run reallocate, and if so, how often? These are standard Windows (W2K3, W2K8) servers. I know the mileage will vary here.

Thanks for posting this, Peta.

-----Original Message-----
From: toasters-bounces [at] teaparty [mailto:toasters-bounces [at] teaparty] On Behalf Of Jeff Mother
Sent: Sunday, May 13, 2012 5:22 PM
To: Peta Thames
Cc: Toasters [at] teaparty
Subject: Re: Reallocate redux: Reallocation and de-dupe

:)

Sent from my iPhone

On May 13, 2012, at 4:54 PM, Peta Thames <petathames [at] gmail> wrote:

> Hi all,
>
> So I've been able to reallocate about 20% of our luns so far, and it's
> already made a huge improvement. I still see some huge latency spikes
> (up to 500ms) on some luns, mostly ones I haven't reallocated yet, but
> the spikes I originally described, where I had spikes on every volume
> in the aggregate, sometimes the entire filer, at the same time, have
> disappeared. Our VMware guy and the Citrix team are also reporting
> performance improvements. I must admit, I'm surprised by how quickly
> reallocate has improved this environment, to the point where users are
> already noticing it, and I'm nowhere near finished. If you've never
> checked your reallocation measurement, I highly recommend it! Thanks
> especially to Jeff Mohler and Fletcher Cocquyt (I found your blog
> posts through google before you'd emailed and thank you for writing
> them, they are really good).
>
> I was thinking over the weekend, as you do, about why I'd sometimes
> see spikes on multiple aggregates at once, and a common denominator
> there would be the backend loops. CPU has never been an issue. Does
> it make sense that the loops become a bottleneck on systems with
> severe fragmentation? Everything on the filer queues up there behind
> the IO going to one disk?
>
> Anyway, the question I really would like answered this time is: How
> best to balance reallocation and de-dupe. On Friday I reallocated a
> lun that's been de-duplicated, and the measurement changed from 6,
> hotspot 0, to 4, hotspot 28. Obviously the hotspot is due to the
> de-duplicated blocks. So, is it better to leave a higher reallocate
> measure (in this case, 6), with no hotspot, or better to reallocate
> these luns to lower the overall measure, even though it creates a
> horrible hotspot?
>
> Thanks,
> Peta
>
> On 30 April 2012 03:29, Milazzo Giacomo <G.Milazzo [at] sinergy> wrote:
>>
>> At last an answer that makes sense :-)
>>
>> Let me explain it better. I’m referring to what Fletcher wrote: “It was maddening to me back in 2010 how netapp support could blockade support cases with a blanket "must align VMs first" without a real quantification of the impact of misalignment – see”
>>
>> We had some critical cases of these “performance” issues, and in both cases the first thing support asked us to do was: realign!
>>
>> Well. After a long and tedious realignment process, things were better… on average 5% better, no more! And don’t let the customer see graphs coming from the CMPG portal! Something like the one attached… this could be pure terrorism :-D and a lot of nightmares for you trying to explain (and understand) the meaning of the buckets :-)
>>
>> Alignment is important… but NOT SO important. I concentrate my investigation on other levels first.
>>
>> Regards,
>>
>> From: toasters-bounces [at] teaparty [mailto:toasters-bounces [at] teaparty] On Behalf Of Fletcher Cocquyt
>> Sent: Friday, April 27, 2012 08:46
>> To: Peta Thames
>> Cc: Toasters [at] teaparty
>> Subject: Re: Reallocation and VMware
>>
>> Peta - we were dealing with this very issue (unexplained latency spikes Netapp blamed on VM misalignment)
>> back in 2010 - I wrote up how we deconstructed the IOPs after many wasted perfstat iterations
>> to solve it pretty much on our own:
>>
>> http://www.vmadmin.info/2010/07/vmware-and-netapp-deconstructing.html
>>
>> It was maddening to me back in 2010 how netapp support could blockade support cases with a
>> blanket "must align VMs first" without a real quantification of the impact of misalignment - see
>>
>> http://www.vmadmin.info/2010/07/quantifying-vmdk-misalignment.html
>>
>> We ended up taking the downtime back then to align all VMs
>> But now, I would be one to encourage your making the leap to 8.x - we are on 8.1GA and we are not looking back.
>> The data motion of vFilers is allowing us to upgrade clusters with no downtime
>>
>> http://www.vmadmin.info/2012/04/meta-storage-vmotion-netapp-datamotion.html
>>
>> They have me almost believing in cluster mode for scale out...
>>
>> On Apr 25, 2012, at 11:40 PM, Peta Thames wrote:
>>
>> Hi Jack,
>>
>> You're right, and I should have mentioned it before. Large numbers of
>> the VMDKs are misaligned. I'd estimate about 33%, but I don't know
>> exactly how many as the shiny new VSC scanner got stuck halfway
>> through the scan I ran, leaving several VMs in a "being scanned"
>> state. I have a case open with Netapp to find out how to get those
>> VMs out of that state so I can a) continue the scan b) schedule fixing
>> the misaligned luns.
>>
>> Not all the luns that have large latency spikes are misaligned
>> however.
>> Mind you, by the same token, not all of them are fragmented,
>> although so far (I'm still getting through measuring them all) there's
>> definitely a strong correlation.
>>
>> I also have to admit that I read the scale wrong in perf advisor, and
>> the numbers I'm seeing are in microseconds, not milliseconds. Still
>> way more than the 10ms I would like, but an order of magnitude better!
>>
>> Peta
>>
>> On 26 April 2012 15:52, Jack Lyons <jack1729 [at] gmail> wrote:
>> >> Have you checked the alignment of the VMDK's?
>> >>
>> >> Jack
>> >> Sent from my Verizon Wireless BlackBerry
>> >>
>> >> -----Original Message-----
>> >> From: Peta Thames <petathames [at] gmail>
>> >> Sender: toasters-bounces [at] teaparty
>> >> Date: Thu, 26 Apr 2012 14:49:43
>> >> To: <Toasters [at] teaparty>
>> >> Subject: Reallocation and VMware
>> >>
>> >> Hi all,
>> >>
>> >> I'd like to pick your collective brains about your experiences with
>> >> reallocate, specifically when reallocating luns under VMware.
>> >>
>> >> For background, we're running ONTAP 8.0.1 on a 3170 that's over three
>> >> years old. I've been going through measuring reallocation, and most
>> >> of the volumes are over 3. We have no snapshots, and only a
>> >> relatively small number of volumes are de-duplicated. All our volumes
>> >> and luns are thin-provisioned, and no aggregate is more than 76% full
>> >> (most are ~65%). We regularly have huge latency spikes (worst I've
>> >> seen so far is 5000000ms, and there are far too many to even track
>> >> over 50000ms daily), and on one filer head, but not its partner, I
>> >> regularly see disk utilisation go to 100% or more. I'm hoping
>> >> reallocate will help here.
>> >>
>> >> I have a brief note from a NetApp support person who says "It’s very
>> >> important that you complete the reallocation in the following order:
>> >> 1:OS 2:LUN 3: Volume".
>> >>
>> >> I have two questions about this:
>> >> - is it absolutely necessary to defrag the OS before you reallocate
>> >> the lun? I'm sure I've run reallocate without defragging the OS and
>> >> still seen performance improvements. I'm also assuming that this is
>> >> only relevant to Windows VMs, not Linux (in our case, Red Hat/CentOS)
>> >> ones.
>> >> - if you only have one lun per volume, do you still need to run
>> >> reallocate on both the lun and the volume? If only one, which is
>> >> preferable?
>> >>
>> >> All advice appreciated.
>> >>
>> >> Thanks,
>> >> Peta
>> >>
>> >> _______________________________________________
>> >> Toasters mailing list
>> >> Toasters [at] teaparty
>> >> http://www.teaparty.net/mailman/listinfo/toasters
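
For anyone who wants to script the "one reallocate per volume" step described at the top of this post, here is a minimal Python sketch. It only builds the command strings so they can be reviewed before being pasted onto the filer; it does not connect to anything or run anything. Assumptions, all mine and not from the thread: the `reallocate start -f -p <path>` form is my understanding of the 7-Mode syntax (the post writes the command without `start`), the second volume name is made up for illustration, and the helper name `build_reallocate_commands` is hypothetical. Please check the command form against the reallocate man page for your ONTAP release before using it.

```python
def build_reallocate_commands(volumes, template="reallocate start -f -p {path}"):
    """Return one command string per volume name. Nothing is executed.

    `template` is an assumed command form; adjust it to match the syntax
    documented for your Data ONTAP release.
    """
    commands = []
    for name in volumes:
        name = name.strip()
        if not name:
            continue  # skip blank entries, e.g. from a pasted list
        # Accept either a bare volume name or a full /vol/ path.
        path = name if name.startswith("/vol/") else "/vol/" + name
        commands.append(template.format(path=path))
    return commands


if __name__ == "__main__":
    # The first name is from the post; the second is a made-up example.
    vols = ["vmware_vmware_01_sata", "/vol/vmware_vmware_02_sata"]
    for cmd in build_reallocate_commands(vols):
        print(cmd)
```

Keeping the command form in a single `template` argument means that if the syntax differs on your system, there is only one string to change.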