
Darren.Sykes at csr
Sep 1, 2009, 3:26 AM
Post #3 of 8
RE: SQL 2005 reacts badly to a cluster giveback ?
The ARP cache issue wouldn't really explain why Exchange reacts better.
However, I suppose you could verify that theory by attempting a failover
on a cluster that is not on the same subnet as the iSCSI client, or by
decreasing the ARP timeout (an entry under
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
IIRC).

Darren

-----Original Message-----
From: owner-toasters [at] mathworks [mailto:owner-toasters [at] mathworks] On Behalf Of Filip Sneppe
Sent: 31 August 2009 17:26
To: Raj Patel
Cc: toasters [at] mathworks
Subject: Re: SQL 2005 reacts badly to a cluster giveback ?

Hi,

I can add a "me too" message to this post. I've had more or less the
same experience at two customer sites (albeit on physical machines),
where I ran into issues with the MSSQL servers and their iSCSI disks.
I can't say that I've experienced the same sort of problems with e.g.
Exchange setups. Generally, when things are set up correctly wrt. disk
timeouts, everything works fine.

The SQL setups I had issues with have more recent versions of the MS
iSCSI initiator (around 2.05/2.06 iirc), and I've also thought about
upgrading to a more recent version.

One thing I came across when investigating is that Windows can have a
very large ARP caching timeout, and during one test, it took the
Windows SQL box until long after the filer had booted before the new
MAC address was learned from the network. I think Windows 2000 and 2003
can cache an ARP entry for up to 10 minutes, so I really don't know how
a disk timeout of 190 seconds is theoretically sufficient for NetApp
cluster failovers.

So I would like to know if anyone has experienced the same sort of
things, in particular with MS SQL servers and iSCSI.

Regards,
Filip

On Mon, Aug 31, 2009 at 2:05 AM, Raj Patel <phigmov [at] gmail> wrote:
> Hi.
>
> We've had a couple of cluster-failover events on our FAS270c (watchdog
> errors every time) on 7.2.5.1
>
> The failover is fine (AFAIK) when one of the nodes reboots. However, in
> the giveback it appears that the SQL server has a couple of initiator
> error events logged, and although the drives are visible (and working in
> terms of I/O) and the SQL services are still running, any SQL-dependent
> applications just don't work after the giveback. As soon as I stop/start
> the SQL services it's all back to normal (or I reboot the box).
>
> Server is Windows 2003 SP2; it's a VM on ESX 3.5. The iSCSI interface
> goes through a dedicated iSCSI NIC (a virtual switch which also carries
> the ESX iSCSI LUNs). SnapDrive is 6.01, iSCSI initiator is 2.03, and
> it's a 32-bit VM.
>
> Oddly, Exchange didn't miss a beat (they're physical Windows 2008 64-bit
> servers) but SQL was definitely unhappy (even though the SQL service
> itself carried on, i.e. it didn't stop).
>
> Any ideas ? I note there's a newer iSCSI initiator available (2.08) from
> Microsoft. I'm pretty sure we haven't had this giveback issue with our
> old SnapDrive 4.2.1 setup on the same server.
>
> Thanks in advance,
> Raj.
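
For anyone wanting to check Filip's arithmetic on their own host, here is a rough sketch in Python. It compares the ARP cache lifetime with the disk timeout and reports how long a stale entry could outlive it. The 600 s and 190 s defaults are the figures quoted in this thread. The value names `ArpCacheMinReferencedLife` and `TimeOutValue` are assumptions from memory of the Windows 2000/2003 registry layout (nobody in the thread named them), so verify them against Microsoft's documentation before relying on this. `stale_window` and `read_dword` are made-up helper names.

```python
import sys

# Figures quoted in the thread: up to 10 minutes of ARP caching (Filip's
# recollection) against a 190-second disk timeout.
ARP_CACHE_MAX_S = 10 * 60
DISK_TIMEOUT_S = 190

# Key path given by Darren; the Disk key and both value names used below
# are assumptions, not something stated in the thread.
TCPIP_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"


def stale_window(arp_cache_s, disk_timeout_s):
    """Seconds a stale ARP entry could outlive the disk timeout (0 if none)."""
    return max(0, arp_cache_s - disk_timeout_s)


def read_dword(key_path, value_name, default):
    """Read a DWORD under HKLM; return default off Windows or if unset."""
    if sys.platform != "win32":
        return default
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _type = winreg.QueryValueEx(key, value_name)
            return int(value)
    except OSError:
        # Missing key or value: fall back to the figure from the thread.
        return default


if __name__ == "__main__":
    arp = read_dword(TCPIP_KEY, "ArpCacheMinReferencedLife", ARP_CACHE_MAX_S)
    disk = read_dword(DISK_KEY, "TimeOutValue", DISK_TIMEOUT_S)
    print(f"ARP cache life {arp}s, disk timeout {disk}s, "
          f"possible exposure {stale_window(arp, disk)}s")
```

This is only a worst-case comparison. A stale entry matters only if the MAC address actually changes on giveback and the host does not pick up the new one sooner, so a non-zero result is a hint about where to look, not a diagnosis.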