
Mailing List Archive: Netapp: toasters

oracle corruption caused by NFS file system is full

 

 



YLi at ea

Dec 5, 2008, 10:33 AM

Post #1 of 11 (3158 views)
Permalink
oracle corruption caused by NFS file system is full

Our firm runs Oracle 9i on an NFS file system from a NetApp filer (3040 running 7.2). We have now seen a few cases where, when the NFS file system fills up, the Oracle database crashes and cannot even start afterwards.
It turns out that the database has logical corruption. Has anybody experienced similar issues?

Thanks


lists at up-south

Dec 5, 2008, 11:28 AM

Post #2 of 11 (3041 views)
Permalink
Re: oracle corruption caused by NFS file system is full [In reply to]

more details please




--
--
Chaim Rieger
www.jravel.com


dleeds at edmunds

Dec 5, 2008, 11:35 AM

Post #3 of 11 (3049 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

This would occur on any storage IMHO. If your filesystem is 100% full, nothing can be written to it, and if that filesystem contains things like redo logs, archive logs, and/or system tables, bad things generally occur.

My first question would be: why are they reaching 100%, and do you have any filesystem monitoring in place? There is no reason your Oracle filesystems should be reaching 100%.

To me this is not a NetApp/NFS issue but an operations issue: it would occur on NFS, SAN, or local disk.
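
The filesystem monitoring asked about above can be as small as a scheduled script. What follows is a minimal sketch in Python, not anything posted in this thread: the function names, the 80% warning level used as a default, and any mount path you pass in are illustrative assumptions. On an NFS mount the figures are whatever the server reports to the client, so they may not match what the filer itself shows.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a zero size; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def check_mount(path, warn_at=80.0):
    """Return (needs_attention, percent) for the filesystem holding `path`.

    `warn_at` is an assumed warning level, not a vendor recommendation.
    """
    pct = percent_used(path)
    return pct >= warn_at, pct


if __name__ == "__main__":
    # "." stands in for a hypothetical Oracle data mount point.
    alert, pct = check_mount(".")
    print(f"{pct:.1f}% used, alert={alert}")
```

Run from cron or a monitoring agent, a check like this gives warning well before the volume reaches 100%.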

--daniel


--
Daniel Leeds
Manager, Storage Operations
Edmunds, Inc.
1620 26th Street, Suite 400 South
Santa Monica, CA 90404

310-309-4999 desk
310-430-0536 cell





lists at up-south

Dec 5, 2008, 11:44 AM

Post #4 of 11 (3059 views)
Permalink
Re: oracle corruption caused by NFS file system is full [In reply to]

First rule is, *never* let your database filesystem get full! At the
very least, make sure your Redo logs are on a separate filesystem that
can *never* get full. This way any data changes resulting from your
datafiles running out of space (or other corruption) can be reversed
safely.




--
--
Chaim Rieger
www.jravel.com


Paul.Brosseau at netapp

Dec 5, 2008, 11:47 AM

Post #5 of 11 (3056 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

Like any other storage system, whether it is NAS (NFS, CIFS) or SAN
(SCSI, FC, iSCSI, FCoE), you should never, ever, ever allow it to become
completely full, especially when using structured apps like databases.
Oracle crashes when the file system gets full (it would do the same with
a SAN-attached LUN) because Oracle constantly needs to have disk space
available to write to. Once full and crashed, it can't start up for the
same reason: there is no space available to write anything (data, logs,
control information updates, etc.).



The simple solution is not to let that happen to you. If there is space
available in the containing aggregate you can use the volume autogrow
option to increase the size of the exported volume in definable
increments up to a definable maximum. Eventually though, you will run
out of space or reach the maximum and the same thing will happen.
Whenever your aggregate gets 80% full or higher you should start the
process to either free up space or add capacity to the system. Once the
system aggregate goes above 90% full you're in the red zone. If you let
it get more than 95% full that's your bad.



The percentage figures quoted here are my own professional opinion.
They are not company policy numbers endorsed by NetApp or any other
storage vendor (although I doubt they would disagree).
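
The rule of thumb above can be written down as a small classifier. This is a sketch only: the function name and the label strings are made up for illustration, and the 80/90/95 cut-offs are the personal-opinion figures from this post, not vendor policy.

```python
def classify_aggregate(percent_full):
    """Map an aggregate's fullness to the action levels described above."""
    if not 0 <= percent_full <= 100:
        raise ValueError("percent_full must be between 0 and 100")
    if percent_full > 95:
        return "critical"   # more than 95% full
    if percent_full > 90:
        return "red zone"   # above 90% full
    if percent_full >= 80:
        return "act now"    # 80% or higher: free up space or add capacity
    return "ok"


if __name__ == "__main__":
    for pct in (50, 80, 91, 96):
        print(pct, classify_aggregate(pct))
```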



Hope this helps

Paul Brosseau
Systems Engineer
N/A East - Chesapeake Dist.
301-351-5165 Mobile
Paul.Brosseau [at] netapp
www.netapp.com







YLi at ea

Dec 5, 2008, 11:50 AM

Post #6 of 11 (3055 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

We have had experience with SAN-attached devices; normally the database will just start fine after you increase the file system capacity.



gdekhayser at voyantinc

Dec 8, 2008, 10:49 AM

Post #7 of 11 (3017 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

Just because the database *may* start up OK after such an event, by no means should you assume that it *should* start up OK. The ability to come back up is completely dependent upon what state the database was in (where it was in transactions, etc.) when the filesystem filled up. The more active the database was when the filesystem choked, the more likely it is that you'll have corruption that you can't recover from.

I completely agree with the assessments of everyone else on the list- NEVER let your filesystem fill up.

That being said- you should look at the volume "auto grow" which will (sorry) automatically grow the volume when it hits certain thresholds. Could prevent disaster in the future.

Glenn Dekhayser
VoyantStrategies



silkey at ece

Dec 8, 2008, 11:10 AM

Post #8 of 11 (3012 views)
Permalink
Re: oracle corruption caused by NFS file system is full [In reply to]

If you log Oracle to the same vol with Oracle data, ruh roh.

I recommend splitting Oracle online (and archive) logs to their own
volumes to mitigate damage should Oracle data vols fill. This should
allow you to replay to a consistent point-in-time once you've resolved
the full data volume problem. I also recommend multiplexing the logs
to NAS and local disk just to be paranoid.

You can also flip a bit on your filer (vol autosize; done on a
per-volume basis) to auto-grow a volume by a specified value should you
hit a threshold. This assumes you have spare aggr capacity. Use wisely.

Hope this helps you avoid future issues.

--
Nick Silkey


mzito at gridapp

Dec 8, 2008, 12:07 PM

Post #9 of 11 (2999 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

This is also why you don't enable autoextend on tablespaces, and put
your archive logs on a separate filesystem. Space management is part of
the day to day responsibility of a DBA. Running out of space is a Very
Bad Thing for databases.



Thanks,

Matt





YLi at ea

Dec 8, 2008, 1:30 PM

Post #10 of 11 (3002 views)
Permalink
RE: oracle corruption caused by NFS file system is full [In reply to]

Thanks for everybody's input; I have received a lot of replies. The best thing to do is obviously not to let the volume fill up.
In our scenario, these are test/development databases, all the Oracle files are on a single NFS volume, and the database is not running in archive log mode. The question so far is:


Is corruption normal behavior after an Oracle file system fills up, or should it never happen? Could this be anything particular to NFS, since we never observed this on SAN?

Jackie


Thanks
-----Original Message-----
From: ericgrancher [at] gmail [mailto:ericgrancher [at] gmail] On Behalf Of Eric Grancher
Sent: Monday, December 08, 2008 1:13 PM
To: Matthew Zito; Glenn Dekhayser; Li, Jackie (Yanhui); toasters [at] mathworks
Subject: Re: oracle coruption caused by NFS file system is full

dear all,

(being usually quite silent on the list, I take this occasion to say
that I very much appreciate the contributions, and for once I can
contribute...)

From my understanding and experience (we got hit once at CERN when we
started using DOT devices/NFS for Oracle databases, then we understood
the issue and took the right actions to prevent this from happening
again), the issue can typically be linked with snapshots: once one (or
more) snapshots have been created on the volume and Oracle needs to
update a block, the update requires more space, because the old copy
has to be kept (to preserve the snapshot). If no space is available,
the write operation is refused, which Oracle sees as a "corruption".

It is quite well described in TR-3633, pages 22-24.
http://media.netapp.com/documents/tr-3633.pdf
The operations to be performed (after having added space) to recover
from this situation depend on the type of the affected file; the
cases are also listed in the mentioned document.
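
The snapshot effect described above can be illustrated with a toy model. This is a sketch under loud assumptions: `ToyVolume` and all of its methods are invented for illustration, blocks are simple counters, and nothing here reflects how Data ONTAP actually lays out or accounts for blocks. It only shows why overwriting existing data, which normally needs no extra room, can fail with "no space" once a snapshot pins the old copy.

```python
import errno


class ToyVolume:
    """Toy volume where snapshots keep old block versions alive."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.active = {}              # logical block number -> block id
        self.snapshot_blocks = set()  # block ids pinned by snapshots
        self._next_id = 0

    def used_blocks(self):
        # Space is consumed by live blocks plus blocks only snapshots hold.
        return len(set(self.active.values()) | self.snapshot_blocks)

    def snapshot(self):
        # A snapshot pins every block that is currently live.
        self.snapshot_blocks |= set(self.active.values())

    def delete_snapshots(self):
        self.snapshot_blocks.clear()

    def write(self, lbn):
        old = self.active.get(lbn)
        # An overwrite is space-neutral only if no snapshot pins the old copy.
        reusable = old is not None and old not in self.snapshot_blocks
        if not reusable and self.used_blocks() >= self.capacity:
            raise OSError(errno.ENOSPC, "no space left on volume")
        self.active[lbn] = self._next_id
        self._next_id += 1
```

In this model a completely full volume still accepts overwrites until a snapshot is taken; after that, the same overwrite is refused until space is added or the snapshot is removed.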

It may actually not be linked with auto-extensibility of Oracle
datafiles (which we use extensively without problem). Just updating
redo-log files and/or changing blocks in datafiles can generate the
issue.

Regarding actions to avoid such an issue happening again, it very much
depends on your usage of snapshots:

-1- if the snapshots are not used, removing the snapshots and
disabling the automatic creation of new snapshots is enough!
-2- using the vol autosize feature and making sure that there is
enough space at the aggregate level (and that max-autosize is not
reached) is a possible solution (DFM is our friend for this!)
-3- using the vol option try_first
http://now.netapp.com/NOW/knowledge/docs/ontap/rel7251/html/ontap/cmdref/man1/na_vol.1.htm
together with snap autodelete as described in
http://now.netapp.com/NOW/knowledge/docs/ontap/rel7251/html/ontap/cmdref/man1/na_snap.1.htm
could be a solution (which we do not use; be careful with your
snapshots if they are a key part of your backup strategy!)

best regards,
eric



jack1729 at gmail

Dec 8, 2008, 4:36 PM

Post #11 of 11 (3019 views)
Permalink
Re: oracle corruption caused by NFS file system is full [In reply to]

Configure and monitor the autogrow carefully: it won't notify you,
and you could grow your volume until the aggr is full. (Spoken from
experience. :-) )

Also, note that it is not an 'on-demand' growth. The system polls the
size periodically - iirc every minute.
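
Why polled growth is not the same as on-demand growth can be seen in a small simulation. This is a sketch with invented names and numbers: `seconds_until_full`, the sizes, the write rates, and the 90% trigger are all assumptions for illustration; only the idea of a periodic check (recalled above as roughly once a minute) comes from this post.

```python
def seconds_until_full(capacity, used, write_per_sec, poll_interval,
                       grow_by, threshold_pct=90, duration=600):
    """Return the second at which the volume fills, or None if it never does.

    Growth is only considered at poll times, never at the moment of a write.
    """
    for t in range(1, duration + 1):
        used += write_per_sec
        if used >= capacity:
            return t  # filled up before any poll could react
        # Integer comparison avoids floating-point rounding at the threshold.
        if t % poll_interval == 0 and used * 100 >= capacity * threshold_pct:
            capacity += grow_by
    return None


if __name__ == "__main__":
    # Slow writer: polled growth keeps ahead of the data.
    print(seconds_until_full(1000, 900, 1, 60, 200))
    # Fast writer: the volume fills before the first poll.
    print(seconds_until_full(1000, 900, 5, 60, 200))
```

With these made-up figures the slow writer never fills the volume, while the fast writer fills it 20 seconds in, well before the first check. A burst of writes can therefore still hit a full volume even with autogrow enabled.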


--
Sent from my mobile device
