Mailing List Archive: DRBD: Users

oracle on drbd failed

 

 


xiaozunvlg at gmail

Aug 25, 2012, 8:10 PM

Post #1 of 12 (811 views)
oracle on drbd failed

Hi All:

I built a cluster to protect an Oracle database. The Oracle database files
are stored on a DRBD (8.3.13) device using protocol A. But sometimes
Oracle cannot fail over when the primary node goes down. Here are the
test steps:

1. Two nodes, A and B: A is the primary, B is the secondary. Oracle runs on
node A and executes SQL that inserts a large amount of data.
2. On node B, run the following loop to simulate a failure of
node A:


while true; do

# break the network link with iptables

# disconnect drbd0 and promote it to primary

drbdadm disconnect drbd0
drbdadm primary drbd0

# mount the device and start Oracle
....

# if the start failed, break
...

# stop Oracle and unmount drbd0


# restore the network link

drbdadm connect drbd0
drbdadm -- --discard-my-data connect drbd0

sleep 5
done


After several loops, Oracle cannot be started, and one of the following
errors appears in alert_<SID>.log:



ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr]

or

ORA-00353: log corruption near block 68622 change 39685781 time
08/25/2012 16:06:42


According to Oracle's MetaLink, the first error means that a power
failure caused logical corruption in the controlfile. The second error
means that there is corruption in a redo log file.

How can I avoid these errors and let Oracle fail over whenever
the primary node crashes? Thanks.

BTW: protocol A is needed because the cluster runs over a WAN and uses a proxy.
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


xiaozunvlg at gmail

Aug 26, 2012, 10:24 AM

Post #2 of 12 (791 views)
Re: oracle on drbd failed

I think the replication mechanism of protocol A is like the SRL async
mode of Veritas VVR. Is there any difference between them? I have not
heard of any Oracle failure issues with VVR from our customers.

2012/8/26 Felix Egli <mail [at] felix-egli>:
> Hi
>
> The problem you have is what can be expected with protocol A. Oracle
> expects that data has been committed to disk while it is still only in
> the RAM of your node A. I really have no idea in which situation
> protocol A can be useful.
>
> Cheers, Felix


xiaozunvlg at gmail

Aug 28, 2012, 6:16 PM

Post #3 of 12 (789 views)
Re: oracle on drbd failed

Another error occurs in Oracle:

ORA-00214: control file '/oradata/orcl/control01.ctl' version 79111
inconsistent with file '/oradata/flash_recovery_area/orcl/control02.ctl'
version 79104

Is it certain that protocol A cannot preserve Oracle's data integrity?





ff at mpexnet

Aug 29, 2012, 12:37 AM

Post #4 of 12 (782 views)
Re: oracle on drbd failed

On 08/29/2012 03:16 AM, Mia Lueng wrote:
> Is it certain that protocol A cannot preserve Oracle's data integrity?

Protocol A cannot guarantee data integrity of *anything* in case of
connectivity issues.

Rule of thumb: When DRBD is backing your transactional database and you
care not to lose transactions - use protocol C.

HTH,
Felix (a different one)


ff at mpexnet

Aug 29, 2012, 6:00 AM

Post #5 of 12 (781 views)
Re: oracle on drbd failed

On 08/29/2012 02:51 PM, Mia Lueng wrote:
> I don't think so. Protocol A still keeps the write order on the
> secondary node the same as on the primary node. It is just like a system
> crash on the primary node followed by a resume when the node restarts.

Well, a primary crash is handled rather gracefully even with protocol A.

The trouble is this: The Oracle process on your primary basically has no
idea what has and has not been written to the peer's disk, because the
kernel gives feedback only concerning writes to the primary's disk.



ff at mpexnet

Aug 29, 2012, 7:36 AM

Post #6 of 12 (778 views)
Re: oracle on drbd failed

On 08/29/2012 04:16 PM, Mia Lueng wrote:
> As I see it, since the order in which data is written to disk is
> preserved, even if the later data blocks are missing, Oracle has its own
> way to recover that data.

Let's assume that you can get away with some lost transactions then.
Look at what you're doing:

On 08/26/2012 05:10 AM, Mia Lueng wrote:
> drbdadm -- --discard-my-data connect drbd0

I suppose you put that in there for a reason. Please note that by doing
this automatically, you sign yourself up for data loss.


ff at mpexnet

Aug 30, 2012, 12:38 AM

Post #7 of 12 (771 views)
Re: oracle on drbd failed

On 08/29/2012 05:12 PM, Mia Lueng wrote:
> I think you misunderstood me. The key actions in this test are
>
> drbdadm disconnect
> drbdadm primary
>
> which simulate a crash of the primary, to test whether
> Oracle can fail over to the secondary node.
>
> drbdadm --discard-my-data connect drbd0
>
> This action just brings the secondary's data back in sync with the
> primary's data for the next test.

...assuming the primary had not accumulated some minor corruptions
during an earlier loop iteration.

But you may be right, and I may be on the wrong track entirely. Maybe
someone else has a better idea of what may be going on.


ff at mpexnet

Aug 30, 2012, 12:53 AM

Post #8 of 12 (768 views)
Re: oracle on drbd failed

On 08/30/2012 09:38 AM, Felix Frank wrote:
> ...assuming the primary had not accumulated some minor corruptions
> during an earlier loop iteration.

Which reminds me: after failing over a protocol A resource, it's important
to perform a verify.

Oracle *will* clean up any mess on the new primary, but without a full
sync back, you cannot be entirely sure that the old primary does not
retain any old writes that hadn't made it to the new primary. The
activity log is supposed to protect you from this, but I doubt it
can keep you 100% safe.
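
A minimal sketch of the verify step mentioned above, assuming the resource
name drbd0 from the original post and a verify-alg set in the configuration.
`drbdadm verify`, `drbdadm cstate` and the `oos:` field in /proc/drbd are
regular DRBD 8.3 interfaces; the function names oos_kib and
verify_and_report are made up for illustration, and the polling interval is
arbitrary.

```shell
#!/bin/sh
# Extract the out-of-sync counter (KiB) from /proc/drbd-style text on stdin.
# Prints one number for every line that carries an "oos:" field.
oos_kib() {
    sed -n 's/.*oos:\([0-9][0-9]*\).*/\1/p'
}

# Start an online verify on resource $1 and wait until it has finished.
# Assumption: while verifying, the connection state is VerifyS or VerifyT.
verify_and_report() {
    res=$1
    drbdadm verify "$res" || return 1
    while drbdadm cstate "$res" | grep -q '^Verify'; do
        sleep 10
    done
    # A non-zero value after the verify means differing blocks were found.
    oos_kib < /proc/drbd
}
```

Run `verify_and_report drbd0` on one node while the database is stopped.
Blocks that a verify marks as out of sync are not resynchronized on their
own; a `drbdadm disconnect drbd0` followed by `drbdadm connect drbd0`
triggers that resync.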

Cheers,
Felix


xiaozunvlg at gmail

Sep 2, 2012, 1:45 PM

Post #9 of 12 (730 views)
Re: oracle on drbd failed

I used drbd_trace to trace DRBD write operations while Oracle was running;
it shows output like this:

block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> Barrier (barrier 435610040)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_receiver.c:5005:
drbd0_asender [11122] meta <<< BarrierAck (barrier 435610037)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_receiver.c:5005:
drbd0_asender [11122] meta <<< BarrierAck (barrier 435610038)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_receiver.c:5005:
drbd0_asender [11122] meta <<< BarrierAck (barrier 435610039)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_receiver.c:5005:
drbd0_asender [11122] meta <<< BarrierAck (barrier 435610040)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 1b5000s, offset=36a00000, id
ffff880080cadd68, seq 30957, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 1b5008s, offset=36a01000, id
ffff880080cad438, seq 30958, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 1b5010s, offset=36a02000, id
ffff880080cad4a8, seq 30959, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 1b5018s, offset=36a03000, id
ffff880080cadcf8, seq 30960, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> UnplugRemote (7)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> Barrier (barrier 435610041)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 0s, offset=0, id
ffff880080cad0b8, seq 30961, size=0, f 2a)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 0s, offset=0, id
ffff880080cad358, seq 30962, size=0, f 2a)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 0s, offset=0, id
ffff880080cad128, seq 30963, size=0, f 2a)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> Barrier (barrier 435610042)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 46f80s, offset=8df0000, id
ffff880080cad3c8, seq 30964, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 46f88s, offset=8df1000, id
ffff880080cade48, seq 30965, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 46f90s, offset=8df2000, id
ffff880080cad908, seq 30966, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:3062:
drbd0_worker [5323] data >>> Data (sector 46f98s, offset=8df3000, id
ffff880080cad898, seq 30967, size=1000, f 2)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> UnplugRemote (7)
block drbd0: /root/rpmbuild/BUILD/drbd-8.3.13/drbd/drbd_main.c:2152:
drbd0_worker [5323] data >>> Barrier (barrier 435610043)

It's obvious that the Oracle instance writes in units of
4*0x1000 = 4*4096 bytes. Is it possible that the failure occurs when the
secondary node cannot receive and write the full 4*4096-byte block because
of the network failure?

If so, how can this situation be handled?
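
The trace values are hexadecimal, so the sizes can be checked with shell
arithmetic. A small sketch, assuming 512-byte sectors (the helper names are
made up): the offset field is the sector number times 512, and size=1000 is
0x1000 = 4096 bytes. In the trace above, each 4096-byte write is sent as its
own Data packet with its own sequence number, so what appears on the wire is
four separate, contiguous 4 KiB requests between two barriers rather than a
single 16 KiB request.

```shell
#!/bin/sh
# Convert a hex sector number (as printed in the trace) to a hex byte offset,
# assuming 512-byte sectors.
sector_to_offset() {
    printf '%x\n' $(( 0x$1 * 512 ))
}

# Convert a hex size field to decimal bytes.
hex_to_bytes() {
    printf '%d\n' $(( 0x$1 ))
}

sector_to_offset 1b5000   # prints 36a00000, the offset of the first Data packet
hex_to_bytes 1000         # prints 4096, the size of one Data packet in bytes
```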



lars.ellenberg at linbit

Sep 3, 2012, 3:19 AM

Post #10 of 12 (721 views)
Re: oracle on drbd failed

On Sun, Aug 26, 2012 at 11:10:44AM +0800, Mia Lueng wrote:
> Hi All:
>
> I built a cluster to protect oracle database. The oracle db file
> stored on the drbd(8.3.13) device using protocol A. But sometime
> oracle can not be failover when the primary node is down. Here is the
> testing step


Please show the drbd configuration
(drbdadm dump, or even better, drbdsetup 0 show)
and cat /proc/drbd.

Also, what is the kernel version, distribution/platform?
What does the rest of the IO stack look like?

> How can I avoid there errors and let oracle be failover at any time
> the primary node crash? Thanks.
>
> BTW: protocol A is needed because the cluster running WAN and using a proxy.

Proxy, as in "drbd-proxy"?
Why not contact your LINBIT support channel, then?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


xiaozunvlg at gmail

Sep 3, 2012, 7:59 AM

Post #11 of 12 (726 views)
Re: oracle on drbd failed

resource drbd0 {
    protocol A;

    disk {
        on-io-error pass_on;
        no-disk-barrier;
        no-disk-flushes;
    }

    syncer {
        rate 100M;
        csums-alg md5;
        verify-alg md5;
        # c-plan-ahead 20;
        c-fill-target 0;
        # c-delay-target 30;
        # c-max-rate 200M;
        # c-min-rate 4M;
    }

    net {
        # on-congestion pull-ahead;
        # congestion-fill 128M;
        ping-timeout 30;
        ping-int 30;
        data-integrity-alg crc32c;
    }

    on "kvm3.hgccp" {
        device /dev/drbd0;
        disk /dev/vg_kvm3/drbd0;
        address 192.168.10.6:7700;
        meta-disk internal;
    }

    on "kvm4.hgccp" {
        device /dev/drbd0;
        disk /dev/vg_kvm4/drbd0;
        address 192.168.10.7:7700;
        meta-disk internal;
    }
}

OS: RHEL 6.3 x86_64, kernel version 2.6.32-220.el6.x86_64

We are not using the proxy yet. I am just testing this in a local LAN
environment. If the test passes, we will install it in the WAN environment.



lars.ellenberg at linbit

Sep 4, 2012, 1:48 AM

Post #12 of 12 (719 views)
Re: oracle on drbd failed

On Mon, Sep 03, 2012 at 10:59:12PM +0800, Mia Lueng wrote:
> resource drbd0{
> protocol A;

Does it reproduce with different protocols as well?

> disk
> {
> on-io-error pass_on;

Certainly not.
on-io-error detach;
please.

> no-disk-barrier;
> no-disk-flushes;
> }

Did you verify that the config file (drbdadm dump),
and the kernel config (drbdsetup show) match?
On both nodes?

You confirm using 8.3.13 on both nodes?

> syncer
> {
> rate 100M;
> csums-alg md5;

You in theory could have md5 collisions.
Does it reproduce without csums-alg?

> verify-alg md5;

Use a verify-alg != csums-alg, (sha1 maybe)
change your testing method to stop services,
and do a verify after resync, while idle.
Then low-level compare
(dd iflag=direct bs=.. skip=... count=... ... | xxd)
the blocks that differ,
so we get an idea of what is actually different.
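
A sketch of the low-level compare step. The assumptions here are not from
the mail: the verify reports each differing range in the kernel log as a
start sector and a length in 512-byte sectors, the devices read are the LVM
backing volumes from the posted configuration (with internal metadata the
data area starts at offset 0), and the helper names are made up. The
concrete skip and count values depend on what the verify reports.

```shell
#!/bin/sh
# Build dd arguments for reading SIZE sectors starting at sector START,
# assuming 512-byte sectors.
dd_range_args() {
    start=$1
    size=$2
    printf 'iflag=direct bs=512 skip=%s count=%s\n' "$start" "$size"
}

# Hex-dump one sector range of a block device, bypassing the page cache.
dump_range() {
    dev=$1
    # Word splitting of the generated argument string is intended here.
    dd if="$dev" $(dd_range_args "$2" "$3") 2>/dev/null | xxd
}
```

With services stopped, run `dump_range /dev/vg_kvm3/drbd0 START SIZE` on one
node and the same with /dev/vg_kvm4/drbd0 on the other, save both outputs to
files, and diff them. START and SIZE stand for the values reported by the
verify.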

> # c-plan-ahead 20;
> c-fill-target 0;
> # c-delay-target 30;
> # c-max-rate 200M;
> # c-min-rate 4M;
> }
>
> net
> {
> # on-congestion pull-ahead;
> # congestion-fill 128M;

Sorry for the "Did you try turning it off and on again" question, but,
did you at any point force-primary something?
If so, the data was simply not consistent,
and thus higher level data consistency errors would be expected.
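
One way to rule this out in a test loop is to check the local disk state
before promoting and never force the promotion. A minimal sketch, assuming
the resource name drbd0; `drbdadm dstate` prints the disk states as
local/peer, and the function names are made up for illustration.

```shell
#!/bin/sh
# Succeed only if the local half of a "local/peer" disk-state string
# (as printed by `drbdadm dstate`) is UpToDate.
local_disk_uptodate() {
    case ${1%%/*} in
        UpToDate) return 0 ;;
        *)        return 1 ;;
    esac
}

# Promote resource $1 only when its local data is UpToDate; never force.
safe_promote() {
    res=$1
    if local_disk_uptodate "$(drbdadm dstate "$res")"; then
        drbdadm primary "$res"
    else
        echo "refusing to promote $res: local disk is not UpToDate" >&2
        return 1
    fi
}
```

In the loop from the first post, `safe_promote drbd0` would replace the bare
`drbdadm primary drbd0`, so that an iteration started before the previous
resync finished stops instead of bringing up Oracle on inconsistent data.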

> ping-timeout 30;
> ping-int 30;
> data-integrity-alg crc32c;
> }
>
> on "kvm3.hgccp" {
> device /dev/drbd0;
> disk /dev/vg_kvm3/drbd0;
> address 192.168.10.6:7700;
> meta-disk internal;
>
> }
>
> on "kvm4.hgccp" {
> device /dev/drbd0;
> disk /dev/vg_kvm4/drbd0;
> address 192.168.10.7:7700;
> meta-disk internal;
>
> }
> }
>
> os: rhel 6.3 x86_64, kernel version is 2.6.32-220.el6.x86_64
>
> now we have not used proxy yet. I just test this on local lan
> environment. If the test pass, we will install it on WAN enviroment.

> 2012/9/3 Lars Ellenberg <lars.ellenberg [at] linbit>:
> > On Sun, Aug 26, 2012 at 11:10:44AM +0800, Mia Lueng wrote:
> >> Hi All:
> >>
> >> I built a cluster to protect oracle database. The oracle db file
> >> stored on the drbd(8.3.13) device using protocol A. But sometime
> >> oracle can not be failover when the primary node is down. Here is the
> >> testing step


Once you get the failure, is that persistent,
or does it work again after the next resync?

Does drbd online verify while idle find differing blocks?
See above, if so, low-level compare them.

> > Please show the drbd configuration
> > (drbdadm dump, or even better, drbdsetup 0 show)
> > and cat /proc/drbd.
> >
> > Also, what is the kernel version, distribution/platform?
> > What does the rest of the IO stack look like?
> >
> >> How can I avoid there errors and let oracle be failover at any time
> >> the primary node crash? Thanks.

"Works here".

> >> BTW: protocol A is needed because the cluster running WAN and using a proxy.
> >
> > Proxy, as in "drbd-proxy"?
> > Why not contact your LINBIT support channel, then?

^^ That would still be an option...

;)

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed