Mailing List Archive: Linux-HA: Dev

[PATCH] IPv6addr: Taking precautions against SplitBrain

inouekazu at intellilink

Aug 3, 2009, 5:55 PM

Post #1 of 5
[PATCH] IPv6addr: Taking precautions against SplitBrain

Hi lists,

I'm posting another patch for IPv6addr (changeset:cf020d609b57).
It does the following.

In the start operation, check whether the VIP is already reachable
(that is, already assigned somewhere) _before_ assigning it, and exit
with an error if it is.
This is a precaution against SplitBrain.
With the current behavior, when SplitBrain occurs, the same VIP is
assigned on two or more nodes at the same time, if only for a
fraction of a second.
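
As a rough illustration of the ordering described above (this is not the code from the attached patch), the start operation could be sketched as follows. The names start_with_precheck, is_reachable and assign are hypothetical stand-ins; only the return codes OCF_SUCCESS (0) and OCF_ERR_GENERIC (1) are standard OCF values.

```python
# Minimal sketch of "check before assign" in a start operation.
# All function names are hypothetical; the probe and the assignment
# are passed in as callables so the ordering is the only logic here.

OCF_SUCCESS = 0       # standard OCF return code
OCF_ERR_GENERIC = 1   # standard OCF return code


def start_with_precheck(addr, is_reachable, assign):
    """Assign addr only if nothing already answers for it."""
    if is_reachable(addr):
        # Someone already holds the address: fail without assigning.
        return OCF_ERR_GENERIC
    assign(addr)
    return OCF_SUCCESS
```

With a probe that reports the address as reachable, the function returns OCF_ERR_GENERIC and assign is never called.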


Any comments and suggestions are really appreciated.


Best Regards,
Kazunori INOUE
Attachments: IPv6addr_check_assgined_VIP.patch (15.4 KB)


inouekazu at intellilink

Aug 11, 2009, 3:09 AM

Post #2 of 5
Re: [PATCH] IPv6addr: Taking precautions against SplitBrain

Hi,

We think the network connection on the active (ACT) node may be
interrupted when the same VIP is assigned on two or more nodes at
the same time.

We would like to hear your opinions.

Best Regards,
Kazunori INOUE


_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


dejanmm at fastmail

Aug 12, 2009, 7:03 AM

Post #3 of 5
Re: [PATCH] IPv6addr: Taking precautions against SplitBrain

Hi Kazunori-san,

On Tue, Aug 04, 2009 at 09:55:04AM +0900, Kazunori INOUE wrote:
> Hi lists,
>
> I'm posting another patch for IPv6addr (changeset:cf020d609b57).
> Its role is the following.
>
> Check whether the VIP is available _before_ assigning it in start
> operation, and if the address is already available, exit with error.

How does this fare with starting an already started resource?

> This behavior is to take precautions against SplitBrain.
> With the former behavior, when SplitBrain occurs,
> though it's a fraction of a second,
> same VIPs are assigned on two or more nodes at the same time.

That shouldn't be happening in configurations with proper
fencing setup. Or am I missing something specific to IPv6addr?

Split brain problems should be taken care of on a different
level, not by resource agents.

Thanks,

Dejan



inouekazu at intellilink

Aug 14, 2009, 3:17 AM

Post #4 of 5
Re: [PATCH] IPv6addr: Taking precautions against SplitBrain

Hi Dejan,

Thank you for your reply!

Dejan Muhamedagic wrote:
> Hi Kazunori-san,
>
> On Tue, Aug 04, 2009 at 09:55:04AM +0900, Kazunori INOUE wrote:
>> Hi lists,
>>
>> I'm posting another patch for IPv6addr (changeset:cf020d609b57).
>> Its role is the following.
>>
>> Check whether the VIP is available _before_ assigning it in start
>> operation, and if the address is already available, exit with error.
>
> How does this fare with starting an already started resource?

What do you mean by "an already started resource" here?
Do you mean an IPv6addr resource that is already started?
In that case, before IPv6addr assigns the address, it sends an ICMP
ECHO_REQUEST to the address it is about to assign, and only
checks the result.
I think this has no effect on a resource that has already
started. (Here, "resource" means the address that has been assigned by
IPv6addr.)
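
To make the probe step concrete, here is a minimal sketch of the message format involved. It only builds and recognizes the ICMPv6 echo headers (type 128 for Echo Request and 129 for Echo Reply, per RFC 4443); it does not open a socket and is not the code IPv6addr uses. The function names are hypothetical, and the checksum is left at zero on the assumption that the sending socket layer fills it in.

```python
import struct

ICMP6_ECHO_REQUEST = 128  # RFC 4443
ICMP6_ECHO_REPLY = 129    # RFC 4443


def build_echo_request(ident, seq, payload=b""):
    """Return an ICMPv6 Echo Request: type, code, checksum, id, sequence."""
    # Checksum is left as 0 here (assumption: filled in when sent).
    header = struct.pack("!BBHHH", ICMP6_ECHO_REQUEST, 0, 0, ident, seq)
    return header + payload


def is_matching_reply(packet, ident, seq):
    """True if packet is an Echo Reply carrying the same id and sequence."""
    if len(packet) < 8:
        return False
    msg_type, code, _cksum, r_ident, r_seq = struct.unpack("!BBHHH", packet[:8])
    return (msg_type == ICMP6_ECHO_REPLY and code == 0
            and r_ident == ident and r_seq == seq)
```

A start operation with the pre-check would treat a matching reply as "address already in use".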

Let me summarize the behavior of the attached patch.

1) before SplitBrain occurs.
   NodeA                          NodeB
   =============================  ==================================
   IPv6addr: Started              IPv6addr: -
   * 2001:ffff::1:1 is assigned

2) when SplitBrain occurs.
   NodeA                          NodeB
   =============================  ==================================
   IPv6addr: Started              IPv6addr: starts 'start operation'
   * 2001:ffff::1:1 is assigned   * in start operation,
                                    before assigning address, send
                                    ECHO_REQUEST to 2001:ffff::1:1.
                                    in this case, since it receives
                                    ECHO_RESPONSE,
                                    address is NOT assigned.

3) after SplitBrain occurs.
   NodeA                          NodeB
   =============================  ==================================
   IPv6addr: Started              IPv6addr: 'start operation' failed
   * 2001:ffff::1:1 is assigned

The same address is never assigned on two nodes at the same time.
Did I misunderstand your question?

>
>> This behavior is to take precautions against SplitBrain.
>> With the former behavior, when SplitBrain occurs,
>> though it's a fraction of a second,
>> same VIPs are assigned on two or more nodes at the same time.
>
> That shouldn't be happening in configurations with proper
> fencing setup. Or am I missing something specific to IPv6addr?
>
> Split brain problems should be taken care of on a different
> level, not by resource agents.
We think so, too.
However, the behavior of the _present_ IPv6addr and Heartbeat is:
  i) In the start operation of IPv6addr, the address is _assigned_
     with assign_addr6().
  ii) Then the result is checked with is_addr6_available().
      When the same address has already been assigned on another
      node, the check fails.
      -> IPv6addr returns OCF_ERR_GENERIC.
  iii) As a result, Heartbeat performs the stop operation of IPv6addr,
       and the address is _unassigned_.

When it can be determined beforehand that the same address is already
assigned, I think the assignment is needless.
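
The difference between the two orderings can be sketched with a toy model. This is only a simulation under stated assumptions: the Segment class, present_start, patched_start, stop and run_split_brain are hypothetical names, and held_elsewhere stands in for the outcome the post attributes to is_addr6_available() when another node already holds the address.

```python
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1


class Segment:
    """Toy network segment that records which nodes hold each address."""

    def __init__(self):
        self.holders = {}  # address -> set of node names
        self.peak = 0      # most simultaneous holders of one address seen

    def assign(self, node, addr):
        nodes = self.holders.setdefault(addr, set())
        nodes.add(node)
        self.peak = max(self.peak, len(nodes))

    def unassign(self, node, addr):
        self.holders.get(addr, set()).discard(node)

    def held_elsewhere(self, node, addr):
        return bool(self.holders.get(addr, set()) - {node})


def present_start(seg, node, addr):
    seg.assign(node, addr)                # i) assign first
    if seg.held_elsewhere(node, addr):    # ii) then check
        return OCF_ERR_GENERIC
    return OCF_SUCCESS


def patched_start(seg, node, addr):
    if seg.held_elsewhere(node, addr):    # check first
        return OCF_ERR_GENERIC
    seg.assign(node, addr)
    return OCF_SUCCESS


def stop(seg, node, addr):
    seg.unassign(node, addr)
    return OCF_SUCCESS


def run_split_brain(start):
    """NodeA holds the VIP; NodeB then runs start, and stop if start fails."""
    seg, vip = Segment(), "2001:ffff::1:1"
    seg.assign("NodeA", vip)
    rc = start(seg, "NodeB", vip)
    if rc != OCF_SUCCESS:
        stop(seg, "NodeB", vip)           # iii) stop after a failed start
    return rc, seg.peak, sorted(seg.holders[vip])
```

In this model both orderings end with only NodeA holding the address and a failed start on NodeB; the only difference is that the present ordering passes through a moment with two holders.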

Regards,
Kazunori INOUE



dejanmm at fastmail

Aug 14, 2009, 7:56 AM

Post #5 of 5
Re: [PATCH] IPv6addr: Taking precautions against SplitBrain

Hi,

On Fri, Aug 14, 2009 at 07:17:42PM +0900, Kazunori INOUE wrote:
> Hi Dejan,
>
> Thank you for your reply!
>
> Dejan Muhamedagic wrote:
> > Hi Kazunori-san,
> >
> > On Tue, Aug 04, 2009 at 09:55:04AM +0900, Kazunori INOUE wrote:
> >> Hi lists,
> >>
> >> I'm posting another patch for IPv6addr (changeset:cf020d609b57).
> >> Its role is the following.
> >>
> >> Check whether the VIP is available _before_ assigning it in start
> >> operation, and if the address is already available, exit with error.
> >
> > How does this fare with starting an already started resource?
>
> What do you mean by "an already started resource" here?
> Is it started IPv6addr?

Yes.

> In that case, before IPv6addr assigns the address, it sends a ICMP
> ECHO_REQUEST to the address which it is going to assign, and only
> checks a result.

Wouldn't that return with success?

> I think that there is no influence in the resource that has already
> started. (resource means the address which has been assigned with
> IPv6addr.)
>
> I'd like to explain the summary behavior of attached patch.
>
> 1) before SplitBrain occurs.
>    NodeA                          NodeB
>    =============================  ==================================
>    IPv6addr: Started              IPv6addr: -
>    * 2001:ffff::1:1 is assigned
>
> 2) when SplitBrain occurs.
>    NodeA                          NodeB
>    =============================  ==================================
>    IPv6addr: Started              IPv6addr: starts 'start operation'
>    * 2001:ffff::1:1 is assigned   * in start operation,
>                                     before assigning address, send
>                                     ECHO_REQUEST to 2001:ffff::1:1.
>                                     in this case, since it receives
>                                     ECHO_RESPONSE,
>                                     address is NOT assigned.
>
> 3) after SplitBrain occurs.
>    NodeA                          NodeB
>    =============================  ==================================
>    IPv6addr: Started              IPv6addr: 'start operation' failed
>    * 2001:ffff::1:1 is assigned
>
> The same address is never assigned on two nodes at the same time.
> Did I misunderstand your question?

No.

> >> This behavior is to take precautions against SplitBrain.
> >> With the former behavior, when SplitBrain occurs,
> >> though it's a fraction of a second,
> >> same VIPs are assigned on two or more nodes at the same time.
> >
> > That shouldn't be happening in configurations with proper
> > fencing setup. Or am I missing something specific to IPv6addr?
> >
> > Split brain problems should be taken care of on a different
> > level, not by resource agents.
> We think so, too.
> but... behavior of the _present_ IPv6addr and Heartbeat is
> i) In start operation of IPv6addr, address is _assigned_
> with assign_addr6().
> ii) And then, the result is checked with is_addr6_available().
> When the same address has already been assigned in other
> nodes, the result becomes failure.
> -> IPv6addr returns OCF_ERR_GENERIC.
> iii) Thereby, Heartbeat performs stop operation of IPv6addr,
> and address is _unassigned_.
>
> When it turns out beforehand to assign same address,
> I think that it is a needless assignment.

Still, a resource agent doesn't have enough information to deal
with split brain. Though this RA doesn't support clones yet, it
might in the future, and this patch would break them. None of the OCF
resource agents takes care of split brain. I don't see why
we should make an exception here. Sorry.
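
The two concerns raised in this thread (starting an already started resource, and clones) can be illustrated with a small model. It is a sketch under assumptions, not IPv6addr code: the Node class and its methods are hypothetical, and answers_ping assumes that an echo request is answered whenever any node on the segment, including the local one, holds the address.

```python
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1


class Node:
    """Toy node; 'segment' is a shared set of (node name, address) pairs."""

    def __init__(self, name, segment):
        self.name = name
        self.segment = segment

    def answers_ping(self, addr):
        # Assumption: the probe cannot tell a local holder from a remote one.
        return any(a == addr for _, a in self.segment)

    def start_with_precheck(self, addr):
        if self.answers_ping(addr):
            return OCF_ERR_GENERIC
        self.segment.add((self.name, addr))
        return OCF_SUCCESS


segment = set()
vip = "2001:ffff::1:1"
node_a = Node("NodeA", segment)
node_b = Node("NodeB", segment)

first_start = node_a.start_with_precheck(vip)    # address is free
repeat_start = node_a.start_with_precheck(vip)   # already started on NodeA
clone_start = node_b.start_with_precheck(vip)    # second instance, as a clone would be
```

Under these assumptions the first start succeeds, while both the repeated start on the same node and the start of a second instance on another node return OCF_ERR_GENERIC, because the pre-check cannot distinguish an intended holder from a split-brain peer.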

Thanks,

Dejan

