Mailing List Archive: DRBD: Users

DRBD MC - unstable cluster with latest corosync and pacemaker

 

 



JSammons at cds-global

Jul 23, 2010, 1:44 PM

Post #1 of 6 (1655 views)
DRBD MC - unstable cluster with latest corosync and pacemaker

Environment:

CentOS 5.5
DRBD 8.3.8
Corosync 1.2.5
Pacemaker 1.0.9

Whenever I use DRBD MC to manage the configuration above, I run into stability problems with the cluster. Here is some of the behavior I have observed:

1. Corosync crashes across cluster members. When I try to restart corosync, I receive a Bus Error and cannot restart it until I reboot the system.
2. Frequent disconnects and reconnects across the cluster members.
3. Unresponsive cluster. Whenever I try to manage the cluster in DRBD MC, commands are ignored.

When I manage the cluster from the command line without using the MC, everything works fine. Also, if I use the following stack versions, everything seems to be fine:

CentOS 5.5
DRBD 8.3.8
Corosync 1.2.0
Pacemaker 1.0.8

It seems to be an issue with DRBD MC and the latest Pacemaker and Corosync versions.

Thank you,
Jamie Sammons



---------------------------------------------------------

This e-mail message is intended only for the personal use of the recipient(s)
named above. If you are not an intended recipient, you may not review, copy or
distribute this message. If you have received this communication in error,
please notify the CDS Global Help Desk (cdshelpdesk [at] cds-global) immediately
by e-mail and delete the original message.

---------------------------------------------------------


rasto.levrinc at linbit

Jul 24, 2010, 5:09 AM

Post #2 of 6 (1578 views)
Re: DRBD MC - unstable cluster with latest corosync and pacemaker

On Fri, July 23, 2010 10:44 pm, Jamie L Sammons wrote:
> Environment:
>
>
> CentOS 5.5
> DRBD 8.3.8
> Corosync 1.2.5
> Pacemaker 1.0.9
>
>
> Whenever I use DRBD MC to manage the configuration above, I run into
> stability problems with the cluster. Here is some of the behavior I have
> observed:
>
>
> 1. Corosync crash across cluster members. When I try to restart corosync
> I receive a Bus Error and cannot restart it until I reboot the system.
> 2. Frequent disconnects and reconnects across the cluster members.
> 3. Unresponsive cluster. Whenever I try to manage the cluster in DRBD MC
> commands are ignored.

Yes, the problem is that 'corosync -v' leaks shared memory in Corosync
1.2.5, and DRBD MC happens to run this command regularly to find out
the version. A reboot clears the shared memory.
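
If you want to see how full the shared memory filesystem is before rebooting, something like the following may help. It is only a rough sketch: it assumes the shared memory filesystem is mounted at /dev/shm (the usual place on Linux, but not stated in this thread), and the function names are made up for illustration.

```python
import os


def list_files_by_size(directory="/dev/shm"):
    """Return (name, size_in_bytes) for regular files in directory, largest first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return sorted(entries, key=lambda item: item[1], reverse=True)


def used_fraction(path="/dev/shm"):
    """Return the fraction of the filesystem holding path that is in use (0.0 to 1.0)."""
    stats = os.statvfs(path)
    if stats.f_blocks == 0:
        return 0.0
    return 1.0 - stats.f_bfree / stats.f_blocks


if __name__ == "__main__":
    # /dev/shm is an assumed mount point; pass another path if yours differs.
    print("used: {:.0%}".format(used_fraction()))
    for name, size in list_files_by_size()[:10]:
        print("{:>12} {}".format(size, name))
```

A steadily growing list of files there while DRBD MC is connected would be consistent with the leak described above.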

For now you can remove this line from drbd-gui-helper:
($cs_version) = `$cs_prog -v` =~ /'(\d+\.\d+\.\d+)'/;

and run DMC with the --keep-helper option, or use a different Corosync version.
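
For reference, here is roughly what that helper line does, sketched in Python rather than Perl: it takes the text printed by 'corosync -v' and pulls out the first x.y.z version that appears inside single quotes. The sample string is an assumption chosen to fit the regex (this thread does not show the real output), and parse_corosync_version is a hypothetical name, not part of drbd-gui-helper.

```python
import re

# Same pattern as the Perl line: three dot-separated numbers inside single quotes.
VERSION_RE = re.compile(r"'(\d+\.\d+\.\d+)'")


def parse_corosync_version(output):
    """Return the first quoted x.y.z version found in output, or None."""
    match = VERSION_RE.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed sample text; the real 'corosync -v' output is not shown in this thread.
    sample = "Corosync Cluster Engine, version '1.2.5'"
    print(parse_corosync_version(sample))
```

The sketch deliberately parses a string instead of running the command, since on 1.2.5 every call to 'corosync -v' leaks.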

Rasto

--
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.


_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


lars.ellenberg at linbit

Jul 24, 2010, 5:52 AM

Post #3 of 6 (1575 views)
Re: DRBD MC - unstable cluster with latest corosync and pacemaker

On Sat, Jul 24, 2010 at 02:09:47PM +0200, Rasto Levrinc wrote:
>
> On Fri, July 23, 2010 10:44 pm, Jamie L Sammons wrote:
> > Environment:
> >
> >
> > CentOS 5.5
> > DRBD 8.3.8
> > Corosync 1.2.5
> > Pacemaker 1.0.9
> >
> >
> > Whenever I use DRBD MC to manage the configuration above, I run into
> > stability problems with the cluster. Here is some of the behavior I have
> > observed:
> >
> >
> > 1. Corosync crash across cluster members. When I try to restart corosync
> > I receive a Bus Error and cannot restart it until I reboot the system.
> > 2. Frequent disconnects and reconnects across the cluster members.
> > 3. Unresponsive cluster. Whenever I try to manage the cluster in DRBD MC
> > commands are ignored.
>
> Yes, the problem is that 'corosync -v' leaks shared memory in Corosync
> 1.2.5 and it happens that DRBD MC runs this command regularly to find out
> the version. Reboot clears the shared memory.
>
> For now you can remove this line from drbd-gui-helper:
> ($cs_version) = `$cs_prog -v` =~ /'(\d+\.\d+\.\d+)'/;
>
> and run DMC with --keep-helper option, or use different Corosync version.

I'd recommend using Heartbeat as the cluster communication layer.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed


JSammons at cds-global

Jul 24, 2010, 7:57 AM

Post #4 of 6 (1585 views)
Re: DRBD MC - unstable cluster with latest corosync and pacemaker

I was considering using Heartbeat as well since this issue is isolated to Corosync.

It also appears that Corosync 1.2.6 is out, which may address the issue:

* Fix problem where flight recorder leaks files in shared memory filesystem. Also clean up the error handling of the shared memory allocation code of the flight recorder.

Thank you,
Jamie Sammons



rasto.levrinc at linbit

Jul 24, 2010, 4:26 PM

Post #5 of 6 (1566 views)
Re: DRBD MC - unstable cluster with latest corosync and pacemaker

On Sat, July 24, 2010 4:57 pm, Jamie L Sammons wrote:

> It also appears that Corosync 1.2.6 is out which may also address the
> issue:
>
>
> * Fix problem where flight recorder leaks files in shared memory
> filesystem. Also clean up the error handling of the shared memory
> allocation code of the flight recorder.

Yeah, I saw that, but I wasn't sure whether the flight recorder is needed to
print out the version. Now I know. :) Anyway, it has been confirmed that the
leak is fixed in 1.2.6.
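
To sum up what this thread says about versions, here is a small sketch that classifies a Corosync version string. It uses only what was reported here: 1.2.5 leaks, 1.2.6 is confirmed fixed, and 1.2.0 seemed fine. Treating releases after 1.2.6 as fixed is an assumption, and leak_status is a hypothetical name.

```python
def parse_version(text):
    """Turn 'x.y.z' into a tuple of ints, e.g. '1.2.6' -> (1, 2, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def leak_status(version_text):
    """Classify a Corosync version using only what this thread reports."""
    version = parse_version(version_text)
    if version == (1, 2, 5):
        return "affected"       # 'corosync -v' leaks shared memory
    if version >= (1, 2, 6):
        return "fixed"          # confirmed for 1.2.6; assumed for later releases
    if version == (1, 2, 0):
        return "reported fine"  # worked in the original report
    return "unknown"            # not discussed in this thread


if __name__ == "__main__":
    for candidate in ("1.2.0", "1.2.5", "1.2.6"):
        print(candidate, leak_status(candidate))
```

Versions are compared as integer tuples, so 1.2.10 sorts after 1.2.6, which a plain string comparison would get wrong.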

I've just released DRBD MC 0.7.7, so that it works with this Corosync
version.

Thanks for helping,

Rasto

--
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.




JSammons at cds-global

Jul 24, 2010, 5:43 PM

Post #6 of 6 (1568 views)
Re: DRBD MC - unstable cluster with latest corosync and pacemaker

I also posted a message on the clusterlabs mailing list to see if they had an ETA on when 1.2.6 might be packaged.

Glad to be of help.

Thank you,
Jamie Sammons

