Mailing List Archive: DRBD: Users

Problem to use DRBD with LVM controlled by Pacemaker


kami911 at gmail

Jun 29, 2012, 8:09 AM

Post #1 of 7
Problem to use DRBD with LVM controlled by Pacemaker

Hi Folks,

I am building KVM with DRBD storage based on Linbit's documentation. I
created the DRBD resource and it seems to work well.

However, after I added everything to a cluster (Pacemaker/Corosync), I
got an error during start:

crm_mon:
Online: [ ss1 ss2 ]

Master/Slave Set: ms_drbd_iscsivg01 [res_drbd_iscsivg01]
Masters: [ ss2 ]
Slaves: [ ss1 ]

Failed actions:
res_lvm_iscsivg01_start_0 (node=ss1, call=29, rc=7,
status=complete): not running
res_lvm_iscsivg01_start_0 (node=ss2, call=34, rc=7,
status=complete): not running

grep -i lvm /var/log/syslog

Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
group "storage" now active
Jun 29 12:34:53 ss2 LVM[2386]: ERROR: LVM: storage did not activate correctly
Jun 29 12:34:53 ss2 lrmd: [1092]: info: operation start[20] on
res_lvm_iscsivg01 for client 1095: pid 2386 exited with return code 7
Jun 29 12:34:53 ss2 crmd: [1095]: info: process_lrm_event: LRM
operation res_lvm_iscsivg01_start_0 (call=20, rc=7, cib-update=26,
confirmed=true) not running
Jun 29 12:34:53 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-res_lvm_iscsivg01
(INFINITY)
Jun 29 12:34:53 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
update 43: fail-count-res_lvm_iscsivg01=INFINITY
Jun 29 12:34:53 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-res_lvm_iscsivg01
(1340966093)
Jun 29 12:34:53 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
update 45: last-failure-res_lvm_iscsivg01=1340966093
Jun 29 12:34:53 ss2 crmd: [1095]: info: do_lrm_rsc_op: Performing
key=2:19:0:cde9e8d8-b54b-4285-9f51-c2d49620b715
op=res_lvm_iscsivg01_stop_0 )
Jun 29 12:34:53 ss2 lrmd: [1092]: info: rsc:res_lvm_iscsivg01 stop[21]
(pid 2419)
Jun 29 12:34:53 ss2 LVM[2419]: INFO: Deactivating volume group storage
Jun 29 12:34:53 ss2 LVM[2419]: INFO: 0 logical volume(s) in volume
group "storage" now active
Jun 29 12:34:53 ss2 lrmd: [1092]: info: operation stop[21] on
res_lvm_iscsivg01 for client 1095: pid 2419 exited with return code 0
Jun 29 12:34:53 ss2 crmd: [1095]: info: process_lrm_event: LRM
operation res_lvm_iscsivg01_stop_0 (call=21, rc=0, cib-update=27,
confirmed=true) ok
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-res_lvm_iscsivg01
(<null>)
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
delete 51: node=ss2, attr=fail-count-res_lvm_iscsivg01, id=<n/a>,
set=(null), section=status
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
delete 53: node=ss2, attr=fail-count-res_lvm_iscsivg01, id=<n/a>,
set=(null), section=status
Jun 29 12:37:33 ss2 crmd: [1095]: info: delete_resource: Removing
resource res_lvm_iscsivg01 for 3062_crm_resource (internal) on ss2
Jun 29 12:37:33 ss2 crmd: [1095]: info: notify_deleted: Notifying
3062_crm_resource on ss2 that res_lvm_iscsivg01 was deleted
Jun 29 12:37:33 ss2 crmd: [1095]: info: send_direct_ack: ACK'ing
resource op res_lvm_iscsivg01_delete_60000 from 0:0:crm-resource-3062:
lrm_invoke-lrmd-1340966253-22
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-res_lvm_iscsivg01
(1340966093)
Jun 29 12:37:33 ss2 crmd: [1095]: info: do_lrm_rsc_op: Performing
key=7:25:7:cde9e8d8-b54b-4285-9f51-c2d49620b715
op=res_lvm_iscsivg01_monitor_0 )
Jun 29 12:37:33 ss2 lrmd: [1092]: info: rsc:res_lvm_iscsivg01
probe[28] (pid 3064)
Jun 29 12:37:33 ss2 LVM[3064]: INFO: LVM Volume storage is offline
Jun 29 12:37:33 ss2 lrmd: [1092]: info: operation monitor[28] on
res_lvm_iscsivg01 for client 1095: pid 3064 exited with return code 7
Jun 29 12:37:33 ss2 crmd: [1095]: info: process_lrm_event: LRM
operation res_lvm_iscsivg01_monitor_0 (call=28, rc=7, cib-update=35,
confirmed=true) not running
Jun 29 12:37:33 ss2 crmd: [1095]: info: do_lrm_rsc_op: Performing
key=39:28:0:cde9e8d8-b54b-4285-9f51-c2d49620b715
op=res_lvm_iscsivg01_start_0 )
Jun 29 12:37:33 ss2 lrmd: [1092]: info: rsc:res_lvm_iscsivg01
start[34] (pid 3212)
Jun 29 12:37:33 ss2 LVM[3212]: INFO: Activating volume group storage
Jun 29 12:37:33 ss2 LVM[3212]: INFO: Reading all physical volumes.
This may take a while... Found volume group "storage" using metadata
type lvm2 Found volume group "main" using metadata type lvm2
Jun 29 12:37:33 ss2 LVM[3212]: INFO: 0 logical volume(s) in volume
group "storage" now active
Jun 29 12:37:33 ss2 LVM[3212]: ERROR: LVM: storage did not activate correctly
Jun 29 12:37:33 ss2 lrmd: [1092]: info: operation start[34] on
res_lvm_iscsivg01 for client 1095: pid 3212 exited with return code 7
Jun 29 12:37:33 ss2 crmd: [1095]: info: process_lrm_event: LRM
operation res_lvm_iscsivg01_start_0 (call=34, rc=7, cib-update=38,
confirmed=true) not running
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-res_lvm_iscsivg01
(INFINITY)
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
update 66: fail-count-res_lvm_iscsivg01=INFINITY
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-res_lvm_iscsivg01
(1340966253)
Jun 29 12:37:33 ss2 attrd: [1093]: notice: attrd_perform_update: Sent
update 68: last-failure-res_lvm_iscsivg01=1340966253
Jun 29 12:37:34 ss2 crmd: [1095]: info: do_lrm_rsc_op: Performing
key=2:30:0:cde9e8d8-b54b-4285-9f51-c2d49620b715
op=res_lvm_iscsivg01_stop_0 )
Jun 29 12:37:34 ss2 lrmd: [1092]: info: rsc:res_lvm_iscsivg01 stop[35]
(pid 3252)
Jun 29 12:37:34 ss2 LVM[3252]: INFO: Deactivating volume group storage
Jun 29 12:37:34 ss2 LVM[3252]: INFO: 0 logical volume(s) in volume
group "storage" now active
Jun 29 12:37:34 ss2 lrmd: [1092]: info: operation stop[35] on
res_lvm_iscsivg01 for client 1095: pid 3252 exited with return code 0
Jun 29 12:37:34 ss2 crmd: [1095]: info: process_lrm_event: LRM
operation res_lvm_iscsivg01_stop_0 (call=35, rc=0, cib-update=39,
confirmed=true) ok

DRBD status:
1:iscsivg01 Connected Primary/Secondary UpToDate/UpToDate C r-----

crm/configure:
primitive res_drbd_iscsivg01 ocf:linbit:drbd \
params drbd_resource="iscsivg01" \
op monitor interval="10"
primitive res_ip_iscsivg01 ocf:heartbeat:IPaddr2 \
params ip="10.10.10.251" cidr_netmask="24" \
op monitor interval="10s"
primitive res_lu_iscsivg01_lun1 ocf:heartbeat:iSCSILogicalUnit \
params target_iqn="iqn.2012-06.com.synaptel:storage.synaptel.iscsivg01" lun="1" path="/dev/iscsivg01/lun1" \
op monitor interval="10s"
primitive res_lvm_iscsivg01 ocf:heartbeat:LVM \
params volgrpname="storage" \
op monitor interval="30s" \
meta target-role="Started" is-managed="true"
primitive res_target_iscsivg01 ocf:heartbeat:iSCSITarget \
params iqn="iqn.2012-06.com.synaptel:storage.synaptel.iscsivg01" allowed_initiators="10.10.10.25 10.10.10.26" incoming_username="test" incoming_password="password" tid="1" \
op monitor interval="10s"
group grp_iscsivg01 res_lvm_iscsivg01 res_target_iscsivg01 res_lu_iscsivg01_lun1 res_ip_iscsivg01
ms ms_drbd_iscsivg01 res_drbd_iscsivg01 \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" is-managed="true" target-role="Started"
colocation cln_iscsivg01_on_drbd inf: grp_iscsivg01 ms_drbd_iscsivg01:Master
order odr_drbd_before_iscsivg01 inf: ms_drbd_iscsivg01:promote grp_iscsivg01:start

Do you have any idea what I did wrong?


Thank you in advance!

Best regards,
KAMI
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


phil at macprofessionals

Jun 29, 2012, 8:23 AM

Post #2 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

On 06/29/2012 11:09 AM, KAMI911 KAMI911 wrote:
> Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
> group "storage" now active

Looks like your problem is right there. The LVM RA considers starting
the VG as failed if the VG doesn't contain any LVs.
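
A quick way to check for this before handing the VG to Pacemaker is to look at the number of LVs it contains. The following is only a minimal sketch, not part of the LVM RA: vg_has_lvs is a hypothetical helper name, and it assumes the count is taken from `vgs --noheadings -o lv_count storage`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LVM resource agent): decide from a
# logical-volume count whether a VG-level start has anything to activate.
# $1: number of LVs in the VG, assumed to come from
#     vgs --noheadings -o lv_count storage
# Returns 0 if the VG has at least one LV, 1 if it has none,
# and 2 if the argument is not a plain non-negative number.
vg_has_lvs() {
    # vgs pads its output with spaces, so strip all whitespace first.
    count=$(printf '%s' "$1" | tr -d '[:space:]')
    case "$count" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$count" -gt 0 ]
}
```

On the node where DRBD is Primary this could be called as `vg_has_lvs "$(vgs --noheadings -o lv_count storage)" || echo "no LVs in VG storage"`. If the VG turns out to be empty, creating an LV in it (for example with lvcreate) and then clearing the failed start with `crm resource cleanup res_lvm_iscsivg01` should let the resource start.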




lmb at suse

Jul 3, 2012, 7:17 AM

Post #3 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

On 2012-06-29T11:23:09, Phil Frost <phil [at] macprofessionals> wrote:

> >Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
> >group "storage" now active
> Looks like your problem is right there. The LVM RA considers
> starting the VG as failed if the VG doesn't contain any LVs.

I really wish someone had a good idea how to improve this in the RA, by
the way. I still haven't found a good solution.


Regards,
Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



phil at macprofessionals

Jul 3, 2012, 9:25 AM

Post #4 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

On Jul 3, 2012, at 10:17 AM, Lars Marowsky-Bree wrote:

> On 2012-06-29T11:23:09, Phil Frost <phil [at] macprofessionals> wrote:
>
>>> Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
>>> group "storage" now active
>> Looks like your problem is right there. The LVM RA considers
>> starting the VG as failed if the VG doesn't contain any LVs.
>
> I really wish someone had a good idea how to improve this in the RA, by
> the way. I still haven't found a good solution.

After a cursory look at the LVM2 source code, I don't think it's possible with the current model. The function invoked by vgchange -a (y/n) seems to be _vgchange_available in tools/vgchange.c [1]. That function seems to be nothing but:

- error checking
- activating / deactivating each LV in the VG individually

So, there simply isn't anything in the volume group metadata that would allow the RA to query if it is active or not. Since the active bit is an attribute of the LVs, not the VG, the only reasonable solution would seem to be introducing management of the LVs into the RA. That might bring some advantages in some failure modes where VGs can be only partially activated, at the expense of a more verbose configuration in Pacemaker.

[1] I couldn't find anything that seems to be an official online version of the LVM2 source, but this works: ftp://ftp.gwdg.de/pub/linux/misc/lvm2/tools/old/LVM2.2.00.08/tools/vgchange.c
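
To illustrate the point that activity is a per-LV property: a script can only derive a VG "state" by looking at the attribute string of each LV. Below is a minimal sketch under two assumptions: the input is the output of `lvs --noheadings -o lv_attr <vg>`, and the fifth attribute character is `a` for an active LV. count_active_lvs is a hypothetical helper name, not something the RA provides.

```shell
#!/bin/sh
# Hypothetical helper: count active LVs from lv_attr strings.
# Reads one attribute string per line on stdin (leading spaces are fine),
# e.g. "  -wi-a-----", and prints how many have 'a' as the 5th character.
count_active_lvs() {
    awk '{ if (substr($1, 5, 1) == "a") n++ } END { print n + 0 }'
}
```

`lvs --noheadings -o lv_attr storage | count_active_lvs` prints 0 both for a VG with no LVs and for a VG whose LVs are all inactive, which is the ambiguity described above.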


florian at hastexo

Jul 3, 2012, 10:33 AM

Post #5 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

On Tue, Jul 3, 2012 at 4:17 PM, Lars Marowsky-Bree <lmb [at] suse> wrote:
> On 2012-06-29T11:23:09, Phil Frost <phil [at] macprofessionals> wrote:
>
>> >Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
>> >group "storage" now active
>> Looks like your problem is right there. The LVM RA considers
>> starting the VG as failed if the VG doesn't contain any LVs.
>
> I really wish someone had a good idea how to improve this in the RA, by
> the way. I still haven't found a good solution.

The very notion that we're "activating" a whole VG made Alasdair
cringe last time I talked to him, and rightly so, because it's not
meaningful as only LVs can really be "active" or "inactive". Of
course, LVM's own management tools do the same, and they of course
can't distinguish either between a VG that has been deactivated as a
whole, and a VG that has had all of its LVs deactivated one by one. So
I guess we're stuck. Unless someone wants to entertain the idea of
replacing this RA with one that manages individual LVs, which to me
sounds just as silly.

Florian


arnold at arnoldarts

Jul 3, 2012, 12:20 PM

Post #6 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

Hi,

Maybe I am misunderstanding something, but:

On 03.07.2012 19:33, Florian Haas wrote:
> The very notion that we're "activating" a whole VG made Alasdair
> cringe last time I talked to him, and rightly so, because it's not
> meaningful as only LVs can really be "active" or "inactive". Of
> course, LVM's own management tools do the same, and they of course
> can't distinguish either between a VG that has been deactivated as a
> whole, and a VG that has had all of its LVs deactivated one by one. So
> I guess we're stuck. Unless someone wants to entertain the idea of
> replacing this RA with one that manages individual LVs, which to me
> sounds just as silly.

Suppose I want to switch a VG from usage on machine A to machine B (via
underlying drbd/iscsi/whatever and without clvm/dual-primary). From my
understanding of LVM I would either have to deactivate caching of the
metadata (sounds unclean) or export/import the VG. The latter is the
same as what I do when moving a physical disk with its VG between
machines while these are online.

So shouldn't the process inside the pacemaker stack be:
[up] import vg
[up] (if needed) activate the lvs in it
[up] mount the filesystems, start the iscsi-exports, start the virtual
machines
[Use the volumes]
[down] stop usage of the volumes (unmount, unexport, vm shutdown)
[down] (if needed) deactivate the lvs
[down] export the vg

So the RA wouldn't "activate" a vg but import/export it?
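
Purely as an illustration of the ordering proposed above (this is not how the existing LVM RA behaves), the two VG-level halves could be sketched like this. vg_up, vg_down and run are hypothetical names, and DRY_RUN is an assumed variable that defaults to printing the commands instead of executing them. Mounting, iSCSI exports and VMs are left to their own resources.

```shell
#!/bin/sh
# run: print the command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# [up] import the VG, then activate the LVs in it.
vg_up() {
    run vgimport "$1" &&
    run vgchange -ay "$1"
}

# [down] deactivate the LVs, then export the VG
# (vgexport expects the VG to be inactive).
vg_down() {
    run vgchange -an "$1" &&
    run vgexport "$1"
}
```

With the default dry-run setting, `vg_up storage` only prints the two commands it would run. Setting DRY_RUN=0 would execute them, which needs root and a VG that was previously exported.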

Have fun,

Arnold
--
This email was created electronically and is valid without a handwritten
signature.


kami911 at gmail

Jul 25, 2012, 12:42 PM

Post #7 of 7
Re: Problem to use DRBD with LVM controlled by Pacemaker

That was exactly the problem. It works now.
Sorry for the late reply.

KAMI

2012/6/29 Phil Frost <phil [at] macprofessionals>:
> On 06/29/2012 11:09 AM, KAMI911 KAMI911 wrote:
>>
>> Jun 29 12:34:53 ss2 LVM[2386]: INFO: 0 logical volume(s) in volume
>> group "storage" now active
>
>
> Looks like your problem is right there. The LVM RA considers starting the VG
> as failed if the VG doesn't contain any LVs.
