Mailing List Archive: Linux-HA: Users
resource unmanaged/failed
 



aleksey.kashin at gmail

Dec 7, 2011, 2:56 AM



Hello.

I have two servers (radius1, radius2). I've set up a cluster resource,
IPaddr2, using the following commands:

# crm configure property stonith-enabled="false"
# crm configure property no-quorum-policy="ignore"
# crm configure primitive raddb_ip ocf:heartbeat:IPaddr2 params \
    ip="10.99.2.57" cidr_netmask="32" op monitor interval="15s"
# crm configure group raddb raddb_ip
# crm configure location raddb-prefers-radius1 raddb inf: radius1
# crm configure rsc_defaults resource-stickiness=1000001

This works fine.
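
The primitive above sets only a monitor interval, so each operation runs with the default timeout. Below is a minimal sketch of the same primitive with explicit, longer timeouts, written as it might appear in `crm configure edit`. The 60s and 120s values are illustrative assumptions, not tested recommendations, and longer timeouts only give a slow node more time; they do not address the load itself.

```
# Assumption: timeout values are illustrative, not tuned for this cluster.
primitive raddb_ip ocf:heartbeat:IPaddr2 \
    params ip="10.99.2.57" cidr_netmask="32" \
    op monitor interval="15s" timeout="60s" \
    op stop interval="0" timeout="120s"
```

The stop timeout matters here because, with stonith disabled, a failed stop leaves the cluster unable to recover the resource safely, so it is marked unmanaged.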

But sometimes the load on radius1 increases and the server starts
swapping; at that moment the resource becomes "(unmanaged) FAILED". Below
is an example of the resource in the "unmanaged" state:

# crm_mon
============
Last updated: Wed Dec 7 14:56:20 2011
Stack: openais
Current DC: radius1 - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ radius2 radius1 ]

Resource Group: raddb
    raddb_ip (ocf::heartbeat:IPaddr2): Started radius1 (unmanaged) FAILED

Failed actions:
    raddb_ip_monitor_15000 (node=radius1, call=4, rc=-2, status=Timed Out): unknown exec error
    raddb_ip_stop_0 (node=radius1, call=5, rc=-2, status=Timed Out): unknown exec error
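
The failed actions reported by crm_mon can be reduced to one short record per action, which makes it easier to see that both the monitor and the stop timed out. This is a rough sketch, assuming each failed action is printed on a single line in the format shown in this post; the function name `parse_failed_actions` is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract "operation node rc status" from crm_mon "Failed actions" lines.
# Assumption: each failed action sits on one line, formatted as in this post.
parse_failed_actions() {
  sed -nE 's/^[[:space:]]*([^[:space:]]+) \(node=([^,]+), call=[0-9]+, rc=(-?[0-9]+), status=([^)]+)\): .*$/\1 \2 \3 \4/p'
}

# Usage on a live node (crm_mon -1 prints the status once and exits):
#   crm_mon -1 | parse_failed_actions
```

Lines that do not describe a failed action, such as the "Failed actions:" header, are skipped. Once the underlying cause has been dealt with, records like these are normally cleared with `crm resource cleanup raddb_ip`.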


I've posted part of /var/log/syslog from radius1 here:
http://paste.org/41963


At that moment the IP address 10.99.2.57 is still up and the server
responds to requests sent to it. However, sometimes the resource becomes
completely unavailable and I have to restart corosync, which is very bad.

I think the resource becomes unmanaged because the server is swapping and
some of the corosync processes are swapped out. I tested this hypothesis:
when the server uses a lot of swap, the resource becomes "unmanaged".
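
One way to check the swap hypothesis is to measure how much of a given daemon's memory is actually swapped out. This is a small sketch, assuming the running kernel exposes per-mapping "Swap:" lines in /proc/<pid>/smaps (not every kernel does); the function name `swap_kb_from_smaps` is made up for illustration.

```shell
#!/bin/sh
# Sketch: sum the "Swap:" fields (in kB) of smaps-formatted text read from stdin.
# Assumption: the kernel exposes per-mapping "Swap:" lines in /proc/<pid>/smaps.
swap_kb_from_smaps() {
  awk '/^Swap:/ { total += $2 } END { print total + 0 }'
}

# Usage (as root) for every running corosync process:
#   for pid in $(pidof corosync); do
#     printf '%s: %s kB swapped\n' "$pid" "$(swap_kb_from_smaps < "/proc/$pid/smaps")"
#   done
```

The same check can be pointed at the other cluster daemons, for example lrmd, which is the process that runs the resource agent operations.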

I use Debian GNU/Linux 5.x with the packages from
http://people.debian.org/~madkiss/ha/:

# dpkg -l | grep cluster
ii  cluster-glue     1.0.7+hg2618-2~bpo50+1  The reusable cluster components for Linux HA
ii  corosync         1.4.2-1~bpo50+1         Standards-based cluster framework (daemon an
ii  libcluster-glue  1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries (transitional pac
ii  libcorosync4     1.4.2-1~bpo50+1         Standards-based cluster framework (libraries
ii  libcrmcluster1   1.1.5-3~bpo50+1         Pacemaker libraries - CRM
ii  liblrm2          1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- liblrm2
ii  libpils2         1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libpils2
ii  libplumb2        1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libplumb2
ii  libplumbgpl2     1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libplumbgpl2
ii  libstonith1      1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libstonith1
ii  pacemaker        1.1.5-3~bpo50+1         HA cluster resource manager



I can't add RAM to these servers. How can I prevent the resource from
becoming "unmanaged/failed"?


With Best Regards.
Aleksey V. Kashin

Subject User Time
resource unmanaged/failed aleksey.kashin at gmail Dec 7, 2011, 2:56 AM
    Re: resource unmanaged/failed dejanmm at fastmail Dec 8, 2011, 8:32 AM
    Re: resource unmanaged/failed andrew at beekhof Dec 8, 2011, 3:25 PM
    resource unmanaged/failed aleksey.kashin at gmail Dec 9, 2011, 12:46 AM
    Re: resource unmanaged/failed andrew at beekhof Dec 11, 2011, 3:34 PM
    Re: resource unmanaged/failed aleksey.kashin at gmail Dec 12, 2011, 2:42 AM
