
Mailing List Archive: Linux-HA: Pacemaker

Error starting Apache on 2 nodes cluster

 

 



angie.tawfik at gmail

Nov 18, 2009, 3:56 PM

Post #1 of 10

Hello
I'm a pacemaker and openais beginner.
I followed the document 'cluster from scratch' and I successfully managed to
create and monitor a 'ClusterIP' and 'LoadBalancer' resources.

But whenever I try to start Apache:
# crm configure primitive WebSite ocf:heartbeat:apache params configfile=/etc/httpd/conf/httpd.conf op monitor interval=1min

whether using (ocf:heartbeat:apache) or (lsb:httpd), I get the following
errors when watching crm_mon:

============
Last updated: Thu Nov 19 01:38:33 2009
Stack: openais
Current DC: test1.localdomain - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ test1.localdomain test2.localdomain ]

ClusterIP (ocf::heartbeat:IPaddr2): Started test1.localdomain
LoadBalancer (lsb:haproxy): Started test1.localdomain

Failed actions:
WebSite_start_0 (node=test1.localdomain, call=9, rc=1, status=complete): unknown error
WebSite_start_0 (node=test2.localdomain, call=5, rc=1, status=complete): unknown error
/************************************************************************************************************/

Knowing that I am using:
CentOS 5.4..
openais-0.80.5-15.1
pacemaker-1.0.5-4.1
# chkconfig httpd off
server-status is not enabled in my httpd.conf ...

I always check apache processes before configuring my crm using:

# ps aux | grep httpd
/* to make sure there are no zombie processes */

# /etc/init.d/httpd status
/* to guarantee it's stopped and nothing is locked */

Last but not least, I am attaching the last 100 lines of /var/log/messages
from the 2nd node to help you help me.
I have been stuck in this loop for four days now and I have no idea why the crm
can't start Apache, even though everything runs smoothly when I start it
manually!

Thank you in advance
--
All the best,
Angie
Attachments: errors_starting_apache_on_2nd_node (10.9 KB)


lbigum at iseek

Nov 18, 2009, 4:39 PM

Post #2 of 10
Re: Error starting Apache on 2 nodes cluster

Angie,

I can't tell exactly what you've provided. Can you post your CRM configuration (the output of 'crm configure show')? While you're at it, also provide 'crm_verify -LV' and 'crm_mon -fo1'.

This looks suspicious though:

Nov 19 01:25:08 test2 crmd: [24251]: info: process_lrm_event: LRM operation WebServer_monitor_60000 (call=483, rc=-2, cib-update=0, confirmed=true) Cancelled unknown exec error

Personally I'd start with the OCF RA and leave lsb:httpd alone. From the above error message, something inside lsb:httpd is returning -2, which is not a supported return code.

Depending on how confident you are with shell scripts, you might find it helpful to eliminate Pacemaker from the equation and call the Resource Agent script yourself to debug problems manually, like so...

Disable your resource so Pacemaker doesn't interfere:

crm_resource -r WebSite -m -p target-role -v stopped

Then move into the RA directory and set a necessary environment variable:

cd /usr/lib/ocf/resource.d/heartbeat
export OCF_ROOT=/usr/lib/ocf

Start testing the apache RA, setting the only mandatory environment variable for ocf:heartbeat:apache :

export OCF_RESKEY_configfile=/path/to/your/main/apache/config
./apache start
echo $?

That should echo "0" for success. Judging by your logs, you can start Apache but the monitor is failing:

./apache monitor
echo $?

If that doesn't echo "0", you might get a helpful error message explaining what's wrong. You might have to read through the apache script itself to figure out why it's failing. Finally test the 'stop' operation:

./apache stop
echo $?

Should echo "0" as well. If this all works for you, but the resource in Pacemaker is still not working, then it's probably something in your CIB (like a bad attribute), as you've just done pretty much exactly what Pacemaker will do.
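
The manual start/monitor/stop sequence above could be wrapped in a small helper. This is only a sketch: run_ra_action is a hypothetical name, not something shipped with Pacemaker, and it assumes OCF_ROOT and OCF_RESKEY_configfile have already been exported as shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Pacemaker): run one action of a
# resource agent script by hand and print the exit code it returned.
# Assumes OCF_ROOT and any OCF_RESKEY_* variables are already exported.
run_ra_action() {
    agent=$1     # path to the agent script, e.g. ./apache
    action=$2    # start, monitor or stop
    "$agent" "$action"
    rc=$?
    echo "$action: rc=$rc"
    return "$rc"
}
```

From the RA directory, something like `for a in start monitor stop; do run_ra_action ./apache "$a"; done` would print one rc line per action, which makes it easy to see which step is the first to return non-zero.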

Let us know how you go.

Luke Bigum
Systems Administrator
(p) 1300 661 668
(f) 1300 661 540
(e) lbigum [at] iseek
http://www.iseek.com.au
Level 1, 100 Ipswich Road Woolloongabba QLD 4102





angie.tawfik at gmail

Nov 19, 2009, 2:35 AM

Post #3 of 10
Re: Error starting Apache on 2 nodes cluster

On Thu, Nov 19, 2009 at 2:39 AM, Luke Bigum <lbigum [at] iseek> wrote:

> Angie,
>
>
>
> I can't tell exactly what's you've provided, can you post your CRM
> configuration (the output of 'crm configure show')? While you're at it, also
> provide ' crm_verify -LV' and 'crm_mon -fo1'.
>
Here are the outputs:

# crm configure show
node test1.localdomain
node test2.localdomain
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="10.0.0.102" cidr_netmask="255.255.255.0" \
        op monitor interval="10s"
primitive LoadBalancer lsb:haproxy \
        op monitor interval="10s"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
colocation LoadBalancer-with-ClusterIP inf: LoadBalancer ClusterIP
order LoadBalancer-after-ClusterIP inf: ClusterIP LoadBalancer
property $id="cib-bootstrap-options" \
        stonith-enabled="false" \
        expected-quorum-votes="2" \
        dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
        cluster-infrastructure="openais" \
        no-quorum-policy="ignore"

# crm_verify -VL
crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing failed op WebSite_start_0 on test1.localdomain: unknown error
crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing failed op WebSite_start_0 on test2.localdomain: unknown error
crm_verify[14263]: 2009/11/19_12:22:57 WARN: common_apply_stickiness: Forcing WebSite away from test1.localdomain after 1000000 failures (max=1000000)
crm_verify[14263]: 2009/11/19_12:22:57 WARN: common_apply_stickiness: Forcing WebSite away from test2.localdomain after 1000000 failures (max=1000000)
crm_verify[14263]: 2009/11/19_12:22:57 WARN: native_color: Resource WebSite cannot run anywhere
Warnings found during check: config may not be valid

# crm_mon -fo1
============
Last updated: Thu Nov 19 12:29:41 2009
Stack: openais
Current DC: test1.localdomain - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ test1.localdomain test2.localdomain ]

ClusterIP (ocf::heartbeat:IPaddr2): Started test1.localdomain
LoadBalancer (lsb:haproxy): Started test1.localdomain

Operations:
* Node test1.localdomain:
   ClusterIP: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (5) monitor: interval=10000ms rc=0 (ok)
   LoadBalancer: migration-threshold=1000000
    + (6) start: rc=0 (ok)
    + (7) monitor: interval=10000ms rc=0 (ok)
   WebSite: migration-threshold=1000000 fail-count=1000000
    + (9) start: rc=1 (unknown error)
    + (10) stop: rc=0 (ok)
* Node test2.localdomain:
   WebSite: migration-threshold=1000000 fail-count=1000000
    + (5) start: rc=1 (unknown error)
    + (6) stop: rc=0 (ok)

Failed actions:
WebSite_start_0 (node=test1.localdomain, call=9, rc=1, status=complete): unknown error
WebSite_start_0 (node=test2.localdomain, call=5, rc=1, status=complete): unknown error

> This looks suspicious though:
>
> Nov 19 01:25:08 test2 crmd: [24251]: info: process_lrm_event: LRM operation
> WebServer_monitor_60000 (call=483, rc=-2, cib-update=0, confirmed=true)
> Cancelled unknown exec error
>
> Personally I'd start with the OCF RA and leave lsb:httpd alone. From the
> above error message, something inside lsb:httpd is returning -2, which is
> not a supported return code.
>
> Depending on how confident you are with shell scripts, you might find it
> helpful to eliminate Pacemaker from the equation and call the Resource Agent
> script yourself to debug problems manually, like so...

I'll do this and report back to you.
>
> Let us know how you go.
>
Sure, I will. Thank you so much.



--
All the best,
Angie


dejanmm at fastmail

Nov 19, 2009, 3:41 AM

Post #4 of 10
Re: Error starting Apache on 2 nodes cluster

Hi,

On Thu, Nov 19, 2009 at 12:35:36PM +0200, Angie T. Muhammad wrote:
> # crm_verify -VL
> crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing
> failed op WebSite_start_0 on test1.localdomain: unknown error
> crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing
> failed op WebSite_start_0 on test2.localdomain: unknown error

To find out why the resource failed, grep your logs for
lrmd.*WebSite. It should show you errors which were logged by the
apache resource agent.
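
As a rough sketch of that search, the grep could be wrapped like this. The function name ra_log_lines and its default log path are assumptions made for illustration; the log location depends on how syslog is set up on the node.

```shell
#!/bin/sh
# Hypothetical helper: print the lrmd log lines that mention a resource.
# The default log file is an assumption; pass another path if your
# syslog writes cluster messages elsewhere.
ra_log_lines() {
    rsc=$1                           # resource name, e.g. WebSite
    logfile=${2:-/var/log/messages}  # log file to search
    grep -E "lrmd.*${rsc}" "$logfile"
}
```

Running `ra_log_lines WebSite` on each node should then show what the agent printed around every start, monitor and stop call.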

Thanks,

Dejan

_______________________________________________
Pacemaker mailing list
Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


angie.tawfik at gmail

Nov 19, 2009, 5:16 AM

Post #5 of 10
Re: Error starting Apache on 2 nodes cluster

On Thu, Nov 19, 2009 at 1:41 PM, Dejan Muhamedagic <dejanmm [at] fastmail> wrote:

> To find out why the resource failed, grep your logs for
> lrmd.*WebSite. It should show you errors which were logged by the
> apache resource agent.
>

On Node 1:
# cat /var/log/messages | grep -i lrmd.*WebSite
Nov 19 14:55:49 test1 lrmd: [19644]: info: rsc:WebSite:4: monitor
Nov 19 14:55:49 test1 lrmd: [19644]: info: RA output: (WebSite:monitor:stderr) logd is not running
Nov 19 14:55:49 test1 lrmd: [19644]: info: RA output: (WebSite:monitor:stderr) 2009/11/19_14:55:49 INFO: apache not running
Nov 19 14:55:53 test1 lrmd: [19644]: info: rsc:WebSite:9: start
Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:54 INFO: apache not running
Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:54 INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
Nov 19 14:55:55 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:55 test1 lrmd: [19644]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:55 ERROR: command failed: sh -c wget -O- -q -L --no-proxy --bind-address=127.0.0.1 http://10.0.0.100:80 | tr '\012' ' ' | grep -Ei "</ *body *>[[:space:]]*</ *html *>" >/dev/null
Nov 19 14:55:56 test1 lrmd: [19644]: info: rsc:WebSite:10: stop
Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output: (WebSite:stop:stderr) logd is not running
Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output: (WebSite:stop:stderr) 2009/11/19_14:55:57 INFO: Killing apache PID 19880
Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output: (WebSite:stop:stderr) logd is not running
Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output: (WebSite:stop:stderr) 2009/11/19_14:55:57 INFO: apache stopped.

On Node 2:
# cat /var/log/messages | grep -i lrmd.*WebSite
Nov 19 14:55:41 test2 lrmd: [31978]: info: rsc:WebSite:4: monitor
Nov 19 14:55:41 test2 lrmd: [31978]: info: RA output: (WebSite:monitor:stderr) logd is not running
Nov 19 14:55:41 test2 lrmd: [31978]: info: RA output: (WebSite:monitor:stderr) 2009/11/19_14:55:41 INFO: apache not running
Nov 19 14:55:43 test2 lrmd: [31978]: info: rsc:WebSite:5: start
Nov 19 14:55:43 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:43 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:43 INFO: apache not running
Nov 19 14:55:43 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:43 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:43 INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
Nov 19 14:55:44 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) logd is not running
Nov 19 14:55:44 test2 lrmd: [31978]: info: RA output: (WebSite:start:stderr) 2009/11/19_14:55:44 ERROR: command failed: sh -c wget -O- -q -L --no-proxy --bind-address=127.0.0.1 http://10.0.0.101:80 | tr '\012' ' ' | grep -Ei "</ *body *>[[:space:]]*</ *html *>" >/dev/null
Nov 19 14:55:45 test2 lrmd: [31978]: info: rsc:WebSite:6: stop
Nov 19 14:55:46 test2 lrmd: [31978]: info: RA output: (WebSite:stop:stderr) logd is not running
Nov 19 14:55:46 test2 lrmd: [31978]: info: RA output: (WebSite:stop:stderr) 2009/11/19_14:55:46 INFO: Killing apache PID 32115
Nov 19 14:55:46 test2 lrmd: [31978]: info: RA output: (WebSite:stop:stderr) logd is not running
Nov 19 14:55:46 test2 lrmd: [31978]: info: RA output: (WebSite:stop:stderr) 2009/11/19_14:55:46 INFO: apache stopped.

>
> Thanks,
>

Thank you Dejan




--
All the best,
Angie


r.bhatia at ipax

Nov 19, 2009, 6:19 AM

Post #6 of 10
Re: Error starting Apache on 2 nodes cluster

On 11/19/2009 02:16 PM, Angie T. Muhammad wrote:
> On Node 1:
> # cat /var/log/messages | grep -i lrmd.*WebSite
> Nov 19 14:55:49 test1 lrmd: [19644]: info: rsc:WebSite:4: monitor
> Nov 19 14:55:49 test1 lrmd: [19644]: info: RA output:
> (WebSite:monitor:stderr) logd is not running
> Nov 19 14:55:49 test1 lrmd: [19644]: info: RA output:
> (WebSite:monitor:stderr) 2009/11/19_14:55:49 INFO: apache not running
> Nov 19 14:55:53 test1 lrmd: [19644]: info: rsc:WebSite:9: start
> Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) logd is not running
> Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) 2009/11/19_14:55:54 INFO: apache not running
> Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) logd is not running
> Nov 19 14:55:54 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) 2009/11/19_14:55:54 INFO: waiting for apache
> /etc/httpd/conf/httpd.conf to come up
> Nov 19 14:55:55 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) logd is not running
> Nov 19 14:55:55 test1 lrmd: [19644]: info: RA output:
> (WebSite:start:stderr) 2009/11/19_14:55:55 ERROR: command failed: sh -c
> wget -O- -q -L --no-proxy --bind-address=127.0.0.1 http://10.0.0.100:80
> | tr '\012' ' ' | grep -Ei "</ *body *>[[:space:]]*</ *html *>" >/dev/null

what happens if you manually issue the wget command?
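
One way to repeat that check by hand is to split it in two: fetch the page with the wget command from the log, then pipe the result into the same tr/grep test. The sketch below covers only the second half; html_looks_complete is a hypothetical name, and the pattern is copied from the failing command in the log rather than from the agent's source.

```shell
#!/bin/sh
# Hypothetical helper: apply the same test the failing command in the log
# applies to the page it fetched. Reads HTML on stdin and exits 0 only if
# the text contains a closing body tag followed by a closing html tag.
html_looks_complete() {
    tr '\012' ' ' | grep -Eiq "</ *body *>[[:space:]]*</ *html *>"
}
```

For node 1 that would be `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 http://10.0.0.100:80 | html_looks_complete; echo $?`. Dropping the -q from wget should also show whether the request itself fails before the pattern is ever tested.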

> Nov 19 14:55:56 test1 lrmd: [19644]: info: rsc:WebSite:10: stop
> Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output:
> (WebSite:stop:stderr) logd is not running
> Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output:
> (WebSite:stop:stderr) 2009/11/19_14:55:57 INFO: Killing apache PID 19880
> Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output:
> (WebSite:stop:stderr) logd is not running
> Nov 19 14:55:57 test1 lrmd: [19644]: info: RA output:
> (WebSite:stop:stderr) 2009/11/19_14:55:57 INFO: apache stopped.

cheers,
raoul
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia [at] ipax
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office [at] ipax
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________



infos at e-blokos

Nov 19, 2009, 7:27 AM

Post #7 of 10
Re: Error starting Apache on 2 nodes cluster


You need to have an Apache vhost on the local IP, because Pacemaker checks
Apache by requesting a page from it...




gdimilia at cfa

Nov 19, 2009, 12:59 PM

Post #8 of 10
Re: Error starting Apache on 2 nodes cluster

Hi,
I had the same problem with the same configuration.

I solved it with the following settings in httpd.conf:

1- Set
ExtendedStatus On

2- Enable server-status
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

If you need it, I can send you a complete working httpd.conf.
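
Before handing the resource back to Pacemaker, it may help to confirm that the handler really is enabled in the file the resource points at. The following is a minimal sketch under stated assumptions: has_server_status is a hypothetical name, and the check only looks for an uncommented SetHandler line in the one file it is given.

```shell
#!/bin/sh
# Hypothetical helper: report whether an Apache config file contains an
# uncommented "SetHandler server-status" line. It only greps the one file
# given; it does not follow Include directives or validate access rules.
has_server_status() {
    conf=$1   # e.g. /etc/httpd/conf/httpd.conf
    grep -Eiq '^[[:space:]]*SetHandler[[:space:]]+server-status' "$conf"
}
```

For example, `has_server_status /etc/httpd/conf/httpd.conf && echo enabled` prints "enabled" only when such a line is present and not commented out with a leading #.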

Giovanni






angie.tawfik at gmail

Nov 19, 2009, 1:47 PM

Post #9 of 10
Re: Error starting Apache on 2 nodes cluster

I had the same thought when I saw the wget error, but I am not entitled to
add/remove virtual hosts to/from httpd.conf.
A colleague is supposed to handle this, but I think this should fix the
problem.
Thank you :)

On Thu, Nov 19, 2009 at 5:27 PM, E-Blokos <infos [at] e-blokos> wrote:

>
> ----- Original Message ----- From: "Dejan Muhamedagic" <
> dejanmm [at] fastmail>
> To: <pacemaker [at] clusterlabs>
> Sent: Thursday, November 19, 2009 6:41 AM
> Subject: Re: [Pacemaker] Error starting Apache on 2 nodes cluster
>
>
>
> Hi,
>>
>> On Thu, Nov 19, 2009 at 12:35:36PM +0200, Angie T. Muhammad wrote:
>>
>>> On Thu, Nov 19, 2009 at 2:39 AM, Luke Bigum <lbigum [at] iseek> wrote:
>>>
>>> > Angie,
>>> >
>>> >
>>> >
>>> > I can't tell exactly what's you've provided, can you post your CRM
>>> > configuration (the output of 'crm configure show')? While you're at it,
>>> > also
>>> > provide ' crm_verify -LV' and 'crm_mon -fo1'.
>>> >
>>> > Here are the outputs:
>>> >
>>> # crm configure show
>>> node test1.localdomain
>>> node test2.localdomain
>>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>>     params ip="10.0.0.102" cidr_netmask="255.255.255.0" \
>>>     op monitor interval="10s"
>>> primitive LoadBalancer lsb:haproxy \
>>>     op monitor interval="10s"
>>> primitive WebSite ocf:heartbeat:apache \
>>>     params configfile="/etc/httpd/conf/httpd.conf" \
>>>     op monitor interval="1min"
>>> colocation LoadBalancer-with-ClusterIP inf: LoadBalancer ClusterIP
>>> order LoadBalancer-after-ClusterIP inf: ClusterIP LoadBalancer
>>> property $id="cib-bootstrap-options" \
>>>     stonith-enabled="false" \
>>>     expected-quorum-votes="2" \
>>>     dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
>>>     cluster-infrastructure="openais" \
>>>     no-quorum-policy="ignore"
>>>
>>> # crm_verify -VL
>>> crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing
>>> failed op WebSite_start_0 on test1.localdomain: unknown error
>>> crm_verify[14263]: 2009/11/19_12:22:57 WARN: unpack_rsc_op: Processing
>>> failed op WebSite_start_0 on test2.localdomain: unknown error
>>>
>>
>> To find out why the resource failed, grep your logs for
>> lrmd.*WebSite. It should show you errors which were logged by the
>> apache resource agent.
>>
>> Thanks,
>>
>> Dejan
>>
>
> You need to have an Apache vhost on the local IP, because Pacemaker
> checks it...
>



--
All the best,
Angie
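
A minimal sketch of Dejan's suggestion quoted above (grep the logs for lrmd.*WebSite), assuming syslog goes to /var/log/messages as on the CentOS 5.4 nodes in this thread. The function name ra_errors is hypothetical and not part of Pacemaker:

```shell
# Hypothetical helper: print the lrmd log lines that mention one resource.
# $1 = resource name (e.g. WebSite)
# $2 = log file; defaults to /var/log/messages, the file named in this thread
ra_errors() {
    rsc="$1"
    log="${2:-/var/log/messages}"
    # Same pattern Dejan suggested: an lrmd entry followed by the resource name.
    grep -E "lrmd.*${rsc}" "$log"
}
```

Run it as root on the node where the start failed, for example `ra_errors WebSite`. No output means lrmd logged nothing for that resource in that file.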


angie.tawfik at gmail

Nov 19, 2009, 1:50 PM

Post #10 of 10 (1824 views)
Permalink
Re: Error starting Apache on 2 nodes cluster [In reply to]

Hey Giovanni!
Thank you so much!
This eliminated the need to configure vhosts or change anything else in
httpd.conf. You saved me lots of time and effort.
It is now solved :D

Thanks again :)

On Thu, Nov 19, 2009 at 10:59 PM, Giovanni Di Milia <
gdimilia [at] cfa> wrote:

>
> Hi,
> I had the same problem with the same configuration.
>
> I solved it with the following settings in httpd.conf:
>
> 1- Set
> ExtendedStatus On
>
> 2- Enable server-status
> <Location /server-status>
>     SetHandler server-status
>     Order deny,allow
>     Deny from all
>     Allow from 127.0.0.1
> </Location>
>
> If you need it, I can send you a complete working httpd.conf.
>
> Giovanni
>
>
>
>
> On Nov 18, 2009, at 6:56 PM, Angie T. Muhammad wrote:
>
> Hello
> I'm a pacemaker and openais beginner.
> I followed the document 'cluster from scratch' and I successfully managed
> to create and monitor the 'ClusterIP' and 'LoadBalancer' resources.
>
> But whenever I try to start Apache:
> # crm configure primitive WebSite ocf:heartbeat:apache params
> configfile=/etc/httpd/conf/httpd.conf op monitor interval=1min
>
> whether using (ocf:heartbeat:apache) or (lsb::httpd) I get the following
> errors when watching crm_mon:
>
> ============
> Last updated: Thu Nov 19 01:38:33 2009
> Stack: openais
> Current DC: test1.localdomain - partition with quorum
> Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ test1.localdomain test2.localdomain ]
>
> ClusterIP (ocf::heartbeat:IPaddr2): Started test1.localdomain
> LoadBalancer (lsb:haproxy): Started test1.localdomain
>
> Failed actions:
> WebSite_start_0 (node=test1.localdomain, call=9, rc=1,
> status=complete): unknown error
> WebSite_start_0 (node=test2.localdomain, call=5, rc=1,
> status=complete): unknown error
>
> /************************************************************************************************************/
>
> Knowing that I am using:
> CentOS 5.4..
> openais-0.80.5-15.1
> pacemaker-1.0.5-4.1
> # chkconfig httpd off
> server-status is not enabled in my httpd.conf ...
>
> I always check apache processes before configuring my crm using:
>
> # ps aux | grep httpd
> /* to make sure there are no zombie processes */
>
> # /etc/init.d/httpd status
> /* to guarantee it's stopped and nothing is locked */
>
> Last but not least, I am attaching the *last 100 lines of my
> /var/log/messages* from the 2nd node to help you help me.
> I have been stuck in this loop for four days now and I have no idea why the
> crm can't start Apache, even though everything runs smoothly when I start it
> manually!
>
> Thank you in advance
> --
> All the best,
> Angie
> <errors_starting_apache_on_2nd_node>


--
All the best,
Angie
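
For anyone landing on this thread later, a rough pre-flight check for Giovanni's fix, assuming the config path used in this thread. has_server_status is a made-up helper name; it only checks that an uncommented SetHandler server-status line exists, not that the access rules around it are right:

```shell
# Hypothetical helper: succeed if the Apache config enables the status handler.
# $1 = config file; defaults to the path used in this thread
has_server_status() {
    conf="${1:-/etc/httpd/conf/httpd.conf}"
    # Drop commented-out lines first, then look for the handler directive.
    grep -v '^[[:space:]]*#' "$conf" |
        grep -Eqi '^[[:space:]]*SetHandler[[:space:]]+server-status'
}

# Example: report the result for the default config path.
# has_server_status && echo "server-status enabled" || echo "server-status missing"
```

Once the handler is in place, `crm resource cleanup WebSite` should clear the failed start so the cluster retries it. With Apache running, `wget -q -O - http://127.0.0.1/server-status` on the node shows roughly what the resource agent's monitor would see, assuming it uses the default status URL.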
