Mailing List Archive: Linux-HA: Users

No nodes appear

 

 


strzol at gmail

Oct 11, 2009, 11:34 PM

Post #1 of 8
No nodes appear

Hello to the list.

Please excuse my ignorance; this is the first time I have tried to build a
cluster.

I'm trying to build a 2-node Active/Passive cluster with
DRBD+Pacemaker+OpenAIS.

I'm at the very beginning and am trying to establish initial communication
between the nodes.

I'm getting the following:

crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: No STONITH resources have been defined
crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity


============
Last updated: Mon Oct 12 09:22:36 2009
Current DC: NONE
0 Nodes configured, unknown expected votes
0 Resources configured.
============

I'm not worried about the first three messages, because I haven't configured
anything yet, but it seems that the nodes are not communicating with each
other. There is no firewall, and the communication is on a dedicated LAN.

My openais.conf (identical for the two systems) is:

alpha:/etc/ais # crm_mon --one-shot -V
# Please read the openais.conf.5 manual page

aisexec {
        # Run as root - this is necessary to be able to manage resources with Pacemaker
        user:   root
        group:  root
}

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       0
        name:      pacemaker
        use_mgmtd: yes
        use_logd:  yes
}

totem {
        version: 2

        # How long before declaring a token lost (ms)
        token:          5000

        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10

        # How long to wait for join messages in the membership protocol (ms)
        join:           1000

        # How long to wait for consensus to be achieved before starting a new round of membership conf$
        consensus:      2500

        # Turn off the virtual synchrony filter
        vsftype:        none

        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20

        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45

        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes

        # Disable encryption
        secauth:        off

        # How many threads to use for encryption/decryption
        threads:        0

        # Optionally assign a fixed node id (integer)
        # nodeid:         1234

        interface {
                ringnumber: 0

                # The following values need to be set based on your environment
                bindnetaddr: 192.168.67.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }

logging {
        debug: off
        fileline: off
        to_syslog: yes
        to_stderr: off
        syslog_facility: daemon
        timestamp: on
}

amf {
        mode: disabled
}


The first node is on 192.168.67.10 and the second on 192.168.67.11.
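
One quick sanity check on these values is whether both node addresses fall inside the network named by bindnetaddr. The sketch below is only an illustration: the /24 prefix length is an assumption (the netmask is not given in this thread), and the helper name nodes_in_bind_network is made up for this example.

```python
import ipaddress

# Values copied from the post. The /24 prefix length is an ASSUMPTION:
# the netmask of the dedicated LAN is not stated anywhere in the thread.
BINDNETADDR = "192.168.67.0"
ASSUMED_PREFIX = 24
NODES = ["192.168.67.10", "192.168.67.11"]


def nodes_in_bind_network(bindnetaddr, prefix, nodes):
    """Map each node IP to True if it lies inside bindnetaddr/prefix."""
    # strict=True raises ValueError if bindnetaddr has host bits set,
    # i.e. if it is not a proper network address for this prefix length.
    network = ipaddress.ip_network(f"{bindnetaddr}/{prefix}", strict=True)
    return {node: ipaddress.ip_address(node) in network for node in nodes}


result = nodes_in_bind_network(BINDNETADDR, ASSUMED_PREFIX, NODES)
for node, inside in result.items():
    status = "inside" if inside else "outside"
    print(f"{node}: {status} {BINDNETADDR}/{ASSUMED_PREFIX}")
```

Under that assumption both addresses are inside the network, so the addressing itself does not look like the cause of the problem.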

Am I missing something?

Thank you in advance and please forgive my lack of knowledge.

Stratos.
_______________________________________________
Linux-HA mailing list
Linux-HA [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


andrew at beekhof

Oct 12, 2009, 12:16 AM

Post #2 of 8
Re: No nodes appear [In reply to]

On Mon, Oct 12, 2009 at 8:34 AM, Stratos Zolotas <strzol [at] gmail> wrote:
> I'm getting the following:
>
> crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: No STONITH
> resources have been defined
> crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: Either
> configure some or disable STONITH with the stonith-enabled option
> crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: NOTE: Clusters
> with shared data need STONITH to ensure data integrity
>
>
> ============
> Last updated: Mon Oct 12 09:22:36 2009
> Current DC: NONE
> 0 Nodes configured, unknown expected votes
> 0 Resources configured.
> ============

You might just need to wait a bit longer.
But it's hard to say without seeing all the logs. If you attach them
(compressed) we'll be able to help further.



strzol at gmail

Oct 12, 2009, 3:25 AM

Post #3 of 8
Re: No nodes appear [In reply to]

On Mon, Oct 12, 2009 at 10:16 AM, Andrew Beekhof <andrew [at] beekhof> wrote:

> You might just need to wait a bit longer.
> But it's hard to say without seeing all the logs. If you attach them
> (compressed) we'll be able to help further.

Thank you for your immediate response.

I think something is wrong, because I have been waiting at least 2-3 hours
for the nodes to appear.

Please find the logs for the first machine (/var/log/messages) attached to
this message. If the logs from the second node are needed, please ask and I
will send them, but I think the problem is common to both nodes.

I'm sending only the logs from after the last start of openais (rcopenais
start on openSUSE 11.1).

Thank you again.


Stratos



--
Kernel IT Solutions Ltd
http://www.kernelit.gr

Cyclades Wireless Network
http://www.cywn.gr
Attachments: messages_last.zip (9.79 KB)


dejanmm at fastmail

Oct 12, 2009, 3:45 AM

Post #4 of 8
Re: No nodes appear [In reply to]

Hi,

On Mon, Oct 12, 2009 at 01:25:31PM +0300, Stratos Zolotas wrote:
> Thank you for your immediate response.
>
> I think something is wrong, because I have been waiting at least 2-3 hours
> for the nodes to appear.

That seems to be a bit excessive :)

> Please find the logs for the first machine (/var/log/messages) attached to
> this message. If the logs from the second node are needed, please ask and I
> will send them, but I think the problem is common to both nodes.
>
> I'm sending only the logs from after the last start of openais (rcopenais
> start on openSUSE 11.1).

There was a segfault in crmd/plumbing:

Oct 12 09:13:04 alpha kernel: crmd[11007]: segfault at 18 ip 00007f40ea896eee sp 00007fff0336a960 error 4 in libplumb.so.2.0.0[7f40ea87a000+30000]

You should capture the backtrace with gdb or use hb_report.
Hopefully there's a core file.
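
Even without a core file, the kernel line above gives a little to go on. The sketch below parses it and computes how far the faulting instruction pointer lies from the start of the libplumb mapping. The regular expression and the function name fault_offset are illustrative only, not part of any cluster tool.

```python
import re

# Kernel log line quoted above, joined onto one logical line.
LINE = (
    "Oct 12 09:13:04 alpha kernel: crmd[11007]: segfault at 18 "
    "ip 00007f40ea896eee sp 00007fff0336a960 error 4 in "
    "libplumb.so.2.0.0[7f40ea87a000+30000]"
)

# Layout of the kernel message; all numbers except the PID and the
# error code are hexadecimal.
PATTERN = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (process, library, offset of the faulting ip inside the mapping)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the reported mapping")
    return match["proc"], match["lib"], offset


proc, lib, offset = fault_offset(LINE)
print(f"{proc} faulted in {lib} at offset {hex(offset)}")
# prints: crmd faulted in libplumb.so.2.0.0 at offset 0x1ceee
```

If debug symbols for the library are installed, an offset like this can be handed to a tool such as addr2line to look up the function. The faulting address 0x18 is very close to zero, which is consistent with a read through a NULL pointer, though only a real backtrace would confirm that.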

There won't be much of a cluster without crmd. Otherwise, openais
seems to function fine.

Thanks,

Dejan




strzol at gmail

Oct 12, 2009, 4:51 AM

Post #5 of 8
Re: No nodes appear [In reply to]

On Mon, Oct 12, 2009 at 1:45 PM, Dejan Muhamedagic <dejanmm [at] fastmail> wrote:

> There was a segfault in crmd/plumbing:
>
> Oct 12 09:13:04 alpha kernel: crmd[11007]: segfault at 18 ip
> 00007f40ea896eee sp 00007fff0336a960 error 4 in
> libplumb.so.2.0.0[7f40ea87a000+30000]
>
> You should capture the backtrace with gdb or use hb_report.
> Hopefully there's a core file.
>
> There won't be much of a cluster without crmd. Otherwise, openais
> seems to function fine.

Thank you for your response. I have posted the problem to the Pacemaker list,
and if I discover anything I will report back.

Thanks again.




strzol at gmail

Oct 12, 2009, 9:19 AM

Post #6 of 8
Re: No nodes appear [In reply to]


I have uninstalled and reinstalled all the packages (same versions), and so
far I have seen no segfaults (at least for the last 45 minutes).

I'm still waiting to see the nodes in crm_mon.

Is there something I should check? I think enough time has passed for the
nodes to appear, so something must be wrong.

I'm attaching the log from after the last start of openais.

Thanks again.

Stratos
Attachments: messages_last.zip (4.78 KB)


andrew at beekhof

Nov 9, 2009, 5:01 AM

Post #7 of 8
Re: No nodes appear [In reply to]

On Mon, Oct 12, 2009 at 5:19 PM, Stratos Zolotas <strzol [at] gmail> wrote:
> I have uninstalled and reinstalled all the packages (same versions), and so
> far I have seen no segfaults (at least for the last 45 minutes).
>
> I'm still waiting to see the nodes in crm_mon.
>
> Is there something I should check? I think enough time has passed for the
> nodes to appear, so something must be wrong.

Is the crmd process still running?
There should be way more logging from it than there is.
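
One way to answer that question is to look for a crmd entry in the process table. The following is a minimal sketch that assumes a Linux /proc filesystem; the function name find_pids is made up for this example. On recent Pacemaker releases the daemon is called pacemaker-controld rather than crmd, so the name would need adjusting there.

```python
import os


def find_pids(name, proc_root="/proc"):
    """Return the sorted PIDs whose command name in <pid>/stat equals name."""
    if not os.path.isdir(proc_root):
        return []  # no procfs available (for example on a non-Linux system)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                stat = fh.read()
        except OSError:
            continue  # the process exited in the meantime or is unreadable
        # The command name is the text between the first "(" and the last ")".
        start, end = stat.find("("), stat.rfind(")")
        if start != -1 and end > start and stat[start + 1:end] == name:
            pids.append(int(entry))
    return sorted(pids)


pids = find_pids("crmd")
if pids:
    print("crmd is running, PID(s):", pids)
else:
    print("crmd is not running; check the logs for the reason it exited")
```

An empty result right after starting openais would point back at the crash reported earlier in the thread rather than at a communication problem between the nodes.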


strzol at gmail

Nov 9, 2009, 5:05 AM

Post #8 of 8
Re: No nodes appear [In reply to]

Hi,

Andrew, everything is OK. We continued the discussion in another thread.
After a clean install of the system and the necessary packages, everything
works nicely. The 2-node cluster has been active for about 15 days now.

Thanks.
