
strzol at gmail
Oct 12, 2009, 9:19 AM
Post #6 of 8
On Mon, Oct 12, 2009 at 2:51 PM, Stratos Zolotas <strzol [at] gmail> wrote:
>
>
> On Mon, Oct 12, 2009 at 1:45 PM, Dejan Muhamedagic <dejanmm [at] fastmail> wrote:
>
>> Hi,
>>
>> On Mon, Oct 12, 2009 at 01:25:31PM +0300, Stratos Zolotas wrote:
>> > On Mon, Oct 12, 2009 at 10:16 AM, Andrew Beekhof <andrew [at] beekhof> wrote:
>> >
>> > > On Mon, Oct 12, 2009 at 8:34 AM, Stratos Zolotas <strzol [at] gmail> wrote:
>> > > > Hello to the list.
>> > > >
>> > > > Please excuse my ignorance, because it is the first time i try to built a
>> > > > cluster.
>> > > >
>> > > > I'm trying to built a 2 node Active/Passive cluster with
>> > > > DRBD+Pacemaker+Openais.
>> > > >
>> > > > I'm on the very beginning and i try to achieve initial communication between
>> > > > the nodes.
>> > > >
>> > > > I'm getting the following:
>> > > >
>> > > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: No STONITH resources have been defined
>> > > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
>> > > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
>> > > >
>> > > >
>> > > > ============
>> > > > Last updated: Mon Oct 12 09:22:36 2009
>> > > > Current DC: NONE
>> > > > 0 Nodes configured, unknown expected votes
>> > > > 0 Resources configured.
>> > > > ============
>> > >
>> > > You might just need to wait a bit longer.
>> > > But its hard to say without seeing all the logs. If you attach them
>> > > (compressed) we'll be able to help further.
>> > >
>> > > > I don't care for the first three messages, because i haven't configure
>> > > > anything yet, but it seems that i don't have communication between the
>> > > > nodes. There is no any firewall and the communication is on a dedicated LAN.
>> > > >
>> > > > My openais.conf (identical for the two systems) is:
>> > > >
>> > > > alpha:/etc/ais # crm_mon --one-shot -V
>> > > > # Please read the openais.conf.5 manual page
>> > > >
>> > > > aisexec {
>> > > >     # Run as root - this is necessary to be able to manage resources with Pacemaker
>> > > >     user: root
>> > > >     group: root
>> > > > }
>> > > >
>> > > > service {
>> > > >     # Load the Pacemaker Cluster Resource Manager
>> > > >     ver: 0
>> > > >     name: pacemaker
>> > > >     use_mgmtd: yes
>> > > >     use_logd: yes
>> > > > }
>> > > >
>> > > > totem {
>> > > >     version: 2
>> > > >
>> > > >     # How long before declaring a token lost (ms)
>> > > >     token: 5000
>> > > >
>> > > >     # How many token retransmits before forming a new configuration
>> > > >     token_retransmits_before_loss_const: 10
>> > > >
>> > > >     # How long to wait for join messages in the membership protocol (ms)
>> > > >     join: 1000
>> > > >
>> > > >     # How long to wait for consensus to be achieved before starting a new round of membership conf$
>> > > >     consensus: 2500
>> > > >
>> > > >     # Turn off the virtual synchrony filter
>> > > >     vsftype: none
>> > > >
>> > > >     # Number of messages that may be sent by one processor on receipt of the token
>> > > >     max_messages: 20
>> > > >
>> > > >     # Stagger sending the node join messages by 1..send_join ms
>> > > >     send_join: 45
>> > > >
>> > > >     # Limit generated nodeids to 31-bits (positive signed integers)
>> > > >     clear_node_high_bit: yes
>> > > >
>> > > >     # Disable encryption
>> > > >     secauth: off
>> > > >
>> > > >     # How many threads to use for encryption/decryption
>> > > >     threads: 0
>> > > >
>> > > >     # Optionally assign a fixed node id (integer)
>> > > >     # nodeid: 1234
>> > > >
>> > > >     interface {
>> > > >         ringnumber: 0
>> > > >
>> > > >         # The following values need to be set based on your environment
>> > > >         bindnetaddr: 192.168.67.0
>> > > >         mcastaddr: 226.94.1.1
>> > > >         mcastport: 5405
>> > > >     }
>> > > >
>> > > > logging {
>> > > >     debug: off
>> > > >     fileline: off
>> > > >     to_syslog: yes
>> > > >     to_stderr: off
>> > > >     syslog_facility: daemon
>> > > >     timestamp: on
>> > > > }
>> > > >
>> > > > amf {
>> > > >     mode: disabled
>> > > > }
>> > > >
>> > > >
>> > > > The first node is on 192.168.67.10 and the second on 192.168.67.11.
>> > > >
>> > > > Am i missing something?
>> > > >
>> > > > Thank you in advance and please forgive my lack of knowledge.
>> > > >
>> > > > Stratos.
>> >
>> > Thank you for your immediate response.
>> >
>> > I think that something is wrong because i'm waiting for at least 2-3 hours
>> > for the nodes to appear.
>>
>> That seems to be a bit excessive :)
>>
>> > Please find the logs for the first machine (/var/log/messages) attached to
>> > the message. If the logs from the second node are needed please ask me to
>> > send them, but i think that the problem is common for both nodes.
>> >
>> > I'm sending only the logs after the last run of openais (rcopenais start on
>> > Opensuse 11.1)
>>
>> There was a segfault in crmd/plumbing:
>>
>> Oct 12 09:13:04 alpha kernel: crmd[11007]: segfault at 18 ip 00007f40ea896eee sp 00007fff0336a960 error 4 in libplumb.so.2.0.0[7f40ea87a000+30000]
>>
>> You should capture the backtrace with gdb or use hb_report.
>> Hopefully there's a core file.
>>
>> There won't be much of a cluster without crmd. Otherwise, openais
>> seems to function fine.
>>
>> Thanks,
>>
>> Dejan
>>
>>
>> > Thank you again.
>> >
>> >
>> > Stratos
>> >
>> >
>> >
>> > --
>> > Kernel IT Solutions Ltd
>> > http://www.kernelit.gr
>> >
>> > Cyclades Wireless Network
>> > http://www.cywn.gr
>>
>>
>> > _______________________________________________
>> > Linux-HA mailing list
>> > Linux-HA [at] lists
>> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> > See also: http://linux-ha.org/ReportingProblems
>>
>
>
> Thank you for your response. I have post the problem to the pacemaker's
> list and if i discovered something i will report back.
>
> Thanks again.
>
>
>
> --
> Kernel IT Solutions Ltd
> http://www.kernelit.gr
>
> Cyclades Wireless Network
> http://www.cywn.gr
>

I have uninstalled and reinstalled all the packages (same versions), and so far I have had no segfaults (at least for the last 45 minutes).

I'm still waiting to see the nodes in crm_mon. Should I check something? I think that is enough time for the nodes to appear, so something must be wrong.

I'm attaching the log from the last run of openais.

Thanks again.

Stratos
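
One thing that can be checked offline is whether the bindnetaddr value in the totem interface block is really the network address of the subnet both nodes sit on. The following is only a minimal sketch in Python, not a diagnosis: the /24 prefix length is an assumption (the netmask is not stated anywhere in this thread), and the function name nodes_in_bind_network is made up for illustration. It checks the address arithmetic only and says nothing about whether multicast traffic actually reaches the other node.

```python
import ipaddress


def nodes_in_bind_network(bindnetaddr, prefix_len, node_ips):
    """Map each node IP to True/False depending on whether it lies inside
    the network described by bindnetaddr/prefix_len.

    strict=True makes ip_network() raise ValueError when bindnetaddr has
    host bits set, i.e. when it is not a proper network address.
    """
    network = ipaddress.ip_network(f"{bindnetaddr}/{prefix_len}", strict=True)
    return {ip: ipaddress.ip_address(ip) in network for ip in node_ips}


# Values taken from the quoted openais.conf and node addresses.
# ASSUMPTION: a /24 netmask; adjust to the real netmask of the dedicated LAN.
result = nodes_in_bind_network(
    "192.168.67.0", 24, ["192.168.67.10", "192.168.67.11"]
)

for ip, inside in result.items():
    print(ip, "is inside the bind network" if inside else "is OUTSIDE the bind network")
```

If either address were reported as outside the network, or if ip_network() raised ValueError, bindnetaddr would be the first thing to correct before looking further at the logs.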