
renayama19661014 at ybb
Jul 27, 2011, 6:04 PM
Post #9 of 38
Re: The active trap of the SNMP is delayed.
Hi Yan,

> However, what hbagent queues is an F_STATUS message from a very early stage.
> I will pinpoint which hb_api call of hbagent it is.

I confirmed it.
The F_STATUS message appears to be queued during the get_uuid processing.

--- For the next log, I added the FUNCTION macro to the call of read_api_msg. ---
--- get_uuid appears in the first log line. ---

Jul 28 18:51:03 srv01 lha-snmpagent: [6538]: info: ##### yamuchi enqure_msg (): get_uuid #####
Jul 28 18:51:03 srv01 lha-snmpagent: [6538]: info: MSG: Dumping message with 12 fields
Jul 28 18:51:03 srv01 lha-snmpagent: [6538]: info: MSG[0] : [t=status]
Jul 28 18:51:03 srv01 lha-snmpagent: [6538]: info: MSG[1] : [st=active]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[2] : [dt=6590]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[3] : [protocol=1]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[4] : [src=srv02]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[5] : [(1)srcuuid=0x889db30(36 27)]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[6] : [seq=6]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[7] : [hg=4ddb3649]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[8] : [ts=4e313107]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[9] : [ld=0.16 0.04 0.01 2/89 6264]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[10] : [ttl=3]
Jul 28 18:51:04 srv01 lha-snmpagent: [6538]: info: MSG[11] : [auth=1 60410427f13e2377858cc0e403a8014c4704ab36]

In hbagent, I think the queueing happens at one of the following two calls.

(snip)
int
init_heartbeat(void)
{
(snip)
    /*
     * get uuid for trap message.
     * see: hbagentv2_update_diff() in hbagentv2.c
     */
    if (hb->llc_ops->get_uuid_by_name(hb, myid, &uuid) == HA_FAIL) {
        cl_log(LOG_ERR, "Cannot get mynodeid");
        cl_log(LOG_ERR, "REASON: %s", hb->llc_ops->errmsg(hb));
        return HA_FAIL;
    }
(snip)
int
walk_nodetable(void)
{
(snip)
#ifdef HAVE_NEW_HB_API
    /* the get_uuid_by_name is not available for STABLE_1_2 branch. */
    if (hb->llc_ops->get_uuid_by_name(hb, name, &uuid) == HA_FAIL) {
        cl_log(LOG_DEBUG, "Cannot get the uuid for node: %s", name);
    }
#endif /* HAVE_NEW_HB_API */
(snip)

Best Regards,
Hideo Yamauchi.

--- On Thu, 2011/7/28, renayama19661014 [at] ybb <renayama19661014 [at] ybb> wrote:

> Hi Yan,
>
> Thank you for your comment.
>
> > > Hi Lars,
> > > Hi All,
> > >
> > > The cause of the delay became clear.
> > >
> > > This problem depends on timing.
> > >
> > > When hbagent receives an F_STATUS message while it waits for the reply to an API call,
> > Under this circumstance, is there a specific heartbeat op that hbagent
> > is waiting for?
>
> Yes.
>
> However, what hbagent queues is an F_STATUS message from a very early stage.
> I will pinpoint which hb_api call of hbagent it is.
>
> When I made the following modification, the queueing showed up in the log.
>
> (snip)
> /*
>  * Read an API message. All other messages are enqueued to be read later.
>  */
> static struct ha_msg *
> read_api_msg(llc_private_t* pi)
> {
>     for (;;) {
>         struct ha_msg* msg;
>         const char * type;
>
>         pi->chan->ops->waitin(pi->chan);
>         if (pi->chan->ch_status == IPC_DISCONNECT){
>             break;
>         }
>         if ((msg=msgfromIPC(pi->chan, 0)) == NULL) {
>             ha_api_perror("read_api_msg: "
>                 "Cannot read reply from IPC channel");
>             continue;
>         }
>         if ((type=ha_msg_value(msg, F_TYPE)) != NULL
>             && strcmp(type, T_APIRESP) == 0) {
>             return(msg);
>         }
>         /* Got an unexpected non-api message */
>         /* Queue it up for reading later */
>         /* yamauchi */
>         if (strcasecmp(ha_msg_value(msg, F_TYPE),T_STATUS) == 0) {
>             cl_log(LOG_INFO, "##### yamuchi enqure_msg ()#####");
>             cl_log_message(LOG_INFO, msg);
>         }
>         enqueue_msg(pi, msg);
>     }
>     /*NOTREACHED*/
>     return(NULL);
> }
>
> (snip)
> Jul 27 19:13:50 srv01 ccm: [5432]: info: ##### yamuchi enqure_msg ()#####
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG: Dumping message with 12 fields
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[0] : [t=status]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[1] : [st=active]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[2] : [dt=6590]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[3] : [protocol=1]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[4] : [src=srv02]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[5] : [(1)srcuuid=0xa006540(36 27)]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[6] : [seq=6]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: ##### yamuchi enqure_msg ()#####
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: ##### yamuchi enqure_msg ()#####
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[7] : [hg=4ddb3648]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG: Dumping message with 12 fields
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG: Dumping message with 12 fields
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[8] : [ts=4e2fe4dd]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[0] : [t=status]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[0] : [t=status]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[9] : [ld=0.04 0.12 0.15 3/89 5394]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[1] : [st=active]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[1] : [st=active]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[10] : [ttl=3]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[2] : [dt=6590]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[2] : [dt=6590]
> Jul 27 19:13:50 srv01 ccm: [5432]: info: MSG[11] : [auth=1 69619762aa14655cdccd9778ec4c4861a15a0f19]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[3] : [protocol=1]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[3] : [protocol=1]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[4] : [src=srv02]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[4] : [src=srv02]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[5] : [(1)srcuuid=0x84255e0(36 27)]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[5] : [(1)srcuuid=0x83b7bf8(36 27)]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[6] : [seq=6]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[6] : [seq=6]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[7] : [hg=4ddb3648]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[7] : [hg=4ddb3648]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[8] : [ts=4e2fe4dd]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[8] : [ts=4e2fe4dd]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[9] : [ld=0.04 0.12 0.15 3/89 5394]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[9] : [ld=0.04 0.12 0.15 3/89 5394]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[10] : [ttl=3]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[10] : [ttl=3]
> Jul 27 19:13:50 srv01 lha-snmpagent: [5438]: info: MSG[11] : [auth=1 69619762aa14655cdccd9778ec4c4861a15a0f19]
> Jul 27 19:13:50 srv01 stonithd: [5435]: info: MSG[11] : [auth=1 69619762aa14655cdccd9778ec4c4861a15a0f19]
> (snip)
> Jul 27 19:13:52 srv01 cib: [5433]: info: ##### yamuchi enqure_msg ()#####
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG: Dumping message with 12 fields
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[0] : [t=status]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[1] : [st=active]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[2] : [dt=6590]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[3] : [protocol=1]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[4] : [src=srv02]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[5] : [(1)srcuuid=0x8fc9060(36 27)]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[6] : [seq=6]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[7] : [hg=4ddb3648]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[8] : [ts=4e2fe4dd]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[9] : [ld=0.04 0.12 0.15 3/89 5394]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[10] : [ttl=3]
> Jul 27 19:13:52 srv01 cib: [5433]: info: MSG[11] : [auth=1 69619762aa14655cdccd9778ec4c4861a15a0f19]
> (snip)
>
> > > the F_STATUS message is queued.
> > >
> > > The queued message is handled only when hbagent catches an event from Heartbeat.
> > > Therefore, it is handled only when an event occurs, such as one of the interconnects going down.
> > >
> > > As a result, the active trap of the node is sent only when an interconnect goes down.
> > >
> > > /*
> > >  * Read an API message. All other messages are enqueued to be read later.
> > >  */
> > > static struct ha_msg *
> > > read_api_msg(llc_private_t* pi)
> > > {
> > >
> > >     for (;;) {
> > >         struct ha_msg* msg;
> > >         const char * type;
> > >
> > >         pi->chan->ops->waitin(pi->chan);
> > >         if (pi->chan->ch_status == IPC_DISCONNECT){
> > >             break;
> > >         }
> > >         if ((msg=msgfromIPC(pi->chan, 0)) == NULL) {
> > >             ha_api_perror("read_api_msg: "
> > >                 "Cannot read reply from IPC channel");
> > >             continue;
> > >         }
> > >         if ((type=ha_msg_value(msg, F_TYPE)) != NULL
> > >             && strcmp(type, T_APIRESP) == 0) {
> > >             return(msg);
> > >         }
> > >         /* Got an unexpected non-api message */
> > >         /* Queue it up for reading later */
> > >         enqueue_msg(pi, msg);
> > >     }
> > >     /*NOTREACHED*/
> > >     return(NULL);
> > > }
> > >
> > >
> > > I think that the following correction is necessary.
> > >
> > > snmp_subagent/hbagent.c
> > > (snip)
> > >         } else {
> > >
> > >             /* snmp request */
> > >             snmp_read(&fdset);
> > >
> > >             ret = handle_heartbeat_msg(); ----> read the queued messages!!
> > >         }
> > > (snip)
> > I'm still confused about invoking handle_heartbeat_msg() when select()
> > finds that the SNMP socket has input. Is that the appropriate time to do it?
>
> Sorry....
>
> This correction is only one example.
> Because I do not know much about how hbagent handles this, I would appreciate your guidance on the correct fix.
>
> Best Regards,
> Hideo Yamauchi.
>
> >
> > Regards,
> > Yan
> > --
> > Gao,Yan <ygao [at] suse>
> > Software Engineer
> > China Server Team, SUSE.
_______________________________________________
Linux-HA mailing list
Linux-HA [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
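
The queueing behaviour discussed in this thread can be illustrated with a small toy model. This is only a sketch under assumptions, not Heartbeat code: struct toy_msg, struct toy_client and all toy_* functions are hypothetical names, and an in-memory array stands in for the IPC channel. It shows why a status message that toy_read_api_msg() sets aside is invisible to a select()-style readiness check on the channel and stays unhandled until something drains the queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_CAP 16

/* Hypothetical stand-in for a Heartbeat message: only the type matters here. */
struct toy_msg {
    const char *type;
};

/* Hypothetical client state: unread "channel" data plus the side queue. */
struct toy_client {
    const struct toy_msg *wire;               /* messages on the toy channel */
    size_t wire_len;
    size_t wire_pos;                          /* next unread message */
    const struct toy_msg *queued[QUEUE_CAP];  /* set aside while awaiting a reply */
    size_t queued_len;
};

/* Same shape as read_api_msg(): return the first "apiresp" message and
 * queue every other message for later. */
static const struct toy_msg *toy_read_api_msg(struct toy_client *c)
{
    while (c->wire_pos < c->wire_len) {
        const struct toy_msg *m = &c->wire[c->wire_pos++];
        if (strcmp(m->type, "apiresp") == 0) {
            return m;
        }
        assert(c->queued_len < QUEUE_CAP);
        c->queued[c->queued_len++] = m;
    }
    return NULL; /* channel drained without a reply */
}

/* What a select()-style check can see: unread data on the channel only. */
static int toy_fd_readable(const struct toy_client *c)
{
    return c->wire_pos < c->wire_len;
}

/* What the caller would also have to check before going back to sleep. */
static int toy_msg_pending(const struct toy_client *c)
{
    return c->queued_len > 0 || toy_fd_readable(c);
}

/* Pass every queued message to handle (may be NULL) in arrival order,
 * empty the queue, and return how many messages were queued. */
static size_t toy_drain_queue(struct toy_client *c,
                              void (*handle)(const struct toy_msg *))
{
    size_t n = c->queued_len;
    for (size_t i = 0; i < n; i++) {
        if (handle != NULL) {
            handle(c->queued[i]);
        }
    }
    c->queued_len = 0;
    return n;
}
```

With a toy channel that holds a status message followed by an API response, toy_read_api_msg() returns the response, toy_fd_readable() then reports 0, and toy_msg_pending() keeps reporting 1 until toy_drain_queue() runs. Checking for queued messages after each API call is one possible way to avoid the delay; whether that is the right place to do it in hbagent is the open question in this thread.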