Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: RSyslog: users

rsyslog 5.8.13 omgssapi shutdown segfault

 

 

RSyslog users RSS feed   Index | Next | Previous | View Threaded


bodik at civ

Jul 12, 2012, 2:47 AM

Post #1 of 4 (263 views)
Permalink
rsyslog 5.8.13 omgssapi shutdown segfault

Hello,

it seems that during rsyslog shutdown, my main msg queue is terminated before
omgssapi.c/gss-misc.c which want's to LogError into main queue during. omgssapi
action uses DA queue, and this happens only if there are some undelivered
messages in the time of shutdown and remote server cannot be reached (tcpkill).

it's not reproducible by omfwd since it does not use any LogError() (nor does
omrelp) ... I could try to find another codeflow, which triggers this bug...

does anyone have a clue where to look for ?

a) is it better to chech some "terminate_imediatlety flag" in plugin:doTryResume ?
(but imho it is not done by any other module) and return with SUSPENDED or
something like that ?

or

b) check a null pointer in qqueueEnqObj

?

thanks for any reply
bodik

------------
Program terminated with signal 11, Segmentation fault.
#0 0x000000000044a66d in qqueueEnqObj (pThis=0x0,
flowCtlType=eFLOWCTL_NO_DELAY, pUsr=0x7f208c006200) at queue.c:2442
2442 ISOBJ_TYPE_assert(pThis, qqueue);
> back
#0 0x000000000044a66d in qqueueEnqObj (pThis=0x0,
flowCtlType=eFLOWCTL_NO_DELAY, pUsr=0x7f208c006200) at queue.c:2442
#1 0x000000000040ca85 in submitMsg (pMsg=0x7f208c006200) at syslogd.c:731
#2 0x000000000040c605 in logmsgInternal (iErr=-1, pri=43, msg=0x7f209165bf80
"GSS-API error reading token length", flags=1) at syslogd.c:583
#3 0x000000000040c424 in submitErrMsg (iErr=-1, msg=0x7f209165bf80 "GSS-API
error reading token length") at syslogd.c:527
#4 0x000000000045c934 in LogError (iErrno=0, iErrCode=-1, fmt=0x7f20920648b8
"GSS-API error reading token length") at errmsg.c:102
#5 0x00007f20920640c3 in recv_token (s=4, tok=0x7f209165c900) at gss-misc.c:180
#6 0x00007f20933b487b in TCPSendGSSInit (pvData=0x1f90110) at omgssapi.c:369
#7 0x00007f20933b49d2 in doTryResume (pData=0x1f90110) at omgssapi.c:413
#8 0x00007f20933b4a58 in tryResume (pData=0x1f90110) at omgssapi.c:430
#9 0x0000000000451a17 in actionDoRetry (pThis=0x1f901d0, ttNow=1342078921,
pbShutdownImmediate=0x1f90330) at ../action.c:574
#10 0x0000000000451c09 in actionTryResume (pThis=0x1f901d0,
pbShutdownImmediate=0x1f90330) at ../action.c:638
#11 0x0000000000451cdd in actionPrepare (pThis=0x1f901d0,
pbShutdownImmediate=0x1f90330) at ../action.c:660
#12 0x00000000004524c9 in actionProcessMessage (pThis=0x1f901d0, pMsg=0x1fb66a0,
actParams=0x1f907e0, pbShutdownImmediate=0x1f90330) at ../action.c:877
#13 0x000000000045281a in tryDoAction (pAction=0x1f901d0, pBatch=0x1f8ff60,
pnElem=0x7f209165cb5c) at ../action.c:974
...
...
...


4711.009991685:7f9320be1700: regular consumer finished, iret=4, szlog 0 sz phys 0
4711.010000838:7f9320be1700: wti 0x10b8cd0: terminating worker terminateRet=5,
bInactivityTOOccured=0
4711.010013810:7f9320be1700: main Q:Reg: Worker thread 10b8cd0, terminated, num
workers now 0
4711.010044308:7f9323f9e700: main Q: regular queue workers shut down.
4711.010069488:7f9323f9e700: main Q: checking to see if we need to cancel any
worker threads of the primary queue
4711.010088162:7f9323f9e700: main Q: worker threads terminated, remaining queue
size log 0, phys 0.
4711.010114613:7f9323f9e700: main Q: queue (type 0) will lose 0 messages,
destroying...
4711.010130168:7f9323f9e700: Terminating outputs...
4711.010140849:7f9323f9e700: destructing ruleset 0x10b8aa0, name 0x10b7d10
4711.010151266:7f9323f9e700: action 1 queue: initiating worker thread shutdown
sequence
4711.010163114:7f9323f9e700: action 1 queue: setting EnqOnly mode for DA worker
4711.010175122:7f9323f9e700: action 1 queue:DAwpool: high activity - starting 1
additional worker thread(s).
4711.010227003:7f9323f9e700: action 1 queue:DAwpool: started with state 0, num
workers now 1
4711.010237927:7f9323f9e700: action 1 queue: awoke DA worker, told it to shut down.
4711.010249825:7f9323f9e700: action 1 queue[DA]:Reg: high activity - starting 1
additional worker thread(s).
4711.010279824:7f9323f9e700: action 1 queue[DA]:Reg: started with state 0, num
workers now 1
4711.010289965:7f9323f9e700: action 1 queue: awoke DA queue regular worker, told
it to shut down when done.
4711.010309359:7f9323f9e700: action 1 queue: trying shutdown of regular workers
4711.010327980:7f9320be1700: we deleted 0 objects and enqueued 0 objects
4711.010342427:7f9320be1700: delete batch from store, new sizes: log 16, phys 17
4711.010354076:7f9320be1700: wti 0x10c3cf0: terminating worker because of
TERMINATE_NOW mode, del iRet 0
4711.010370187:7f9320be1700: action 1 queue:DAwpool: Worker thread 10c3cf0,
terminated, num workers now 0
4711.010430302:7f931efd0700: we deleted 0 objects and enqueued 0 objects
4711.010442252:7f931efd0700: delete batch from store, new sizes: log 0, phys 0
4711.010452419:7f931efd0700: regular consumer finished, iret=4, szlog 0 sz phys 0
4711.010469142:7f931efd0700: wti 0x10c7c40: terminating worker terminateRet=5,
bInactivityTOOccured=0
4711.010481825:7f931efd0700: action 1 queue[DA]:Reg: Worker thread 10c7c40,
terminated, num workers now 0
4711.010495003:7f9323f9e700: sent SIGTTIN to worker thread 0x203e0700
4711.010517655:7f93203e0700: nympha18-2.zcu.cz
4711.010541926:7f93203e0700: GSS-API service name: host [at] server
4711.010561175:7f93203e0700: GSS-API requested context flags:
4711.010576324:7f93203e0700: GSS_C_MUTUAL_FLAG
4711.010592603:7f93203e0700: GSS_C_CONF_FLAG
4711.010618011:7f9323f9e700: action 1 queue:Reg: waiting 0ms on worker thread
termination, 1 still running
4711.010654407:7f9323f9e700: action 1 queue:Reg: timeout waiting on worker
thread termination
4711.010669768:7f9323f9e700: sent SIGTTIN to worker thread 0x203e0700
4711.010680122:7f9323f9e700: action 1 queue: regular shutdown timed out on
primary queue (this is OK)
4711.010691574:7f9323f9e700: action 1 queue: we have a DA queue (0x10c4170),
requesting its shutdown.
4711.010702844:7f9323f9e700: action 1 queue: trying shutdown of regular worker
of DA queue
4711.010714846:7f9323f9e700: action 1 queue: DA queue worker shut down.
4711.010726277:7f9323f9e700: action 1 queue: setting EnqOnly mode
4711.010738423:7f9323f9e700: action 1 queue: trying immediate shutdown of
regular workers (if any)
4711.010756246:7f9323f9e700: sent SIGTTIN to worker thread 0x203e0700
4711.010766924:7f9323f9e700: action 1 queue:Reg: waiting 1000ms on worker thread
termination, 1 still running
4711.012986201:7f93203e0700: caller requested object 'nsd_ptcp', not found (iRet
-3003)
4711.013001748:7f93203e0700: Requested to load module 'lmnsd_ptcp'
4711.013014537:7f93203e0700: loading module '/usr/lib/rsyslog/lmnsd_ptcp.so'
4711.013148960:7f93203e0700: source file nsd_ptcp.c requested reference for
module 'lmnetstrms', reference count now 3
4711.013174112:7f93203e0700: module of type 2 being loaded.
4711.013183965:7f93203e0700: entry point 'isCompatibleWithFeature' not present
in module
4711.013194963:7f93203e0700: source file netstrms.c requested reference for
module 'lmnsd_ptcp', reference count now 1
4711.013959065:7f93203e0700: GSS-API Sending init_sec_context token (length: 615)
4711.014002308:7f93203e0700: GSS-API Continue needed...
4711.014402331:7f93203e0700: Called LogError, msg: GSS-API error reading token
length



_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards


bodik at civ

Jul 12, 2012, 6:23 AM

Post #2 of 4 (222 views)
Permalink
Re: rsyslog 5.8.13 omgssapi shutdown segfault [In reply to]

On 07/12/12 11:47, bodik wrote:
> Hello,
>
> it seems that during rsyslog shutdown, my main msg queue is terminated before
> omgssapi.c/gss-misc.c which want's to LogError into main queue during. omgssapi
> action uses DA queue, and this happens only if there are some undelivered
> messages in the time of shutdown and remote server cannot be reached (tcpkill).
>
> does anyone have a clue where to look for ?
>
> a) is it better to chech some "terminate_imediatlety flag" in plugin:doTryResume ?
> (but imho it is not done by any other module) and return with SUSPENDED or
> something like that ?

that doesn't help, i tried if(glbl.GetGlobalInputTermState()) but that leads to
other troubles ...

> or
>
> b) check a null pointer in qqueueEnqObj

this works

diff -rua a/tools/syslogd.c b/tools/syslogd.c
--- a/tools/syslogd.c 2012-07-12 15:04:26.080620075 +0200
+++ b/tools/syslogd.c 2012-07-12 15:03:41.772872922 +0200
@@ -727,8 +727,12 @@
pRuleset = MsgGetRuleset(pMsg);

pQueue = (pRuleset == NULL) ? pMsgQueue : ruleset.GetRulesetQueue(pRuleset);
- MsgPrepareEnqueue(pMsg);
- qqueueEnqObj(pQueue, pMsg->flowCtlType, (void*) pMsg);
+ if( pQueue != NULL ) {
+ MsgPrepareEnqueue(pMsg);
+ qqueueEnqObj(pQueue, pMsg->flowCtlType, (void*) pMsg);
+ } else {
+ DBGPRINTF("WARN: submitMsg pQueue=NULL\n");
+ }

RETiRet;
}


but this is going to lost some messages or other side effects and at least it is
not logged without debug ;(. i dunno how to do it in the right way ...

thanks
bodik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards


rgerhards at hq

Aug 1, 2012, 1:15 AM

Post #3 of 4 (189 views)
Permalink
Re: rsyslog 5.8.13 omgssapi shutdown segfault [In reply to]

> -----Original Message-----
> From: rsyslog-bounces [at] lists [mailto:rsyslog-
> bounces [at] lists] On Behalf Of bodik
> Sent: Thursday, July 12, 2012 3:24 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] rsyslog 5.8.13 omgssapi shutdown segfault
>
> On 07/12/12 11:47, bodik wrote:
> > Hello,
> >
> > it seems that during rsyslog shutdown, my main msg queue is terminated
> > before omgssapi.c/gss-misc.c which want's to LogError into main queue
> > during. omgssapi action uses DA queue, and this happens only if there
> > are some undelivered messages in the time of shutdown and remote
> server cannot be reached (tcpkill).
> >
> > does anyone have a clue where to look for ?
> >
> > a) is it better to chech some "terminate_imediatlety flag" in
> plugin:doTryResume ?
> > (but imho it is not done by any other module) and return with
> SUSPENDED or
> > something like that ?
>
> that doesn't help, i tried if(glbl.GetGlobalInputTermState()) but that leads to
> other troubles ...
>
> > or
> >
> > b) check a null pointer in qqueueEnqObj
>
> this works
>
> diff -rua a/tools/syslogd.c b/tools/syslogd.c
> --- a/tools/syslogd.c 2012-07-12 15:04:26.080620075 +0200
> +++ b/tools/syslogd.c 2012-07-12 15:03:41.772872922 +0200
> @@ -727,8 +727,12 @@
> pRuleset = MsgGetRuleset(pMsg);
>
> pQueue = (pRuleset == NULL) ? pMsgQueue :
> ruleset.GetRulesetQueue(pRuleset);
> - MsgPrepareEnqueue(pMsg);
> - qqueueEnqObj(pQueue, pMsg->flowCtlType, (void*) pMsg);
> + if( pQueue != NULL ) {
> + MsgPrepareEnqueue(pMsg);
> + qqueueEnqObj(pQueue, pMsg->flowCtlType, (void*) pMsg);
> + } else {
> + DBGPRINTF("WARN: submitMsg pQueue=NULL\n");
> + }
>
> RETiRet;
> }
>
>
> but this is going to lost some messages or other side effects and at least it is
> not logged without debug ;(. i dunno how to do it in the right way ...

I think the core idea is right. After all, we need to do some terminating steps and at some point messages may actually be lost. For inputs, you won't see that, because they are terminated. But e.g. for UDP messages that arrive after the input is shut down are lost as well.

I have crafted a somewhat different patch that has a broader reach - at least in theory, there could be some other code paths be affected as well. This will become available as part of 5.8.13 and all higher level releases released after 5.8.13.

Thanks for spotting this issue,
Rainer
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards


bodik at civ

Aug 8, 2012, 7:09 AM

Post #4 of 4 (179 views)
Permalink
Re: rsyslog 5.8.13 omgssapi shutdown segfault [In reply to]

> I have crafted a somewhat different patch that has a broader reach - at least in theory, there could be some other code paths be affected as well. This will become available as part of 5.8.13 and all higher level releases released after 5.8.13.


after extensive testing it works fine, thank you very much

bodik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards

RSyslog users RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.