Mailing List Archive: Linux: Kernel

Zombie process when ptracing

npiggin at suse

Nov 19, 2009, 2:25 AM

Post #1 of 4
Zombie process when ptracing

Hi,

Running recent git kernel, I have a process stuck in Z state

bash ? 0000000000000000 0 3188 3187 0x00000000
ffff88012e24fec8 0000000000000046 0000000000000000 0000000000000012
ffff88012e24fec8 ffff88012e24e000 ffff88012e24ffd8 ffff88012e24e000
000000000000efc8 ffff88012e24e000 ffff88012ea82090 ffff88012ff78640
Call Trace:
[<ffffffff8124baee>] ? proc_clear_tty+0x5e/0x70
[<ffffffff810587a8>] ? exit_ptrace+0xb8/0x140
[<ffffffff8105126a>] do_exit+0x58a/0x7c0
[<ffffffff810514dd>] do_group_exit+0x3d/0xb0
[<ffffffff81051562>] sys_exit_group+0x12/0x20
[<ffffffff8100b3eb>] system_call_fastpath+0x16/0x1b

This was after stracing a few test programs.

It also seems to have lost job control (^C) at the same time.

Hmm, and the kernel just panicked with an NMI lockup while I was
trying to get more info.

Call Trace:
<IRQ>
[<ffffffff811e5aa3>] __const_udelay+0x43/0x50
[<ffffffff810261bc>] arch_trigger_all_cpu_backtrace+0x4c/0x70
[<ffffffff81264f79>] sysrq_handle_showallcpus+0x9/0x10
[<ffffffff81264d10>] __handle_sysrq+0x120/0x180
[<ffffffff81264de6>] handle_sysrq+0x26/0x30
[<ffffffff81275af0>] serial8250_handle_port+0x210/0x2f0
[<ffffffff81275c58>] serial8250_interrupt+0x88/0x120
[<ffffffff810872e7>] handle_IRQ_event+0xa7/0x1e0
[<ffffffff810891cc>] handle_edge_irq+0xbc/0x150
[<ffffffff8100e2df>] handle_irq+0x1f/0x30
[<ffffffff8100d86a>] do_IRQ+0x6a/0xe0
[<ffffffff8100bc93>] ret_from_intr+0x0/0xa
<EOI>
[<ffffffff81013842>] ? default_idle+0xa2/0xc0
[<ffffffff8106f8d1>] ? __atomic_notifier_call_chain+0x31/0x60
[<ffffffff81013b1a>] ? c1e_idle+0x3a/0x100
[<ffffffff8106f911>] ? atomic_notifier_call_chain+0x11/0x20
[<ffffffff8100a66b>] ? cpu_idle+0x6b/0xc0
[<ffffffff81439dc6>] ? start_secondary+0x17c/0x1d6

I'll update this space if I can repeat it again. Any ideas?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


oleg at redhat

Nov 19, 2009, 5:29 PM

Post #2 of 4
Re: Zombie process when ptracing

Hi,

On 11/19, Nick Piggin wrote:
>
> Running recent git kernel, I have a process stuck in Z state
>
> bash ? 0000000000000000 0 3188 3187 0x00000000
> ffff88012e24fec8 0000000000000046 0000000000000000 0000000000000012
> ffff88012e24fec8 ffff88012e24e000 ffff88012e24ffd8 ffff88012e24e000
> 000000000000efc8 ffff88012e24e000 ffff88012ea82090 ffff88012ff78640
> Call Trace:
> [<ffffffff8124baee>] ? proc_clear_tty+0x5e/0x70
> [<ffffffff810587a8>] ? exit_ptrace+0xb8/0x140
> [<ffffffff8105126a>] do_exit+0x58a/0x7c0
> [<ffffffff810514dd>] do_group_exit+0x3d/0xb0
> [<ffffffff81051562>] sys_exit_group+0x12/0x20
> [<ffffffff8100b3eb>] system_call_fastpath+0x16/0x1b
>
> This was after stracing a few test programs.
>
> It also seems to have lost job control (^C) at the same time.

This can happen if the tracer (strace) itself hangs; the zombies
should go away once the tracer is killed. It can also happen if
the zombie's ->real_parent is stopped or hangs...

(I assume you didn't strace /sbin/init)
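
As a rough illustration of why a zombie lingers while whoever must reap it
is stuck, here is a minimal Python sketch. It assumes a Linux /proc and uses
a plain fork()ed child rather than ptrace, so it only shows the general
mechanism, not the strace case from this report. The helper names
proc_state() and zombie_until_reaped() are made up for this example.

```python
import os
import time


def proc_state(pid):
    """Return the State field of /proc/<pid>/status, e.g. 'Z (zombie)'."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("State:"):
                return line.split(":", 1)[1].strip()
    return None


def zombie_until_reaped():
    """Fork a child that exits at once, observe it as a zombie, then reap it.

    Returns (state_before_wait, proc_entry_exists_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup handlers

    # Parent: deliberately do not wait yet. Poll until the exited child
    # shows up in state Z, which it keeps until someone collects it.
    state = None
    for _ in range(200):
        state = proc_state(pid)
        if state and state.startswith("Z"):
            break
        time.sleep(0.01)

    os.waitpid(pid, 0)  # reap the child; the zombie goes away here
    return state, os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    before, still_there = zombie_until_reaped()
    print("state before wait:", before)
    print("/proc entry after wait:", still_there)
```

If the parent never reached the waitpid() call (because it was hung or
stopped), the child would simply stay in Z state, which is the situation
described above.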

But,

> Hmm, and the kernel just panicked with an NMI lockup while I was
> trying to get more info.

this probably means we have a kernel bug ;)

If you see a zombie again, could you look at its /proc/pid/status?


And of course, which programs did you trace and how? It would be
great if we can reproduce the problem.

Oleg.



npiggin at suse

Nov 23, 2009, 12:36 AM

Post #3 of 4
Re: Zombie process when ptracing

On Fri, Nov 20, 2009 at 02:29:30AM +0100, Oleg Nesterov wrote:
> Hi,
>
> On 11/19, Nick Piggin wrote:
> >
> > Running recent git kernel, I have a process stuck in Z state
> >
> > bash ? 0000000000000000 0 3188 3187 0x00000000
> > ffff88012e24fec8 0000000000000046 0000000000000000 0000000000000012
> > ffff88012e24fec8 ffff88012e24e000 ffff88012e24ffd8 ffff88012e24e000
> > 000000000000efc8 ffff88012e24e000 ffff88012ea82090 ffff88012ff78640
> > Call Trace:
> > [<ffffffff8124baee>] ? proc_clear_tty+0x5e/0x70
> > [<ffffffff810587a8>] ? exit_ptrace+0xb8/0x140
> > [<ffffffff8105126a>] do_exit+0x58a/0x7c0
> > [<ffffffff810514dd>] do_group_exit+0x3d/0xb0
> > [<ffffffff81051562>] sys_exit_group+0x12/0x20
> > [<ffffffff8100b3eb>] system_call_fastpath+0x16/0x1b
> >
> > This was after stracing a few test programs.
> >
> > It also seems to have lost job control (^C) at the same time.
>
> This can happen if the tracer (strace) itself hangs, zombies
> should go away once the tracer is killed. Or its ->real_parent
> is stopped or hangs...
>
> (I assume you didn't strace /sbin/init)

No, I straced something else, and all straces seemed to be
killed but bash remained. I was running a script that in
turn launched another process, so I ran it via
strace -ff bash ./script.sh


> But,
>
> > Hmm, and the kernel just panicked with an NMI lockup while I was
> > trying to get more info.
>
> this probably means we have a kernel bug ;)

Hmm, sorry, that looks like it _may_ have been an unrelated issue
(with the ssh connection).


> If you see a zombie again, could you look at its /proc/pid/status?

OK, any other hints if I see it again?


> And of course, which programs did you trace and how? It would be
> great if we can reproduce the problem.

At this stage I have not reproduced it, and I can't share the program
which was being straced. If it does happen again and I cannot distil
a simple test case, I will ask permission to distribute it.

Thanks,
Nick


oleg at redhat

Nov 23, 2009, 7:16 AM

Post #4 of 4
Re: Zombie process when ptracing

On 11/23, Nick Piggin wrote:
>
> On Fri, Nov 20, 2009 at 02:29:30AM +0100, Oleg Nesterov wrote:
> > Hi,
> >
> > On 11/19, Nick Piggin wrote:
> > >
> > > Running recent git kernel, I have a process stuck in Z state
> > >
> > > bash ? 0000000000000000 0 3188 3187 0x00000000
> > > ffff88012e24fec8 0000000000000046 0000000000000000 0000000000000012
> > > ffff88012e24fec8 ffff88012e24e000 ffff88012e24ffd8 ffff88012e24e000
> > > 000000000000efc8 ffff88012e24e000 ffff88012ea82090 ffff88012ff78640
> > > Call Trace:
> > > [<ffffffff8124baee>] ? proc_clear_tty+0x5e/0x70
> > > [<ffffffff810587a8>] ? exit_ptrace+0xb8/0x140
> > > [<ffffffff8105126a>] do_exit+0x58a/0x7c0
> > > [<ffffffff810514dd>] do_group_exit+0x3d/0xb0
> > > [<ffffffff81051562>] sys_exit_group+0x12/0x20
> > > [<ffffffff8100b3eb>] system_call_fastpath+0x16/0x1b
> > >
> > > This was after stracing a few test programs.
> > >
> > > It also seems to have lost job control (^C) at the same time.
> >
> > This can happen if the tracer (strace) itself hangs, zombies
> > should go away once the tracer is killed. Or its ->real_parent
> > is stopped or hangs...
> >
> > (I assume you didn't strace /sbin/init)
>
> No, I straced something else, and all straces seemed to be
> killed but bash remained. I was running a script that in
> turn launched another process, so I ran it via
> strace -ff bash ./script.sh

OK, thanks.

Hmm. Just noticed the state above == '?'. Looks like sched_show_task()
is buggy, it should check ->exit_state for "ZX" from TASK_STATE_TO_CHAR_STR.
But this is off-topic.

> > If you see a zombie again, could you look at its /proc/pid/status?
>
> OK, any other hints if I see it again?

Well, also the contents of /proc/PPid/status and /proc/TracerPid/status
may help. And sysrq-t output. Otherwise, currently I have no idea where
to start.
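
One possible way to collect the /proc information mentioned above in a
single step is sketched below in Python. It assumes a Linux /proc where
/proc/<pid>/status has State, PPid and TracerPid fields; the function names
read_status() and describe() are invented for this example and are not part
of any existing tool.

```python
import os


def read_status(pid):
    """Parse /proc/<pid>/status into a dict of field name -> value string."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def describe(pid):
    """Report State, PPid and TracerPid of pid, plus the State of the
    parent and of the tracer when those processes exist."""
    st = read_status(pid)
    report = {
        "pid": pid,
        "State": st.get("State"),
        "PPid": st.get("PPid"),
        "TracerPid": st.get("TracerPid"),
    }
    for role in ("PPid", "TracerPid"):
        other = int(st.get(role) or 0)
        if other > 0:  # 0 means "no tracer" (or no visible parent)
            try:
                report[role + "_State"] = read_status(other).get("State")
            except OSError:
                report[role + "_State"] = None  # process already gone
    return report


if __name__ == "__main__":
    # Inspect this process itself; pass the zombie's pid instead in practice.
    for key, value in describe(os.getpid()).items():
        print(f"{key}: {value}")
```

For a stuck zombie, a parent or tracer whose own State is T (stopped) or D
(uninterruptible sleep) would be the first thing to look at.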

Oleg.

