Mailing List Archive: ivtv: devel

Re: Problems with Hauppauge HVR 1600 and cx18 driver

 

 



awalls at radix

Mar 28, 2009, 8:27 PM

Post #1 of 13 (5286 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver

On Mon, 2009-03-23 at 06:52 -0700, Corey Taylor wrote:
> > Andy,
>
> > I am noticing an improvement in pixelation by setting the bufsize to
> > 64k. I will monitor over the next week and report back. I am running 3
> > HVR-1600s and the IRQs are coming up shared with the USB which also
> > supports my HD PVR capture device. Monday nights are usually one of
> > the busier nights for recording so I will know how well this holds up.
>
> > Thanks for the tip!
>
> > Brandon
>
> Hi Andy and Brandon, I too tried various different bufsizes as suggested and I still see very noticeable pixelation/tearing regardless of the setting.
>
> I even upgraded my motherboard this past weekend to an Asus AM2+ board with
> Phenom II X3 CPU. Still the same problems with the card in a brand new
> setup.
>
> I also tried modifying the cx18 source code as Andy suggested and that
> made more debug warnings show up in my syslog, but still did not
> resolve the issue. Haven't tried this yet with the new motherboard
> though.
>
> Is it possible that this card is more sensitive to hiccups in the
> signal coming from the cable line? Or interference from other close-by
> cables and electronic equipment?
>
> When recording/watching Live TV through MythTV, I see that ffmpeg is
> constantly outputting various errors related to the video stream. I
> can post those here if you think it's relevant.
>
> Should I just return this card and get one with a different chipset? Or
> do you think driver updates can solve the issue?
>
> I'm happy to hold on to this card if it means I can contribute in some
> way to fixing the problem, if it's fixable : )

Corey and Brandon,

I found a race condition between the cx18 driver and the CX23418 firmware.
I have a patch that mitigates the problem here:

http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c

I think the final form of the patch could be better. However, this
patch essentially eliminated any artifacts I was getting playing back
digital TV. I also had positive results running mplayer without the
"-cache" command line option for both digital and analog captures.

I haven't tested on a single processor machine, nor in a multicard
setup, but things looked good enough that I thought it ready for test by
others.

Let me know if it helps or not.

Regards,
Andy


_______________________________________________
ivtv-devel mailing list
ivtv-devel [at] ivtvdriver
http://ivtvdriver.org/mailman/listinfo/ivtv-devel


xyzzy at speakeasy

Mar 29, 2009, 1:24 AM

Post #2 of 13 (5107 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sat, 28 Mar 2009, Andy Walls wrote:
> On Mon, 2009-03-23 at 06:52 -0700, Corey Taylor wrote:
> I found a race condition between the cx18 driver and the CX23418 firmware.
> I have a patch that mitigates the problem here:
>
> http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c

> [. We have to do this polling wait because there is a race with the
> firmware. Once we give it the SW1 interrupt above, it can wake up our
> waitq with an ack interrupt via the irq handler after we're ready to
> wait, but before we actually get put to sleep by schedule(). Losing
> that race causes us to wait the entire timeout, waiting for a wakeup
> that's never going to come. ]

A race like this should be avoidable. The way it works is you do something
like this:

/* 1 */ set_current_state(TASK_INTERRUPTIBLE);
/* 2 */ cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
/* 3 */ schedule();
/* 4 */ ack_has_now_been_received();

The race you are talking about is when the ack arrives between line 2 and
3. If this happens here, the process' current state is changed to
TASK_RUNNING when the irq handler that receives the ack tries to wake our
process. If schedule() is called with the state set to TASK_RUNNING then
the process doesn't sleep. And thus there is no race. The key is that
preparing to sleep at line 1 happens before we start the event we want to
wait for at line 2.
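
To make the ordering argument concrete, here is a toy single-threaded model in plain C. It is not kernel code: the model_* names and the two-state enum are invented for illustration, and the "ack" is just a function call placed where the interrupt would land. It only shows why marking the task as sleeping before starting the event keeps an early wake-up from being lost.

```c
#include <stdbool.h>

/* Toy model, not kernel code: one task with a run state. */
enum model_state { MODEL_RUNNING, MODEL_UNINTERRUPTIBLE };

struct model_task {
    enum model_state state;
};

/* Stands in for set_current_state(): mark the task as about to sleep. */
static void model_set_state(struct model_task *t, enum model_state s)
{
    t->state = s;
}

/* Stands in for wake_up() in the irq handler: task becomes runnable. */
static void model_wake_up(struct model_task *t)
{
    t->state = MODEL_RUNNING;
}

/* Stands in for schedule(): true means the task would really block. */
static bool model_schedule_blocks(const struct model_task *t)
{
    return t->state != MODEL_RUNNING;
}

/* Ordering from the numbered lines above: prepare, start event, schedule. */
bool prepare_then_kick(bool ack_arrives_early)
{
    struct model_task t = { MODEL_RUNNING };

    model_set_state(&t, MODEL_UNINTERRUPTIBLE);  /* line 1 */
    /* line 2: the request is handed to the hardware here */
    if (ack_arrives_early)
        model_wake_up(&t);                       /* ack between lines 2 and 3 */
    return model_schedule_blocks(&t);            /* line 3 */
}

/* Swapped ordering: start the event first, prepare to sleep afterwards. */
bool kick_then_prepare(bool ack_arrives_early)
{
    struct model_task t = { MODEL_RUNNING };

    /* the request is handed to the hardware here */
    if (ack_arrives_early)
        model_wake_up(&t);                       /* hits an already-running task */
    model_set_state(&t, MODEL_UNINTERRUPTIBLE);  /* overwrites the wake-up */
    return model_schedule_blocks(&t);            /* would block: wake-up lost */
}
```

With an early ack, prepare_then_kick() reports that schedule() would not block, while kick_then_prepare() reports that it would, which is the lost wake-up case.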

wait_event() should take care of this. wait_event(q, test) basically does:

for (;;) {
    // point A
    add_me_to_waitqueue(q);
    set_current_state(TASK_UNINTERRUPTIBLE);
    if (test)
        break;
    // point B
    schedule();
}
clean_up_wait_stuff();

If your event occurs and wake_up() is called at point A, then the test
should be true when it's checked and schedule() is never called. If the
event happens at point B, then the process' state will have been changed to
TASK_RUNNING by wake_up(), remember it's already on the waitqueue at this
point, and schedule() won't sleep.

I think what's probably happening is the test, cx18_readl(cx, &mb->ack) ==
cx18_readl(cx, &mb->request), is somehow not true even though the ack has
been received. Maybe a new request was added?

I think calling wait_event()'s with something that tests a hardware
register is a little iffy. It's better if the irq handler sets some driver
state flag (atomically!) that indicates the event you were waiting for has
happened and then you check that flag.
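
A userspace analogy of that suggestion, sketched with POSIX threads rather than kernel wait queues: the "handler" records the event in a flag under a lock and then signals, and the waiter tests only that flag, never the hardware. The struct and function names (ack_event, ack_event_signal, ack_event_wait) are made up for this sketch and do not exist in the cx18 driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for a mailbox ack; not cx18 code. */
struct ack_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            acked;   /* set by the "handler", tested by the waiter */
};

void ack_event_init(struct ack_event *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->acked = false;
}

/* What the irq handler would do: record the event, then wake the waiter. */
void ack_event_signal(struct ack_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->acked = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/*
 * Wait up to timeout_ms for the flag. Returns true if the ack was seen,
 * false on timeout. The flag is consumed so the event can be reused.
 */
bool ack_event_wait(struct ack_event *ev, long timeout_ms)
{
    struct timespec deadline;
    bool seen;
    int rc = 0;

    /* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ev->lock);
    /* Re-test the flag after every wake-up; stop on ack, timeout or error. */
    while (!ev->acked && rc == 0)
        rc = pthread_cond_timedwait(&ev->cond, &ev->lock, &deadline);
    seen = ev->acked;
    ev->acked = false;
    pthread_mutex_unlock(&ev->lock);
    return seen;
}
```

Because the flag is tested under the same lock the signaller holds, an ack that arrives before the waiter gets as far as waiting is still seen. In the kernel the closest ready-made equivalent is probably a completion, but that is a suggestion, not what the patch under discussion does.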



jac1dlists at gmail

Mar 29, 2009, 5:30 AM

Post #3 of 13 (5127 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

Just to clarify, is this pixelation issue related to ATSC reception or
analog CATV input? I've not seen this on a dual tuner analog test.

-Jeff



bcjenkins at tvwhere

Mar 29, 2009, 7:25 AM

Post #4 of 13 (5113 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sat, Mar 28, 2009 at 11:27 PM, Andy Walls <awalls [at] radix> wrote:
> Corey and Brandon,
>
> I found a race condition between the cx18 driver and the CX23418 firmware.
> I have a patch that mitigates the problem here:
>
> http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c
>
> I think the final form of the patch could be better.  However, this
> patch essentially eliminated any artifacts I was getting playing back
> digital TV.  I also had positive results running mplayer without the
> "-cache" command line option for both digital and analog captures.
>
> I haven't tested on a single processor machine, nor in a multicard
> setup, but things looked good enough that I thought it ready for test by
> others.
>
> Let me know if it helps or not.
>
> Regards,
> Andy
>
>
Hi Andy,

I have cloned this tree and loaded on the server. I'll let you know
over the next couple of days if there is any improvement.

Thanks!

Brandon



awalls at radix

Mar 29, 2009, 10:56 AM

Post #5 of 13 (5107 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sun, 2009-03-29 at 01:24 -0700, Trent Piepho wrote:
> On Sat, 28 Mar 2009, Andy Walls wrote:
> > On Mon, 2009-03-23 at 06:52 -0700, Corey Taylor wrote:
> > I found a race condition between the cx18 driver and the CX23418 firmware.
> > I have a patch that mitigates the problem here:
> >
> > http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c
>
> > [. We have to do this polling wait because there is a race with the
> > firmware. Once we give it the SW1 interrupt above, it can wake up our
> > waitq with an ack interrupt via the irq handler after we're ready to
> > wait, but before we actually get put to sleep by schedule(). Losing
> > that race causes us to wait the entire timeout, waiting for a wakeup
> > that's never going to come. ]

Trent,

First, thanks for the fresh perspective.


> A race like this should be avoidable. The way it works is you do something
> like this:
>
> /* 1 */ set_current_state(TASK_INTERRUPTIBLE);
> /* 2 */ cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
> /* 3 */ schedule();
> /* 4 */ ack_has_now_been_received();

I tried something like this in my second iteration, see below. (The
patch I put in my repo was actually my third iteration.)


> The race you are talking about is when the ack arrives between line 2 and
> 3. If this happens here, the process' current state is changed to
> TASK_RUNNING when the irq handler that receives the ack tries to wake our
> process. If schedule() is called with the state set to TASK_RUNNING then
> the process doesn't sleep. And thus there is no race. The key is that
> preparing to sleep at line 1 happens before we start the event we want to
> wait for at line 2.
>
> wait_event() should take care of this. wait_event(q, test) basically does:
>
> for (;;) {
>     // point A
>     add_me_to_waitqueue(q);
>     set_current_state(TASK_UNINTERRUPTIBLE);
>     if (test)
>         break;
>     // point B
>     schedule();
> }
> clean_up_wait_stuff();

As you know, the condition is checked even before this loop is entered,
to avoid even being added to a waitqueue. (Thank God for ctags...)

As you may have noticed, the original code was using
wait_event_timeout() before like this:

CX18_DEBUG_HI_IRQ("sending interrupt SW1: %x to send %s\n",
                  irq, info->name);
cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);

ret = wait_event_timeout(
        *waitq,
        cx18_readl(cx, &mb->ack) == cx18_readl(cx, &mb->request),
        timeout);

Because waiting for the ack back is the right thing to do, but certainly
waiting too long is not warranted.

This gave me the occasional log message like this:

1: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL
2: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0
3: cx18-0: irq: received interrupts SW1: 10000 SW2: 0 HW2: 0
4: cx18-0: warning: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement

Where line 1 is the driver notifying the firmware with a SW1 interrupt.
Line 2 is the firmware responding back to the cx18_irq_handler() with
the Ack interrupt in SW2 (the flags match, 8 & 8, by design).
Line 3 is an unrelated incoming video buffer notification for the cx18
driver.
Line 4 is the wait_event_timeout() timing out.

Since I'm sending buffers back to the firmware on the read()-ing
application's timeline, these delays caused playback problems.


> If your event occurs and wake_up() is called at point A, then the test
> should be true when it's checked and schedule() is never called. If the
> event happens at point B, then the process' state will have been changed to
> TASK_RUNNING by wake_up(), remember it's already on the waitqueue at this
> point, and schedule() won't sleep.

OK, for some reason, I thought schedule() and schedule_timeout() would
go to sleep anyway.


> I think what's probably happening is the test, cx18_readl(cx, &mb->ack) ==
> cx18_readl(cx, &mb->request), is somehow not true even though the ack has
> been received.

A PCI bus read error could be the culprit here. That's the only thing I
can think of. We only get one notification via IRQ from the firmware.


> Maybe a new request was added?

No, I lock the respective epu2apu or epu2cpu mailbox with
a mutex.


> I think calling wait_event()'s with something that tests a hardware
> register is a little iffy. It's better if the irq handler sets some driver
> state flag (atomically!) that indicates the event you were waiting for has
> happened and then you check that flag.

I was toying with setting an atomic while in the IRQ handler. But then
I realized when we get the ack interrupt, the firmware should actually
be done. So really the wake_up() is the only indicator I really need.
Checking for ack == req is just a formality I guess.


There wasn't a wait_timeout(), so I had tried something like this in my
first iteration:

#define wait_event_oneshot_timeout(wq, condition, timeout) \
({ \
    long __ret = timeout; \
    if (!(condition)) { \
        DEFINE_WAIT(__wait); \
        prepare_to_wait(&wq, &__wait, TASK_UNINTERRUPTIBLE); \
        if (!(condition)) { \
            __ret = schedule_timeout(__ret); \
        } \
        finish_wait(&wq, &__wait); \
    } \
    __ret; \
})

...
cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);

ret = wait_event_oneshot_timeout(*waitq,
                                 cx18_readl(cx, &mb->request) ==
                                 cx18_readl(cx, &mb->ack),
                                 timeout);
...


It didn't work. Sometimes it would wait the whole timeout, but the
cx18_irq_handler() had gotten an ack interrupt.


Then I tried:


// FIXME break into several small timeouts/poll
// or use an atomic to communicate completion
CX18_DEBUG_HI_IRQ("sending interrupt SW1: %x to send %s\n",
                  irq, info->name);
ret = timeout;
prepare_to_wait(waitq, &w, TASK_UNINTERRUPTIBLE);
cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
/*
 * Will we schedule in time, before the IRQ handler wakes up our waitq?
 * Who knows?! How exciting! Let the race begin!
 */
if (req != cx18_readl(cx, &mb->ack))
    ret = schedule_timeout(timeout);
finish_wait(waitq, &w);


It didn't work either: sometimes it would wait the whole timeout even though
the ack interrupt had arrived. Again, at the time, I was under the
impression that schedule_timeout() would go to sleep anyway even if we
had been awakened (thus my sarcastic comments).


Did I miss anything with either of those two previous tries?


I guess I need to dig into the guts of schedule_timeout() to convince
myself that the process won't be put to sleep.


I'm using Fedora 10 BTW:

Linux palomino.walls.org 2.6.27.9-159.fc10.x86_64 #1 SMP Tue Dec 16
14:47:52 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

Thanks.

Regards,
Andy




awalls at radix

Mar 29, 2009, 11:09 AM

Post #6 of 13 (5100 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sun, 2009-03-29 at 08:30 -0400, Jeff Campbell wrote:
> Just to clarify, is this pixelation issue related to ATSC reception or
> analog CATV input? I've not seen this on a dual tuner analog test.
>
> -Jeff

Corey complained of very noticeable artifacts with DTV capture (I think
he had cable).

I noticed minor tears now and then, in the ATSC OTA capture playback.
They almost directly correlated to the full 10 msec timeout on waiting
for a command ack from the firmware. (The subject of my recent
experimental patch.)

I render to a display on the same machine that is performing the capture
and recording. I also had to disable Xv rendering on my machine, which
chews up a little more CPU. IIRC you don't render on the same machine
on which the capture is occurring. So maybe that's why you don't see
it?

Regards,
Andy





xyzzy at speakeasy

Mar 30, 2009, 1:54 PM

Post #7 of 13 (5074 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sun, 29 Mar 2009, Andy Walls wrote:
> On Sun, 2009-03-29 at 01:24 -0700, Trent Piepho wrote:
> > wait_event() should take care of this. wait_event(q, test) basically does:
> >
> > for (;;) {
> >     // point A
> >     add_me_to_waitqueue(q);
> >     set_current_state(TASK_UNINTERRUPTIBLE);
> >     if (test)
> >         break;
> >     // point B
> >     schedule();
> > }
> > clean_up_wait_stuff();
>
> As you know, the condition is checked even before this loop is entered,
> to avoid even being added to a waitqueue. (Thank God for ctags...)

I think the initial check of the condition is just an optimization and
everything will still work without it. Seeing as all this is inlined, I
wonder if it's a good optimization...

> As you may have noticed, the original code was using
> wait_event_timeout() before like this:
>
> CX18_DEBUG_HI_IRQ("sending interrupt SW1: %x to send %s\n",
>                   irq, info->name);
> cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
>
> ret = wait_event_timeout(
>         *waitq,
>         cx18_readl(cx, &mb->ack) == cx18_readl(cx, &mb->request),
>         timeout);
>
> Because waiting for the ack back is the right thing to do, but certainly
> waiting too long is not warranted.
>
> This gave me the occasional log message like this:
>
> 1: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL
> 2: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0
> 3: cx18-0: irq: received interrupts SW1: 10000 SW2: 0 HW2: 0
> 4: cx18-0: warning: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
>
> Where line 1 is the driver notifying the firmware with a SW1 interrupt.
> Line 2 is the firmware responding back to the cx18_irq_handler() with
> the Ack interrupt in SW2 (the flags match, 8 & 8, by design).
> Line 3 is an unrelated incoming video buffer notification for the cx18
> driver.
> Line 4 is the wait_event_timeout() timing out.

Could it be that the wait_event doesn't actually run and check its
condition until _after_ line 3? In that case SW2 != 8 and so it goes back
to sleep? Calling wake_up() just makes the processes on the waitq
runnable, they don't actually run until later, possibly much later.

> > If your event occurs and wake_up() is called at point A, then the test
> > should be true when it's checked and schedule() is never called. If the
> > event happens at point B, then the process' state will have been changed to
> > TASK_RUNNING by wake_up(), remember it's already on the waitqueue at this
> > point, and schedule() won't sleep.
>
> OK, for some reason, I thought schedule() and schedule_timeout() would
> go to sleep anyway.

AFAIK, they'll still cause the kernel to schedule a process. Maybe a
different process. But the original process is still in TASK_RUNNING state
and so still in the run queue and will get run again. If it was in
TASK_(UN)INTERRUPTIBLE state then it wouldn't be in the run queue and
wouldn't run again until something woke it up.

> > I think what's probably happening is the test, cx18_readl(cx, &mb->ack) ==
> > cx18_readl(cx, &mb->request), is somehow not true even though the ack has
> > been received.
>
> A PCI bus read error could be the culprit here. That's the only thing I
> can think of. We only get one notification via IRQ from the firmware.
>
>
> > Maybe a new request was added?
>
> No, I lock the respective epu2apu or epu2cpu mailbox with
> a mutex.

But in your log:
> 1: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL
> 2: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0
> 3: cx18-0: irq: received interrupts SW1: 10000 SW2: 0 HW2: 0
> 4: cx18-0: warning: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement

Isn't the wait_event_timeout() waiting until line 4? And doesn't line 3
mean something has changed the registers? Changed them before the
wait_event finished?

> > I think calling wait_event()'s with something that tests a hardware
> > register is a little iffy. It's better if the irq handler sets some driver
> > state flag (atomically!) that indicates the event you were waiting for has
> > happened and then you check that flag.
>
> I was toying with setting an atomic while in the IRQ handler. But then
> I realized when we get the ack interrupt, the firmware should actually
> be done. So really the wake_up() is the only indicator I really need.
> Checking for ack == req is just a formality I guess.

If you use an interruptible timeout, then you could get interrupted with a
signal before the irq handler has woken you.

> There wasn't a wait_timeout(), so I had tried something like this in my
> first iteration:

It's called sleep_on_timeout(q, timeout).



awalls at radix

Mar 30, 2009, 4:35 PM

Post #8 of 13 (5072 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Mon, 2009-03-30 at 13:54 -0700, Trent Piepho wrote:
> On Sun, 29 Mar 2009, Andy Walls wrote:
> > On Sun, 2009-03-29 at 01:24 -0700, Trent Piepho wrote:
> > > wait_event() should take care of this. wait_event(q, test) basically does:
> > >
> > > for (;;) {
> > >     // point A
> > >     add_me_to_waitqueue(q);
> > >     set_current_state(TASK_UNINTERRUPTIBLE);
> > >     if (test)
> > >         break;
> > >     // point B
> > >     schedule();
> > > }
> > > clean_up_wait_stuff();
> >
> > As you know, the condition is checked even before this loop is entered,
> > to avoid even being added to a waitqueue. (Thank God for ctags...)
>
> I think the initial check of the condition is just an optimization and
> everything will still work without it. Seeing as all this is inlined, I
> wonder if it's a good optimization...

I guess it depends on how expensive prepare_to_wait() is since it
acquires a spinlock.

But now that you mention inlining, I have to check whether this sequence
that I thought didn't work:

prepare_to_wait(waitq, &w, TASK_UNINTERRUPTIBLE);
cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
if (req != cx18_readl(cx, &mb->ack))
    ret = schedule_timeout(timeout);
finish_wait(waitq, &w);

is actually getting reordered. I bet constituent parts of the first two
lines may be. That would certainly explain things.

> > As you may have noticed, the original code was using
> > wait_event_timeout() before like this:
> >
> > CX18_DEBUG_HI_IRQ("sending interrupt SW1: %x to send %s\n",
> >                   irq, info->name);
> > cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq);
> >
> > ret = wait_event_timeout(
> >         *waitq,
> >         cx18_readl(cx, &mb->ack) == cx18_readl(cx, &mb->request),
> >         timeout);
> >
> > Because waiting for the ack back is the right thing to do, but certainly
> > waiting too long is not warranted.
> >
> > This gave me the occasional log message like this:
> >
> > 1: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL
> > 2: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0
> > 3: cx18-0: irq: received interrupts SW1: 10000 SW2: 0 HW2: 0
> > 4: cx18-0: warning: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> >
> > Where line 1 is the driver notifying the firmware with a SW1 interrupt.
> > Line 2 is the firmware responding back to the cx18_irq_handler() with
> > the Ack interrupt in SW2 (the flags match, 8 & 8, by design).
> > Line 3 is an unrelated incoming video buffer notification for the cx18
> > driver.
> > Line 4 is the wait_event_timeout() timing out.
>
> Could it be that the wait_event doesn't actually run and check its
> condition until _after_ line 3?

SW2: 8 is a firmware response to us setting the outgoing SW1 to 8.

SW2 getting cleared is done by cx18_irq_handler and its helper:

static void xpu_ack(struct cx18 *cx, u32 sw2)
{
    // Wake up the process waiting on the EPU -> CPU mailbox ack
    // This one has a flag value of 8
    if (sw2 & IRQ_CPU_TO_EPU_ACK)
        wake_up(&cx->mb_cpu_waitq);
    // Wake up the process waiting on the EPU -> APU mailbox ack
    if (sw2 & IRQ_APU_TO_EPU_ACK)
        wake_up(&cx->mb_apu_waitq);
}

irqreturn_t cx18_irq_handler(int irq, void *dev_id)
{
    struct cx18 *cx = (struct cx18 *)dev_id;
    u32 sw1, sw2, hw2;

    // Get the status of SW2
    sw2 = cx18_read_reg(cx, SW2_INT_STATUS) & cx->sw2_irq_mask;

    // Clear any status of SW2 that were found set
    if (sw2)
        cx18_write_reg_expect(cx, sw2, SW2_INT_STATUS, ~sw2, sw2);

    // Act on any SW2 interrupts found set
    if (sw2)
        xpu_ack(cx, sw2);

    return (sw1 || sw2 || hw2) ? IRQ_HANDLED : IRQ_NONE;
}



> In that case SW2 != 8 and so it goes back
> to sleep?

If SW2 got set back to 0 (or != 8), that means we cleared it ourselves.
We only do that slightly prior to waking up the proper waitq.


> Calling wake_up() just makes the processes on the waitq
> runnable, they don't actually run until later, possibly much later.

Hmm. Maybe then we're yielding the processor and then simply running
much later.

If that's the case, I may be stuck with shorter timeouts with polls. :(
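
For what it's worth, "shorter timeouts with polls" could look roughly like the following userspace sketch. It is an assumption-laden illustration, not the patch in the repo: poll_flag_timeout() and its parameters are invented names, the flag is a C11 atomic standing in for "the ack has arrived", and nanosleep() stands in for a short schedule_timeout().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/*
 * Poll *done in short slices until it becomes true or total_ms elapses.
 * Returns the milliseconds of budget left when the flag was seen,
 * 1 if it was only seen on the final check, or 0 on timeout.
 */
long poll_flag_timeout(atomic_bool *done, long total_ms, long slice_ms)
{
    long remaining = total_ms;
    struct timespec slice;

    if (slice_ms <= 0)
        slice_ms = 1;            /* avoid a zero-length slice looping forever */

    while (remaining > 0) {
        if (atomic_load(done))
            return remaining;    /* flag seen with budget to spare */

        long step = slice_ms < remaining ? slice_ms : remaining;
        slice.tv_sec = step / 1000;
        slice.tv_nsec = (step % 1000) * 1000000L;
        nanosleep(&slice, NULL); /* sleep one slice, then look again */
        remaining -= step;
    }

    /* One last look so an ack during the final slice is not reported late. */
    return atomic_load(done) ? 1 : 0;
}
```

The cost of a missed wake-up is then bounded by one slice instead of the full timeout, at the price of extra wake-ups while waiting.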

> > > If your event occurs and wake_up() is called at point A, then the test
> > > should be true when it's checked and schedule() is never called. If the
> > > event happens at point B, then the process' state will have been changed to
> > > TASK_RUNNING by wake_up(), remember it's already on the waitqueue at this
> > > point, and schedule() won't sleep.
> >
> > OK, for some reason, I thought schedule() and schedule_timeout() would
> > go to sleep anyway.
>
> AFAIK, they'll still cause the kernel to schedule a process. Maybe a
> different process. But the original process is still in TASK_RUNNING state
> and so still in the run queue and will get run again. If it was in
> TASK_(UN)INTERRUPTIBLE state then it wouldn't be in the run queue and
> wouldn't run again until something woke it up.

I'm going to have to figure out how to profile things.


> > > I think what's probably happening is the test, cx18_readl(cx, &mb->ack) ==
> > > cx18_readl(cx, &mb->request), is somehow not true even though the ack has
> > > been received.
> >
> > A PCI bus read error could be the culprit here. That's the only thing I
> > can think of. We only get one notification via IRQ from the firmware.
> >
> >
> > > Maybe a new request was added?
> >
> > No, I lock the respective epu2apu or epu2cpu mailboxes respectively with
> > a mutex.
>
> But in your log:
> > 1: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL
> > 2: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0
> > 3: cx18-0: irq: received interrupts SW1: 10000 SW2: 0 HW2: 0
> > 4: cx18-0: warning: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
>
> Isn't the wait_event_timeout() waiting until line 4? And doesn't line 3
> mean something has changed the registers? Changed them before the
> wait_event finished?

First a note:

There are four mailboxes the driver cares about: EPU -> CPU, CPU -> EPU,
EPU -> APU, and APU -> EPU. We mostly care about the EPU/CPU
mailboxes.

Line 1 indicates an outgoing command in the EPU -> CPU mailbox
Line 2 indicates an ack was placed in the EPU -> CPU mailbox
Line 3 indicates an incoming command in the CPU -> EPU mailbox
Line 4 indicates the async notification in Line 2 being missed by
       us somehow

So:

yes, wait_event_timeout() is waiting until line 4.
yes, line 3 means we cleared the SW2 register when we got line 2.
yes, line 3 also means the firmware is sending us a full capture buffer
     with information in the CPU -> EPU mailbox (not the mbox at issue).
But line 3 does not mean the EPU -> CPU mailbox ack changed. Line 2 is
     what tells us the ack should have changed (different mailboxes).
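
A small sketch of how the interrupt values in those log lines map onto
the mailboxes, assuming the log prints the status words in hex. The
macro and enum names are local to this sketch and are not the driver's
own definitions; only the values 8 and 10000 come from the log above.

```c
#include <stdint.h>

/* Values read off the log lines above (assumed hex); names are made up. */
#define SKETCH_SW2_CPU_TO_EPU_ACK  0x8u      /* line 2: ack of our command */
#define SKETCH_SW1_CPU_TO_EPU_CMD  0x10000u  /* line 3: incoming command   */

enum irq_meaning {
	IRQ_MEANS_NOTHING  = 0,
	IRQ_MEANS_ACK      = 1,  /* ack placed in the EPU -> CPU mailbox   */
	IRQ_MEANS_INCOMING = 2,  /* command placed in the CPU -> EPU mailbox */
};

/* Return a bitmask of irq_meaning values for one "received interrupts" line. */
static unsigned classify_irq(uint32_t sw1, uint32_t sw2)
{
	unsigned meaning = IRQ_MEANS_NOTHING;

	if (sw2 & SKETCH_SW2_CPU_TO_EPU_ACK)
		meaning |= IRQ_MEANS_ACK;
	if (sw1 & SKETCH_SW1_CPU_TO_EPU_CMD)
		meaning |= IRQ_MEANS_INCOMING;
	return meaning;
}
```

With this, log line 2 (SW1: 0, SW2: 8) classifies as an ack only, and
log line 3 (SW1: 10000, SW2: 0) as an incoming command only, which is
why line 3 says nothing about the EPU -> CPU ack.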


Whenever the IRQ handler clears SW2, it should also issue a wake_up().
A small percentage of the time, that doesn't appear to happen.


> > > I think calling wait_event()'s with something that tests a hardware
> > > register is a little iffy. It's better if the irq handler sets some driver
> > > state flag (atomically!) that indicates the event you were waiting for has
> > > happened and then you check that flag.
> >
> > I was toying with setting an atomic while in the IRQ handler. But then
> > I realized when we get the ack interrupt, the firmware should actually
> > be done. So really the wake_up() is the only indicator I really need.
> > Checking for ack == req is just a formality I guess.
>
> If you use an interruptible timeout, then you could get interrupted with a
> signal before the irq handler has woken you.

I used UNINTERRUPTIBLE here because it doesn't quite make sense to
indicate -ERESTARTSYS back to the caller in all circumstances. This
function can be called by read() to send a series of commands to start a
capture or to just send back a drained buffer; poll() to send a series
of commands to start a capture; or by the work handler to return a
drained buffer.
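
On the earlier suggestion of having the IRQ handler set a flag
atomically and testing that flag rather than a hardware register: a
minimal userspace sketch of the idea, combined with a short polling
interval, might look like the following. It uses C11 atomics and
nanosleep() in place of kernel primitives, and struct mbox_state,
mbox_note_ack() and mbox_wait_ack() are made-up names, not anything in
the cx18 driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-mailbox state; not a cx18 structure. */
struct mbox_state {
	atomic_bool ack_seen;	/* set by the (simulated) IRQ handler */
};

/* What the IRQ handler would do just before calling wake_up(). */
static void mbox_note_ack(struct mbox_state *mb)
{
	atomic_store_explicit(&mb->ack_seen, true, memory_order_release);
}

/*
 * Poll the flag every poll_us microseconds (must be below 1000000) for
 * at most timeout_us. Returns true if the ack was seen, false on
 * timeout. The flag is consumed when seen so the next command starts
 * with a clean state.
 */
static bool mbox_wait_ack(struct mbox_state *mb, long timeout_us, long poll_us)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = poll_us * 1000L };
	long waited;

	for (waited = 0; ; waited += poll_us) {
		if (atomic_exchange_explicit(&mb->ack_seen, false,
					     memory_order_acq_rel))
			return true;
		if (waited >= timeout_us)
			return false;
		nanosleep(&ts, NULL);
	}
}
```

In this sketch an ack noted before the waiter starts is still seen on
the first check, because the waiter tests driver state that only the
handler sets and only the waiter clears.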


> > There wasn't a wait_timeout(), so I had tried something like this in my
> > first iteration:
>
> It's called sleep_on_timeout(q, timeout).

I was deterred by the big scary comment:

/*
 * These are the old interfaces to sleep waiting for an event.
 * They are racy. DO NOT use them, use the wait_event* interfaces above.
 * We plan to remove these interfaces.
 */
extern void sleep_on(wait_queue_head_t *q);
extern long sleep_on_timeout(wait_queue_head_t *q,
			     signed long timeout);



Again, thanks for helping me think this through. It's obvious I have
more testing and troubleshooting to do.

My first step will be to check to see if compiler reordering screwed up
one of my prior patch attempts.

Regards,
Andy


_______________________________________________
ivtv-devel mailing list
ivtv-devel [at] ivtvdriver
http://ivtvdriver.org/mailman/listinfo/ivtv-devel


bcjenkins at tvwhere

Mar 31, 2009, 3:02 AM

Post #9 of 13 (5050 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Sun, Mar 29, 2009 at 10:25 AM, Brandon Jenkins <bcjenkins [at] tvwhere> wrote:
> On Sat, Mar 28, 2009 at 11:27 PM, Andy Walls <awalls [at] radix> wrote:
>> On Mon, 2009-03-23 at 06:52 -0700, Corey Taylor wrote:
>>> > Andy,
>>>
>>> > I am noticing an improvement in pixelation by setting the bufsize to
>>> > 64k. I will monitor over the next week and report back. I am running 3
>>> > HVR-1600s and the IRQs are coming up shared with the USB which also
>>> > supports my HD PVR capture device. Monday nights are usually one of
>>> > the busier nights for recording so I will know how well this holds up.
>>>
>>> > Thanks for the tip!
>>>
>>> > Brandon
>>>
>>> Hi Andy and Brandon, I too tried various different bufsizes as suggested and I still see very noticeable pixelation/tearing regardless of the setting.
>>>
>>> I even upgraded my motherboard this past weekend to an Asus AM2+ board with
>>> Phenom II X3 CPU. Still the same problems with the card in a brand new
>>> setup.
>>>
>>> I also tried modifying the cx18 source code as Andy suggested and that
>>> made more debug warnings show up in my syslog, but still did not
>>> resolve the issue. Haven't tried this yet with the new motherboard
>>> though.
>>>
>>> Is it possible that this card is more sensitive to hiccups in the
>>> signal coming from the cable line? Or interference from other close-by
>>> cables and electronic equipment?
>>>
>>> When recording/watching Live TV through MythTV, I see that ffmpeg is
>>> constantly outputting various errors related to the video stream. I
>>> can post those here if you think it's relevant.
>>>
>>> Should I just return this card and get one with a different chipset? Or
>>> do you think driver updates can solve the issue?
>>>
>>> I'm happy to hold on to this card if it means I can contribute in some
>>> way to fixing the problem, if it's fixable : )
>>
>> Corey and Brandon,
>>
>> I found a race condition between the cx driver and the CX23418 firmware.
>> I have a patch that mitigates the problem here:
>>
>> http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c
>>
>> I think the final form of the patch could be better.  However, this
>> patch essentially eliminated any artifacts I was getting playing back
>> digital TV.  I also had positive results running mplayer without the
>> "-cache" command line for both digital and analog captures.
>>
>> I haven't tested on a single processor machine, nor in a multicard
>> setup, but things looked good enough that I thought it ready for test by
>> others.
>>
>> Let me know if it helps or not.
>>
>> Regards,
>> Andy
>>
>>
> Hi Andy,
>
> I have cloned this tree and loaded on the server. I'll let you know
> over the next couple of days if there is any improvement.
>
> Thanks!
>
> Brandon
>

Andy,

Based on continued discussions it seems you're still exploring things.
I can tell you that the analog captures are still exhibiting
artifacts. I'll get to some of the HD captures tonight.

Brandon



awalls at radix

Mar 31, 2009, 3:44 AM

Post #10 of 13 (5055 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Tue, 2009-03-31 at 06:02 -0400, Brandon Jenkins wrote:
> On Sun, Mar 29, 2009 at 10:25 AM, Brandon Jenkins <bcjenkins [at] tvwhere> wrote:
> > On Sat, Mar 28, 2009 at 11:27 PM, Andy Walls <awalls [at] radix> wrote:
> >> On Mon, 2009-03-23 at 06:52 -0700, Corey Taylor wrote:

> >>
> > Hi Andy,
> >
> > I have cloned this tree and loaded on the server. I'll let you know
> > over the next couple of days if there is any improvement.
> >
> > Thanks!
> >
> > Brandon
> >
>
> Andy,
>
> Based on continued discussions it seems you're still exploring things.
> I can tell you that the analog captures are still exhibiting
> artifacts.

Well, the patch you have to work with essentially does a poll every 1/4
of a millisecond (a rate of about 67 polls per NTSC field while polling),
with a wait between the polls. If you're still getting artifacts, then I
suspect your artifact problems may not lie where I'm looking right now.
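
As a quick check of the 67-polls-per-field figure: an NTSC field lasts
1001/60 ms (about 16.68 ms), so a 0.25 ms poll interval fits roughly
66.7 times. A small sketch of that arithmetic, with function names made
up for illustration:

```c
/* NTSC runs at 60000/1001 fields per second, so one field lasts 1001/60 ms. */
static double ntsc_field_period_ms(void)
{
	return 1001.0 / 60.0;
}

/* Number of polls that fit in one field, rounded to the nearest integer. */
static int polls_per_ntsc_field(double poll_interval_ms)
{
	return (int)(ntsc_field_period_ms() / poll_interval_ms + 0.5);
}
```

Calling polls_per_ntsc_field(0.25) gives 67, matching the figure quoted
for the 1/4 millisecond poll.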

I have no artifacts with analog captures, so let me get through this
first problem and then I'll ask more about your symptoms. I don't have
a 3 HVR-1600 card setup. You may try setting enc_mpg_bufsize to a value
other than 32 kB. Setting it higher will make you less likely to
lose buffers due to the firmware timing out when it sends buffers to the
driver. Setting it lower will reduce the impact on the stream of losing
any one buffer. Setting the analog capture to provide a TS instead of
the default PS may help too.

> I'll get to some of the HD captures tonight.

OK.

Regards,
Andy

> Brandon





awalls at radix

Apr 13, 2009, 8:20 PM

Post #11 of 13 (4734 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Tue, 2009-03-31 at 06:02 -0400, Brandon Jenkins wrote:
> >>
> >> Corey and Brandon,
> >>
> >> I found a race condition between the cx driver and the CX23418 firmware.
> >> I have a patch that mitigates the problem here:
> >>
> >> http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c
> >>
> >> I think the final form of the patch could be better. However, this
> >> patch essentially eliminated any artifacts I was getting playing back
> >> digital TV. I also had positive results running mplayer without the
> >> "-cache" command line for both digital and analog captures.
> >>
> >> I haven't tested on a single processor machine, nor in a multicard
> >> setup, but things looked good enough that I thought it ready for test by
> >> others.
> >>
> >> Let me know if it helps or not.
> >>
> >> Regards,
> >> Andy
> >>

> Andy,
>
> Based on continued discussions it seems you're still exploring things.
> I can tell you that the analog captures are still exhibiting
> artifacts. I'll get to some of the HD captures tonight.
>
> Brandon

Brandon and Corey,

I have a series of changes to improve performance of the cx18 driver in
delivering incoming buffers to applications. Please test the code here
if you'd like:

http://linuxtv.org/hg/~awalls/cx18-perf/

These patches remove all the sleeps from incoming buffer handling
(unless your system starts getting very far behind, in which case a
fallback strategy starts letting sleeps happen again).


If you still have performance problems, there is one more patch I can
add, that avoids some sleeps in the new work handler threads that pass
empty buffers back out to the firmware. A copy of that patch is here:

http://linuxtv.org/hg/~awalls/cx18/rev/b42156ceee11



The trade-off I had to make with all these patches was to have the
cx18-driver prefer to "spin" rather than "sleep" when waiting for a
resource (i.e. the capture stream buffer queues), while handling
incoming buffers. This makes the live playback much nicer, but at the
expense of CPU cycles and perhaps total system throughput for other
things. I'd be interested in how a multicard multistream capture fares.
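
A rough illustration of the spin-versus-sleep trade-off, under the
assumption that "prefer to spin" means trying a lock a bounded number of
times before falling back to a path that may sleep. This is a userspace
sketch with C11 atomics, not the cx18-perf code; struct queue_lock,
spin_acquire_bounded() and spin_release() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock guarding a capture stream buffer queue. */
struct queue_lock {
	atomic_flag held;
};

/*
 * Try to take the lock by spinning at most max_spins times.
 * Returns true if the lock was taken. A false return tells the caller
 * to fall back to a slower path that is allowed to sleep.
 */
static bool spin_acquire_bounded(struct queue_lock *l, unsigned max_spins)
{
	for (unsigned i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set_explicit(&l->held,
						       memory_order_acquire))
			return true;
	}
	return false;
}

/* Release the lock so another spinner can take it. */
static void spin_release(struct queue_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Spinning keeps the latency of the common, uncontended case low, and the
cost is the CPU time burned while the lock is held elsewhere; the bound
on the spin count is what limits that cost.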



BTW, the above cx18-perf repo is missing a very small patch to fix a
recent bug with line-in audio not working. If you need line-in audio to
work during testing, a patch is in the main v4l-dvb repo already:

http://linuxtv.org/hg/v4l-dvb/rev/d19938a76e7a


Regards,
Andy




bcjenkins at tvwhere

Apr 24, 2009, 5:28 AM

Post #12 of 13 (4531 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Mon, Apr 13, 2009 at 11:26 PM, Andy Walls <awalls [at] radix> wrote:
> On Tue, 2009-03-31 at 06:02 -0400, Brandon Jenkins wrote:
>> >>
>> >> Corey and Brandon,
>> >>
>> >> I found a race condition between the cx driver and the CX23418 firmware.
>> >> I have a patch that mitigates the problem here:
>> >>
>> >> http://linuxtv.org/hg/~awalls/cx18/rev/9f5f44e0ce6c
>> >>
>> >> I think the final form of the patch could be better.  However, this
>> >> patch essentially eliminated any artifacts I was getting playing back
>> >> digital TV.  I also had positive results running mplayer without the
>> >> "-cache" command line for both digital and analog captures.
>> >>
>> >> I haven't tested on a single processor machine, nor in a multicard
>> >> setup, but things looked good enough that I thought it ready for test by
>> >> others.
>> >>
>> >> Let me know if it helps or not.
>> >>
>> >> Regards,
>> >> Andy
>> >>
>
>> Andy,
>>
>> Based on continued discussions it seems you're still exploring things.
>> I can tell you that the analog captures are still exhibiting
>> artifacts. I'll get to some of the HD captures tonight.
>>
>> Brandon
>
> Brandon and Corey,
>
> I have a series of changes to improve performance of the cx18 driver in
> delivering incoming buffers to applications.  Please test the code here
> if you'd like:
>
> http://linuxtv.org/hg/~awalls/cx18-perf/
>
> These patches remove all the sleeps from incoming buffer handling
> (unless your system starts getting very far behind, in which case a
> fallback strategy starts letting sleeps happen again).
>
>
> If you still have performance problems, there is one more patch I can
> add, that avoids some sleeps in the new work handler threads that pass
> empty buffers back out to the firmware.  A copy of that patch is here:
>
> http://linuxtv.org/hg/~awalls/cx18/rev/b42156ceee11
>
>
>
> The trade-off I had to make with all these patches was to have the
> cx18-driver prefer to "spin" rather than "sleep" when waiting for a
> resource (i.e. the capture stream buffer queues), while handling
> incoming buffers.  This makes the live playback much nicer, but at the
> expense of CPU cycles and perhaps total system throughput for other
> things.  I'd be interested in how a multicard multistream capture fares.
>
>
>
> BTW, the above cx18-perf repo is missing a very small patch to fix a
> recent bug with line-in audio not working.  If you need line-in audio to
> work during testing, a patch is in the main v4l-dvb repo already:
>
> http://linuxtv.org/hg/v4l-dvb/rev/d19938a76e7a
>
>
> Regards,
> Andy
>
>

Andy,

I apologize for the delay in this email. I have been fighting an issue
with lirc which has preoccupied my time (it is amazing how ticked the
family gets when they can't watch TV!) I have mitigated that issue and
have been running your updated drivers for a couple of days with
marked improvement. I am seeing some messages in dmesg (normal?) and I
have attached it along with lspci, lsusb, and lsmod. The majority of
our recordings have been making use of the digital connection with OTA
ATSC. Also, because of my reduced tuner control issues (lirc) I have
not run any simultaneous captures with the HDPVR active.

The system is running Ubuntu Jaunty RC1 9.04 fully patched and kernel
- Linux sagetv-server 2.6.28-11-server #42-Ubuntu SMP Fri Apr 17
02:45:36 UTC 2009 x86_64 GNU/Linux

I use SageTV for PVR which uses the drivers in 32-bit compatible mode.

Brandon
Attachments: info.tgz (18.0 KB)


awalls at radix

Apr 24, 2009, 6:42 PM

Post #13 of 13 (4506 views)
Re: Problems with Hauppauge HVR 1600 and cx18 driver [In reply to]

On Fri, 2009-04-24 at 08:28 -0400, Brandon Jenkins wrote:
> On Mon, Apr 13, 2009 at 11:26 PM, Andy Walls <awalls [at] radix> wrote:
> > On Tue, 2009-03-31 at 06:02 -0400, Brandon Jenkins wrote:

> > Brandon and Corey,
> >
> > I have a series of changes to improve performance of the cx18 driver in
> > delivering incoming buffers to applications. Please test the code here
> > if you'd like:
> >
> > http://linuxtv.org/hg/~awalls/cx18-perf/
> >
> > These patches remove all the sleeps from incoming buffer handling
> > (unless your system starts getting very far behind, in which case a
> > fallback strategy starts letting sleeps happen again).


> > The trade-off I had to make with all these patches was to have the
> > cx18-driver prefer to "spin" rather than "sleep" when waiting for a
> > resource (i.e. the capture stream buffer queues), while handling
> > incoming buffers. This makes the live playback much nicer, but at the
> > expense of CPU cycles and perhaps total system throughput for other
> > things. I'd be interested in how a multicard multistream capture fares.


> Andy,
>
> have been running your updated drivers for a couple of days with
> marked improvement.

Good to hear.


> I am seeing some messages in dmesg (normal?) and I
> have attached it along with lspci, lsusb, and lsmod. The majority of
> our recordings have been making use of the digital connection with OTA
> ATSC.

The messages indicate that the cx18 driver's interrupt handler is not
being called in time to service a CX23418 interrupt for an incoming
capture buffer. This indicates a latency problem in the system.

I also will note that the missed buffers are happening at intervals that
are somewhat far apart: ~1000 seconds, ~400 seconds, ~200 seconds, ~1600
seconds, etc...


1. Since you have a 4 core system that looks to be pretty high-end, I'll
assert the fundamental trade-off I mention above, about the cx18 driver
spinning vs. sleeping when handling incoming buffers in the work
handler, is likely not the cause.

2. Since the cx18_irq_handler() only seems to be called late in the case
of interrupts from cx18-1 and not cx18-0 nor cx18-2, I'll assert that a
driver servicing hardware that shares the interrupt line with cx18-1 is
likely involved.

IRQ 20 is shared by
    cx18-0 at PCI 0000:05:00.0

IRQ 18 is shared by
    cx18-2 at PCI 0000:05:02.0
    usb hub 5 at PCI 0000:00:1a.7 (no usb devices connected)
    usb hub 8 at PCI 0000:00:1d.2 (no usb devices connected)

IRQ 19 is shared by
    cx18-1 at PCI
    usb hub 7 at PCI 0000:00:1d.2
        a usb device, Cygnal Systems (microcontroller?), is connected (commandIR?)
    ahci disk controller at PCI 0000:03:00.0 (1? disk connected)
    ahci disk controller at PCI 0000:00:1f.2 (a few disks connected)
        (looks to be using Message Signalled Interrupts at IRQ 2299 though)


My hypothesis is that the ahci_interrupt() handler routine in
linux/drivers/ata/ahci.c is not acknowledging IRQ 19 in a timely manner
and not finishing up interrupt service rapidly under certain conditions.
And looking at that routine, it does *all* its work first, before
clearing the interrupt line from the AHCI controller. The
ahci_interrupt() routine acquires a spinlock of its own and then some of
the routines it calls can also try to acquire spinlocks. One can
hypothesize that the ahci_interrupt() routine might occasionally take a
long time to complete.

While the ahci_interrupt() handler is doing its work and not clearing
the disk controller interrupt line on IRQ 19 and returning, any CX23418
interrupts on IRQ 19 will go unserviced during that time. This is how
the cx18 driver could miss an incoming buffer, as the CX23418 won't wait
long for the cx18 driver to pick up the buffer id from the mailbox.
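
To make the shared-line argument concrete, here is a deliberately
simplified userspace model. It assumes handlers on a shared line run
one after another in registration order, so a slow handler ahead of the
cx18 handler delays it. All timing numbers and the firmware patience
value used with it are hypothetical, and the struct and function names
are made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* One handler sharing an interrupt line; cost_us is how long it runs. */
struct shared_handler {
	const char *name;
	unsigned cost_us;
};

/*
 * Delay, in microseconds, before the handler at index target starts
 * running, given that every handler registered ahead of it on the same
 * line runs to completion first.
 */
static unsigned delay_before_handler(const struct shared_handler *h,
				     size_t n, size_t target)
{
	unsigned delay = 0;

	for (size_t i = 0; i < n && i < target; i++)
		delay += h[i].cost_us;
	return delay;
}

/* The buffer is lost if the handler starts after the firmware gives up. */
static bool buffer_missed(unsigned delay_us, unsigned fw_patience_us)
{
	return delay_us > fw_patience_us;
}
```

In this model a quick ahci handler ahead of cx18-1 is harmless, but an
occasional long run of it pushes the delay past the firmware's patience
and the buffer is missed, which would fit misses that show up only
every few hundred seconds.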



Things you could try:

1. Setting the priority of cx18-1 lower than cx18-0 and cx18-2 when
SageTV has a choice of which card to use for video captures. Also note
if you ever see the message for cx18-0 or cx18-2 and how often.

2. Record TV programs to a disk other than the one hanging off of that
disk controller at PCI 3:00.0. That would include temp storage used
during the recording process (e.g. /tmp/... /var/... ). The goal is to
keep that disk quiescent during captures from cx18-1.

3. Move the disk controller in question to another PCI slot. See if the
problem leaves the CX23418 on IRQ 19, cx18-1, (and maybe begins to
affect another CX23418 if the disk controller gets IRQ 18 or IRQ 20 as
its new IRQ).

4. Move that one disk to a different disk controller that isn't using
IRQ 18, 19, or 20.

5. Try increasing the PCI latency timer above 64 PCI bus clocks on
the PCI bridge which the CX23418 and AHCI disk controller are behind.
That *might* help the ahci_interrupt() handler to finish its work a
little earlier, and *maybe* mitigate things.

6. Contact the AHCI driver maintainer and ask him to help you confirm or
refute my hypothesis -- I don't know the most efficient and safest way
to verify my hypothesis by twiddling in the ahci driver. The ahci.c
file has this info:

* Maintained by: Jeff Garzik jgarzik * pobox.com
* Please ALWAYS copy linux-ide [at] vger
* on emails.




> The system is running Ubuntu Jaunty RC1 9.04 fully patched and kernel
> - Linux sagetv-server 2.6.28-11-server #42-Ubuntu SMP Fri Apr 17
> 02:45:36 UTC 2009 x86_64 GNU/Linux
>
> I use SageTV for PVR which uses the drivers in 32-bit compatible mode.

This probably doesn't matter. This is a latency issue from:

the time the CX23418 asserts its PCI bus interrupt line

to

the time the cx18 driver's irq handler can read the buffer id
information from the CX23418 mailbox.

I have thoroughly tweaked the cx18_irq_handler()'s timeline in previous
changesets. There are literally no improvements to be made in it.

You either have all your CPUs busy (doubtful), or a shared interrupt
line that isn't being serviced quickly by all the device drivers
handling that line. I can't think of anything else.

Regards,
Andy

> Brandon

