Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop

 

 

Linux kernel RSS feed   Index | Next | Previous | View Threaded


joelf at ti

Aug 5, 2013, 9:14 AM

Post #1 of 5 (17 views)
Permalink
[PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop

We certainly don't want error conditions to be cleared any other
place but the EDMA error handler, as this will make us 'forget'
about missed events we might need to know errors have occurred.

This fixes a race condition where the EMR was being cleared
by the transfer completion interrupt handler.

Basically, what was happening was:

Missed event
|
|
V
SG1-SG2-SG3-Null
\
\__TC Interrupt (Almost same time as ARM is executing
TC interrupt handler, an event got missed and also forgotten
by clearing the EMR).

This causes the following problems:

1.
If error interrupt is also pending and TC interrupt clears the EMR
by calling edma_stop as has been observed in the edma_callback function,
the ARM will execute the error interrupt even though the EMR is clear.
As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
enough number of times, IRQ subsystem disables the interrupt thinking
its spurious which makes error handler never execute again.

2.
Also even if error handler doesn't return IRQ_NONE, the removing of EMR
removes the knowledge about which channel had a missed event, and thus
a manual trigger on such channels cannot be performed.

The EMR is ultimately being cleared by the Error interrupt handler
once it is handled so we remove code that does it in edma_stop and
allow it to happen there.

Signed-off-by: Joel Fernandes <joelf [at] ti>
---
arch/arm/common/edma.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 3567ba1..6433b6c 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1307,7 +1307,6 @@ void edma_stop(unsigned channel)
edma_shadow0_write_array(ctlr, SH_EECR, j, mask);
edma_shadow0_write_array(ctlr, SH_ECR, j, mask);
edma_shadow0_write_array(ctlr, SH_SECR, j, mask);
- edma_write_array(ctlr, EDMA_EMCR, j, mask);

pr_debug("EDMA: EER%d %08x\n", j,
edma_shadow0_read_array(ctlr, SH_EER, j));
--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nsekhar at ti

Aug 8, 2013, 4:49 AM

Post #2 of 5 (17 views)
Permalink
Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop [In reply to]

On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote:
> We certainly don't want error conditions to be cleared any other
> place but the EDMA error handler, as this will make us 'forget'
> about missed events we might need to know errors have occurred.
>
> This fixes a race condition where the EMR was being cleared
> by the transfer completion interrupt handler.
>
> Basically, what was happening was:
>
> Missed event
> |
> |
> V
> SG1-SG2-SG3-Null
> \
> \__TC Interrupt (Almost same time as ARM is executing
> TC interrupt handler, an event got missed and also forgotten
> by clearing the EMR).
>
> This causes the following problems:
>
> 1.
> If error interrupt is also pending and TC interrupt clears the EMR
> by calling edma_stop as has been observed in the edma_callback function,
> the ARM will execute the error interrupt even though the EMR is clear.
> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
> enough number of times, IRQ subsystem disables the interrupt thinking
> its spurious which makes error handler never execute again.
>
> 2.
> Also even if error handler doesn't return IRQ_NONE, the removing of EMR
> removes the knowledge about which channel had a missed event, and thus
> a manual trigger on such channels cannot be performed.
>
> The EMR is ultimately being cleared by the Error interrupt handler
> once it is handled so we remove code that does it in edma_stop and
> allow it to happen there.
>
> Signed-off-by: Joel Fernandes <joelf [at] ti>

Queuing this for v3.11 fixes. While committing, I changed the headline
to remove capitalization and made it more readable by removing register
level details. The new headline is:

ARM: edma: don't clear missed events in edma_stop()

Thanks,
Sekhar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nsekhar at ti

Aug 11, 2013, 9:25 PM

Post #3 of 5 (12 views)
Permalink
Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop [In reply to]

On 8/8/2013 5:19 PM, Sekhar Nori wrote:
> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote:
>> We certainly don't want error conditions to be cleared any other
>> place but the EDMA error handler, as this will make us 'forget'
>> about missed events we might need to know errors have occurred.
>>
>> This fixes a race condition where the EMR was being cleared
>> by the transfer completion interrupt handler.
>>
>> Basically, what was happening was:
>>
>> Missed event
>> |
>> |
>> V
>> SG1-SG2-SG3-Null
>> \
>> \__TC Interrupt (Almost same time as ARM is executing
>> TC interrupt handler, an event got missed and also forgotten
>> by clearing the EMR).
>>
>> This causes the following problems:
>>
>> 1.
>> If error interrupt is also pending and TC interrupt clears the EMR
>> by calling edma_stop as has been observed in the edma_callback function,
>> the ARM will execute the error interrupt even though the EMR is clear.
>> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
>> enough number of times, IRQ subsystem disables the interrupt thinking
>> its spurious which makes error handler never execute again.
>>
>> 2.
>> Also even if error handler doesn't return IRQ_NONE, the removing of EMR
>> removes the knowledge about which channel had a missed event, and thus
>> a manual trigger on such channels cannot be performed.
>>
>> The EMR is ultimately being cleared by the Error interrupt handler
>> once it is handled so we remove code that does it in edma_stop and
>> allow it to happen there.
>>
>> Signed-off-by: Joel Fernandes <joelf [at] ti>
>
> Queuing this for v3.11 fixes. While committing, I changed the headline
> to remove capitalization and made it more readable by removing register
> level details. The new headline is:
>
> ARM: edma: don't clear missed events in edma_stop()

Forgot to ask, should this be tagged for stable? IOW, how serious is
this race in current kernel (without the entire series applied)? I have
never observed it myself - so please provide details how easy/difficult
it is to hit this condition.

Thanks,
Sekhar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


agnel.joel at gmail

Aug 11, 2013, 9:29 PM

Post #4 of 5 (12 views)
Permalink
Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop [In reply to]

On Sun, Aug 11, 2013 at 11:25 PM, Sekhar Nori <nsekhar [at] ti> wrote:
> On 8/8/2013 5:19 PM, Sekhar Nori wrote:
>> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote:
>>> We certainly don't want error conditions to be cleared any other
>>> place but the EDMA error handler, as this will make us 'forget'
>>> about missed events we might need to know errors have occurred.
>>>
>>> This fixes a race condition where the EMR was being cleared
>>> by the transfer completion interrupt handler.
>>>
>>> Basically, what was happening was:
>>>
>>> Missed event
>>> |
>>> |
>>> V
>>> SG1-SG2-SG3-Null
>>> \
>>> \__TC Interrupt (Almost same time as ARM is executing
>>> TC interrupt handler, an event got missed and also forgotten
>>> by clearing the EMR).
>>>
>>> This causes the following problems:
>>>
>>> 1.
>>> If error interrupt is also pending and TC interrupt clears the EMR
>>> by calling edma_stop as has been observed in the edma_callback function,
>>> the ARM will execute the error interrupt even though the EMR is clear.
>>> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
>>> enough number of times, IRQ subsystem disables the interrupt thinking
>>> its spurious which makes error handler never execute again.
>>>
>>> 2.
>>> Also even if error handler doesn't return IRQ_NONE, the removing of EMR
>>> removes the knowledge about which channel had a missed event, and thus
>>> a manual trigger on such channels cannot be performed.
>>>
>>> The EMR is ultimately being cleared by the Error interrupt handler
>>> once it is handled so we remove code that does it in edma_stop and
>>> allow it to happen there.
>>>
>>> Signed-off-by: Joel Fernandes <joelf [at] ti>
>>
>> Queuing this for v3.11 fixes. While committing, I changed the headline
>> to remove capitalization and made it more readable by removing register
>> level details. The new headline is:
>>
>> ARM: edma: don't clear missed events in edma_stop()
>
> Forgot to ask, should this be tagged for stable? IOW, how serious is
> this race in current kernel (without the entire series applied)? I have
> never observed it myself - so please provide details how easy/difficult
> it is to hit this condition.

The race was uncovered by recent EDMA patch series, So this patch can
go in for next kernel release as such, I am not aware of any other DMA
user that maybe uncovering the race condition.

Thanks,

-Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nsekhar at ti

Aug 11, 2013, 11:24 PM

Post #5 of 5 (12 views)
Permalink
Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop [In reply to]

On Monday 12 August 2013 09:59 AM, Joel Fernandes wrote:
> On Sun, Aug 11, 2013 at 11:25 PM, Sekhar Nori <nsekhar [at] ti> wrote:
>> On 8/8/2013 5:19 PM, Sekhar Nori wrote:
>>> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote:
>>>> We certainly don't want error conditions to be cleared any other
>>>> place but the EDMA error handler, as this will make us 'forget'
>>>> about missed events we might need to know errors have occurred.
>>>>
>>>> This fixes a race condition where the EMR was being cleared
>>>> by the transfer completion interrupt handler.
>>>>
>>>> Basically, what was happening was:
>>>>
>>>> Missed event
>>>> |
>>>> |
>>>> V
>>>> SG1-SG2-SG3-Null
>>>> \
>>>> \__TC Interrupt (Almost same time as ARM is executing
>>>> TC interrupt handler, an event got missed and also forgotten
>>>> by clearing the EMR).
>>>>
>>>> This causes the following problems:
>>>>
>>>> 1.
>>>> If error interrupt is also pending and TC interrupt clears the EMR
>>>> by calling edma_stop as has been observed in the edma_callback function,
>>>> the ARM will execute the error interrupt even though the EMR is clear.
>>>> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
>>>> enough number of times, IRQ subsystem disables the interrupt thinking
>>>> its spurious which makes error handler never execute again.
>>>>
>>>> 2.
>>>> Also even if error handler doesn't return IRQ_NONE, the removing of EMR
>>>> removes the knowledge about which channel had a missed event, and thus
>>>> a manual trigger on such channels cannot be performed.
>>>>
>>>> The EMR is ultimately being cleared by the Error interrupt handler
>>>> once it is handled so we remove code that does it in edma_stop and
>>>> allow it to happen there.
>>>>
>>>> Signed-off-by: Joel Fernandes <joelf [at] ti>
>>>
>>> Queuing this for v3.11 fixes. While committing, I changed the headline
>>> to remove capitalization and made it more readable by removing register
>>> level details. The new headline is:
>>>
>>> ARM: edma: don't clear missed events in edma_stop()
>>
>> Forgot to ask, should this be tagged for stable? IOW, how serious is
>> this race in current kernel (without the entire series applied)? I have
>> never observed it myself - so please provide details how easy/difficult
>> it is to hit this condition.
>
> The race was uncovered by recent EDMA patch series, So this patch can
> go in for next kernel release as such, I am not aware of any other DMA
> user that maybe uncovering the race condition.

Okay, I wont queue for -rc then. If Vinod wants to take this along with
rest of the series, you can add my:

Acked-by: Sekhar Nori <nsekhar [at] ti>

Thanks,
Sekhar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.