Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[PATCH]: PCI: GART iommu alignment fixes [v2]

 

 

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded


prarit at redhat

Jul 23, 2008, 4:19 AM

Post #1 of 43 (758 views)
Permalink
[PATCH]: PCI: GART iommu alignment fixes [v2]

pci_alloc_consistent/dma_alloc_coherent does not return size aligned
addresses.

From Documentation/DMA-mapping.txt:

"pci_alloc_consistent returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The cpu return address and the DMA bus master address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size. This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."

1. Modify alloc_iommu to allow for an alignment mask
2. Modify pci_gart_simple to return size-aligned values.
3. Fixup the alignment calculation in the iommu-helper code.
4. Fix possible overflow in alloc_iommu's boundary_size calculation.
(It is possible that alloc_iommu()'s boundary_size overflows as
dma_get_seg_boundary can return 0xffffffff. In that case, further usage of
boundary_size triggers a BUG_ON() in the iommu code.)

End result: When allocating from IOMMU, pci_alloc_consistent/dma_alloc_coherent
will now return a size aligned value.

Signed-off-by: Prarit Bhargava <prarit [at] redhat>

diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index faf3229..717ae64 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -83,7 +83,8 @@ AGPEXTERN __u32 *agp_gatt_table;
static unsigned long next_bit; /* protected by iommu_bitmap_lock */
static int need_flush; /* global flush state. set for each gart wrap */

-static unsigned long alloc_iommu(struct device *dev, int size)
+static unsigned long alloc_iommu(struct device *dev, int size,
+ unsigned long mask)
{
unsigned long offset, flags;
unsigned long boundary_size;
@@ -91,16 +92,17 @@ static unsigned long alloc_iommu(struct device *dev, int size)

base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
PAGE_SIZE) >> PAGE_SHIFT;
- boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
+ boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
PAGE_SIZE) >> PAGE_SHIFT;

spin_lock_irqsave(&iommu_bitmap_lock, flags);
offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
- size, base_index, boundary_size, 0);
+ size, base_index, boundary_size, mask);
if (offset == -1) {
need_flush = 1;
offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, 0,
- size, base_index, boundary_size, 0);
+ size, base_index, boundary_size,
+ mask);
}
if (offset != -1) {
set_bit_string(iommu_gart_bitmap, offset, size);
@@ -240,10 +242,11 @@ nonforced_iommu(struct device *dev, unsigned long addr, size_t size)
* Caller needs to check if the iommu is needed and flush.
*/
static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
- size_t size, int dir)
+ size_t size, int dir, u64 align_mask)
{
unsigned long npages = to_pages(phys_mem, size);
- unsigned long iommu_page = alloc_iommu(dev, npages);
+ unsigned long palign_mask = align_mask >> PAGE_SHIFT;
+ unsigned long iommu_page = alloc_iommu(dev, npages, palign_mask);
int i;

if (iommu_page == -1) {
@@ -266,7 +269,8 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
static dma_addr_t
gart_map_simple(struct device *dev, char *buf, size_t size, int dir)
{
- dma_addr_t map = dma_map_area(dev, virt_to_bus(buf), size, dir);
+ dma_addr_t map = dma_map_area(dev, virt_to_bus(buf), size, dir,
+ size - 1);

flush_gart();

@@ -286,7 +290,8 @@ gart_map_single(struct device *dev, void *addr, size_t size, int dir)
if (!need_iommu(dev, phys_mem, size))
return phys_mem;

- bus = gart_map_simple(dev, addr, size, dir);
+ dma_addr_t map = dma_map_area(dev, virt_to_bus(addr), size, dir, 0);
+ flush_gart();

return bus;
}
@@ -345,7 +350,7 @@ static int dma_map_sg_nonforce(struct device *dev, struct scatterlist *sg,
unsigned long addr = sg_phys(s);

if (nonforced_iommu(dev, addr, s->length)) {
- addr = dma_map_area(dev, addr, s->length, dir);
+ addr = dma_map_area(dev, addr, s->length, dir, 0);
if (addr == bad_dma_address) {
if (i > 0)
gart_unmap_sg(dev, sg, i, dir);
@@ -367,7 +372,7 @@ static int __dma_map_cont(struct device *dev, struct scatterlist *start,
int nelems, struct scatterlist *sout,
unsigned long pages)
{
- unsigned long iommu_start = alloc_iommu(dev, pages);
+ unsigned long iommu_start = alloc_iommu(dev, pages, 0);
unsigned long iommu_page = iommu_start;
struct scatterlist *s;
int i;
diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index a3b8d4c..39940e7 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -16,7 +16,7 @@ again:
index = find_next_zero_bit(map, size, start);

/* Align allocation */
- index = (index + align_mask) & ~align_mask;
+ index = __ALIGN_MASK(index, align_mask);

end = index + nr;
if (end >= size)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


joro at 8bytes

Jul 23, 2008, 3:10 PM

Post #2 of 43 (725 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> addresses.
>
> >From Documentation/DMA-mapping.txt:
>
> "pci_alloc_consistent returns two values: the virtual address which you
> can use to access it from the CPU and dma_handle which you pass to the
> card.
>
> The cpu return address and the DMA bus master address are both
> guaranteed to be aligned to the smallest PAGE_SIZE order which
> is greater than or equal to the requested size. This invariant
> exists (for example) to guarantee that if you allocate a chunk
> which is smaller than or equal to 64 kilobytes, the extent of the
> buffer you receive will not cross a 64K boundary."

Interesting. Have you experienced any problems because of that
misbehavior in the GART code? AMD IOMMU currently also violates this
requirement. I will send a patch to fix that there too.

Joerg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 23, 2008, 4:14 PM

Post #3 of 43 (723 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Thu, 24 Jul 2008 00:10:33 +0200
Joerg Roedel <joro [at] 8bytes> wrote:

> On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
> > pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> > addresses.
> >
> > >From Documentation/DMA-mapping.txt:
> >
> > "pci_alloc_consistent returns two values: the virtual address which you
> > can use to access it from the CPU and dma_handle which you pass to the
> > card.
> >
> > The cpu return address and the DMA bus master address are both
> > guaranteed to be aligned to the smallest PAGE_SIZE order which
> > is greater than or equal to the requested size. This invariant
> > exists (for example) to guarantee that if you allocate a chunk
> > which is smaller than or equal to 64 kilobytes, the extent of the
> > buffer you receive will not cross a 64K boundary."
>
> Interesting. Have you experienced any problems because of that
> misbehavior in the GART code? AMD IOMMU currently also violates this
> requirement. I will send a patch to fix that there too.

IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
wondered what problem he hit.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 23, 2008, 4:23 PM

Post #4 of 43 (721 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wed, 23 Jul 2008 07:19:43 -0400
Prarit Bhargava <prarit [at] redhat> wrote:

> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> addresses.
>
> From Documentation/DMA-mapping.txt:
>
> "pci_alloc_consistent returns two values: the virtual address which you
> can use to access it from the CPU and dma_handle which you pass to the
> card.
>
> The cpu return address and the DMA bus master address are both
> guaranteed to be aligned to the smallest PAGE_SIZE order which
> is greater than or equal to the requested size. This invariant
> exists (for example) to guarantee that if you allocate a chunk
> which is smaller than or equal to 64 kilobytes, the extent of the
> buffer you receive will not cross a 64K boundary."
>
> 1. Modify alloc_iommu to allow for an alignment mask
> 2. Modify pci_gart_simple to return size-aligned values.
> 3. Fixup the alignment calculation in the iommu-helper code.

Hmm, you don't fix anything in the helper code. You just use
__ALIGN_MASK.

And please don't use __ALIGN_MASK. You will notice that no one in
mainline use __ALIGN_MASK. Andrew said that we should not.

Ideally, we should use ALIGN here but the current code was accepted
because it is pretty simple and we have an comment what we do here. So
please let it alone.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 23, 2008, 4:24 PM

Post #5 of 43 (720 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

FUJITA Tomonori wrote:
> On Wed, 23 Jul 2008 07:19:43 -0400
> Prarit Bhargava <prarit [at] redhat> wrote:
>
>
>> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
>> addresses.
>>
>> From Documentation/DMA-mapping.txt:
>>
>> "pci_alloc_consistent returns two values: the virtual address which you
>> can use to access it from the CPU and dma_handle which you pass to the
>> card.
>>
>> The cpu return address and the DMA bus master address are both
>> guaranteed to be aligned to the smallest PAGE_SIZE order which
>> is greater than or equal to the requested size. This invariant
>> exists (for example) to guarantee that if you allocate a chunk
>> which is smaller than or equal to 64 kilobytes, the extent of the
>> buffer you receive will not cross a 64K boundary."
>>
>> 1. Modify alloc_iommu to allow for an alignment mask
>> 2. Modify pci_gart_simple to return size-aligned values.
>> 3. Fixup the alignment calculation in the iommu-helper code.
>>
>
> Hmm, you don't fix anything in the helper code. You just use
> __ALIGN_MASK.
>
>
Yeah -- I meant "fixup" as opposed to fix.

> And please don't use __ALIGN_MASK. You will notice that no one in
> mainline use __ALIGN_MASK. Andrew said that we should not.
>
> Ideally, we should use ALIGN here but the current code was accepted
> because it is pretty simple and we have an comment what we do here. So
> please let it alone.
>

I'll resubmit shortly ...

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 23, 2008, 4:47 PM

Post #6 of 43 (719 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

>> Interesting. Have you experienced any problems because of that
>> misbehavior in the GART code? AMD IOMMU currently also violates this
>> requirement. I will send a patch to fix that there too.
>>
>
>

Joerg, yes I can see misbehavior caused by this code. O/w I wouldn't
be spending my time fixing it :) :)

See below ....

> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> wondered what problem he hit.
>

I wonder if IBM's Calgary IOMMU needs this fix? ... I've added Ed
Pollard to find out.

On big memory footprint (16G or above) systems it is possible that the
e820 map reserves most of the lower 4G of memory for system use*. So
it's possible that the 4G region is almost completely reserved at boot
time and so the kernel starts using the IOMMU for DMA (see
dma_alloc_coherent()). The addresses returned are not properly aligned,
and this causes serious problems for some drivers that require a
physical aligned address for the device.

P.

* I have one large system with 64G of memory on which I can reproduce
this issue very quickly. Even booting the system with mem=16G seems to
cause the problem, although I did have to load a module that reserved a
few M of DMA addresses before I started alloc'ing from the IOMMU.

I also reproduced this on a smaller system by loading one module that
reserved as much DMA-able region as possible, and then loaded another
module that reserved from the IOMMU. While this situation is a bit
contrived the bug still hit -- the returned addresses are not properly
aligned.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


joro at 8bytes

Jul 24, 2008, 12:46 AM

Post #7 of 43 (719 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wed, Jul 23, 2008 at 07:47:03PM -0400, Prarit Bhargava wrote:
>
> >>Interesting. Have you experienced any problems because of that
> >>misbehavior in the GART code? AMD IOMMU currently also violates this
> >>requirement. I will send a patch to fix that there too.
> >>
> >
> >
>
> Joerg, yes I can see misbehavior caused by this code. O/w I wouldn't
> be spending my time fixing it :) :)
>
> See below ....
>
> >IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> >wondered what problem he hit.
> >
>
> I wonder if IBM's Calgary IOMMU needs this fix? ... I've added Ed
> Pollard to find out.
>
> On big memory footprint (16G or above) systems it is possible that the
> e820 map reserves most of the lower 4G of memory for system use*. So
> it's possible that the 4G region is almost completely reserved at boot
> time and so the kernel starts using the IOMMU for DMA (see
> dma_alloc_coherent()). The addresses returned are not properly aligned,
> and this causes serious problems for some drivers that require a
> physical aligned address for the device.

Do you have a list of driver which require this? I would like to
reproduce this issue. Does it also happen when you start the kernel with
iommu=force (GART should then be used for all DMA remapping) too?

Joerg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 24, 2008, 3:09 AM

Post #8 of 43 (717 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

Joerg Roedel wrote:
> On Wed, Jul 23, 2008 at 07:47:03PM -0400, Prarit Bhargava wrote:
>
>>>> Interesting. Have you experienced any problems because of that
>>>> misbehavior in the GART code? AMD IOMMU currently also violates this
>>>> requirement. I will send a patch to fix that there too.
>>>>
>>>>
>>>
>>>
>> Joerg, yes I can see misbehavior caused by this code. O/w I wouldn't
>> be spending my time fixing it :) :)
>>
>> See below ....
>>
>>
>>> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
>>> wondered what problem he hit.
>>>
>>>
>> I wonder if IBM's Calgary IOMMU needs this fix? ... I've added Ed
>> Pollard to find out.
>>
>> On big memory footprint (16G or above) systems it is possible that the
>> e820 map reserves most of the lower 4G of memory for system use*. So
>> it's possible that the 4G region is almost completely reserved at boot
>> time and so the kernel starts using the IOMMU for DMA (see
>> dma_alloc_coherent()). The addresses returned are not properly aligned,
>> and this causes serious problems for some drivers that require a
>> physical aligned address for the device.
>>
>
> Do you have a list of driver which require this?

No, I don't have a list. :(

But it seems that the skge driver suffers from this because this code
exists in the driver:

skge->mem = pci_alloc_consistent(hw->pdev, skge->mem_size,
&skge->dma);
if (!skge->mem)
return -ENOMEM;

BUG_ON(skge->dma & 7);

if ((u64)skge->dma >> 32 != ((u64) skge->dma + skge->mem_size)
>> 32) {
printk(KERN_ERR PFX "pci_alloc_consistent region crosses
4G boundary\n");
err = -EINVAL;
goto free_pci_mem;
}


If pci_alloc_consistent did the "right" thing, we should *never* see
that warning message.

In theory, any 32-bit device attempting to request larger than PAGE_SIZE
DMA memory on a system where no memory is available below 4G should show
this problem.

> I would like to
> reproduce this issue. Does it also happen when you start the kernel with
> iommu=force (GART should then be used for all DMA remapping) too?
>

Yes, this happens if you specify iommu=force on the command line.

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 24, 2008, 3:34 AM

Post #9 of 43 (714 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Thu, 24 Jul 2008 06:09:31 -0400
Prarit Bhargava <prarit [at] redhat> wrote:

>
>
> Joerg Roedel wrote:
> > On Wed, Jul 23, 2008 at 07:47:03PM -0400, Prarit Bhargava wrote:
> >
> >>>> Interesting. Have you experienced any problems because of that
> >>>> misbehavior in the GART code? AMD IOMMU currently also violates this
> >>>> requirement. I will send a patch to fix that there too.
> >>>>
> >>>>
> >>>
> >>>
> >> Joerg, yes I can see misbehavior caused by this code. O/w I wouldn't
> >> be spending my time fixing it :) :)
> >>
> >> See below ....
> >>
> >>
> >>> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> >>> wondered what problem he hit.
> >>>
> >>>
> >> I wonder if IBM's Calgary IOMMU needs this fix? ... I've added Ed
> >> Pollard to find out.
> >>
> >> On big memory footprint (16G or above) systems it is possible that the
> >> e820 map reserves most of the lower 4G of memory for system use*. So
> >> it's possible that the 4G region is almost completely reserved at boot
> >> time and so the kernel starts using the IOMMU for DMA (see
> >> dma_alloc_coherent()). The addresses returned are not properly aligned,
> >> and this causes serious problems for some drivers that require a
> >> physical aligned address for the device.
> >>
> >
> > Do you have a list of driver which require this?
>
> No, I don't have a list. :(
>
> But it seems that the skge driver suffers from this because this code
> exists in the driver:

seems? You hit the bug with this driver, right?


> skge->mem = pci_alloc_consistent(hw->pdev, skge->mem_size,
> &skge->dma);
> if (!skge->mem)
> return -ENOMEM;
>
> BUG_ON(skge->dma & 7);
>
> if ((u64)skge->dma >> 32 != ((u64) skge->dma + skge->mem_size)
> >> 32) {
> printk(KERN_ERR PFX "pci_alloc_consistent region crosses
> 4G boundary\n");
> err = -EINVAL;
> goto free_pci_mem;
> }
>
>
> If pci_alloc_consistent did the "right" thing, we should *never* see
> that warning message.

Well, I think that this is not releated with the pci_alloc_consistent
alignment problem that you talk about.

I think that the driver tries to avoid 4GB boundary crossing
problem. You can find some work to avoid this, for example:

http://www.ussg.iu.edu/hypermail/linux/kernel/0712.0/2206.html

pci_device_add() has the following code to avoid this:

pci_set_dma_seg_boundary(dev, 0xffffffff);

I suspect that the problem you talk about, alloc_consistent doesn't
return the reqeuested size aligned memory, breaks anything.


> In theory, any 32-bit device attempting to request larger than PAGE_SIZE
> DMA memory on a system where no memory is available below 4G should show
> this problem.
>
> > I would like to
> > reproduce this issue. Does it also happen when you start the kernel with
> > iommu=force (GART should then be used for all DMA remapping) too?
> >
>
> Yes, this happens if you specify iommu=force on the command line.
>
> P.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo [at] vger
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


joro at 8bytes

Jul 24, 2008, 5:37 AM

Post #10 of 43 (713 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Thu, Jul 24, 2008 at 07:34:35PM +0900, FUJITA Tomonori wrote:
> On Thu, 24 Jul 2008 06:09:31 -0400
> Prarit Bhargava <prarit [at] redhat> wrote:
> > skge->mem = pci_alloc_consistent(hw->pdev, skge->mem_size,
> > &skge->dma);
> > if (!skge->mem)
> > return -ENOMEM;
> >
> > BUG_ON(skge->dma & 7);
> >
> > if ((u64)skge->dma >> 32 != ((u64) skge->dma + skge->mem_size)
> > >> 32) {
> > printk(KERN_ERR PFX "pci_alloc_consistent region crosses
> > 4G boundary\n");
> > err = -EINVAL;
> > goto free_pci_mem;
> > }
> >
> >
> > If pci_alloc_consistent did the "right" thing, we should *never* see
> > that warning message.
>
> Well, I think that this is not releated with the pci_alloc_consistent
> alignment problem that you talk about.
>
> I think that the driver tries to avoid 4GB boundary crossing
> problem. You can find some work to avoid this, for example:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0712.0/2206.html
>
> pci_device_add() has the following code to avoid this:
>
> pci_set_dma_seg_boundary(dev, 0xffffffff);
>
> I suspect that the problem you talk about, alloc_consistent doesn't
> return the reqeuested size aligned memory, breaks anything.

But I think Prarit is right with this change. If the interface defines
this behavior the IOMMU drivers have to implement it. I am just
wondering that the problem never showed up before. The GART driver is a
few years old now.

Joerg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 24, 2008, 5:49 AM

Post #11 of 43 (713 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

> But I think Prarit is right with this change. If the interface defines
> this behavior the IOMMU drivers have to implement it. I am just
> wondering that the problem never showed up before. The GART driver is a
> few years old now.
>
>

Joerg -- there's an easy explanation for this. This will only happen
when a 32-bit device requests DMA memory and all memory below 4G is
used. Just doing a quick overview of a few systems, allocated DMA
memory is usually less than 512M of the system memory so it is unlikely
a system hits the 4G limit.

In addition to that most systems do not reserve all or most of the lower
4G in the e820 maps. Those that do are usually larger systems.

ie) The only reason we're seeing this now is because large memory
footprint systems are coming online -- IMO ;)

P.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 24, 2008, 6:32 AM

Post #12 of 43 (711 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Thu, 24 Jul 2008 14:37:55 +0200
Joerg Roedel <joro [at] 8bytes> wrote:

> On Thu, Jul 24, 2008 at 07:34:35PM +0900, FUJITA Tomonori wrote:
> > On Thu, 24 Jul 2008 06:09:31 -0400
> > Prarit Bhargava <prarit [at] redhat> wrote:
> > > skge->mem = pci_alloc_consistent(hw->pdev, skge->mem_size,
> > > &skge->dma);
> > > if (!skge->mem)
> > > return -ENOMEM;
> > >
> > > BUG_ON(skge->dma & 7);
> > >
> > > if ((u64)skge->dma >> 32 != ((u64) skge->dma + skge->mem_size)
> > > >> 32) {
> > > printk(KERN_ERR PFX "pci_alloc_consistent region crosses
> > > 4G boundary\n");
> > > err = -EINVAL;
> > > goto free_pci_mem;
> > > }
> > >
> > >
> > > If pci_alloc_consistent did the "right" thing, we should *never* see
> > > that warning message.
> >
> > Well, I think that this is not releated with the pci_alloc_consistent
> > alignment problem that you talk about.
> >
> > I think that the driver tries to avoid 4GB boundary crossing
> > problem. You can find some work to avoid this, for example:
> >
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0712.0/2206.html
> >
> > pci_device_add() has the following code to avoid this:
> >
> > pci_set_dma_seg_boundary(dev, 0xffffffff);
> >
> > I suspect that the problem you talk about, alloc_consistent doesn't
> > return the reqeuested size aligned memory, breaks anything.
>
> But I think Prarit is right with this change. If the interface defines
> this behavior the IOMMU drivers have to implement it. I am just
> wondering that the problem never showed up before. The GART driver is a
> few years old now.

Yeah, I'm not against fixing IOMMUs to make alloc_consistent return
the reqeuested size aligned memory. My point is that it's not likely
to fix anything. Even with the patch, we hit the above problem because
as I explained, the root cause of the problem is the boundary issue;
we need to fix pci-dma.c

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 24, 2008, 7:31 AM

Post #13 of 43 (710 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

>> But I think Prarit is right with this change. If the interface defines
>> this behavior the IOMMU drivers have to implement it. I am just
>> wondering that the problem never showed up before. The GART driver is a
>> few years old now.
>>
>
> Yeah, I'm not against fixing IOMMUs to make alloc_consistent return
> the reqeuested size aligned memory. My point is that it's not likely
> to fix anything. Even with the patch, we hit the above problem because
> as I explained, the root cause of the problem is the boundary issue;
> we need to fix pci-dma.c
>
>

Sorry, I misquoted code up there and was looking for a clear example of
where this would happen. The issue I'm trying to resolve didn't happen
on the skge -- it was just a convenient piece of code to examine and
point out ;)

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 24, 2008, 7:40 AM

Post #14 of 43 (711 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Thu, 24 Jul 2008 10:31:58 -0400
Prarit Bhargava <prarit [at] redhat> wrote:

>
> >> But I think Prarit is right with this change. If the interface defines
> >> this behavior the IOMMU drivers have to implement it. I am just
> >> wondering that the problem never showed up before. The GART driver is a
> >> few years old now.
> >>
> >
> > Yeah, I'm not against fixing IOMMUs to make alloc_consistent return
> > the reqeuested size aligned memory. My point is that it's not likely
> > to fix anything. Even with the patch, we hit the above problem because
> > as I explained, the root cause of the problem is the boundary issue;
> > we need to fix pci-dma.c
> >
> >
>
> Sorry, I misquoted code up there and was looking for a clear example of
> where this would happen. The issue I'm trying to resolve didn't happen
> on the skge -- it was just a convenient piece of code to examine and
> point out ;)

So please tell us what driver you hit the bug with. Can you give us
the details?

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 24, 2008, 7:45 AM

Post #15 of 43 (710 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

>>>
>>
>> Yeah, I'm not against fixing IOMMUs to make alloc_consistent return
>> the reqeuested size aligned memory. My point is that it's not likely
>> to fix anything. Even with the patch, we hit the above problem because
>> as I explained, the root cause of the problem is the boundary issue;
>> we need to fix pci-dma.c
>>
>>
>
> Sorry, I misquoted code up there and was looking for a clear example
> of where this would happen. The issue I'm trying to resolve didn't
> happen on the skge -- it was just a convenient piece of code to
> examine and point out ;)
>

Here's a case where USB has problems:

BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000f57f6800 (reserved)
BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
BIOS-e820: 00000000f5800000 - 00000000fdc00000 (reserved)
BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
BIOS-e820: 00000000fdc10000 - 00000000fdc01000 (reserved)
BIOS-e820: 00000000fdc20000 - 00000000fdc01000 (reserved)
BIOS-e820: 00000000fdc30000 - 00000000fdc01000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fec10000 - 00000000fec20000 (reserved)
BIOS-e820: 00000000fec20000 - 00000000fec30000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000fee10000 - 00000000ff800000 (reserved)
BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 00000017fffff000 (usable)

I haven't tracked down what pci_alloc_consistent went wrong in
usb-storage yet ... it seemed irrelevant because GART was broken.

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 24, 2008, 8:13 AM

Post #16 of 43 (696 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

> So please tell us what driver you hit the bug with. Can you give us
> the details?
>
>

Sorry Fujita, I had to re-install and reboot the system to get those
details.

The cciss driver is the one that really went haywire for me on the large
system.

It does two greater than PAGE_SIZE pci_alloc_consistent() calls -- one
with 0x3a800 and the other with 0x4800. Both calls seem to complete but
my cciss was completely hosed. I couldn't access it at all :(

The tg3 (which alloc's 0x8000 and 0x4000) actually worked but the
network wasn't working very well.

P.

> Thanks,
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


jbarnes at virtuousgeek

Jul 28, 2008, 3:23 PM

Post #17 of 43 (671 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wednesday, July 23, 2008 4:14 pm FUJITA Tomonori wrote:
> On Thu, 24 Jul 2008 00:10:33 +0200
>
> Joerg Roedel <joro [at] 8bytes> wrote:
> > On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
> > > pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> > > addresses.
> > >
> > > >From Documentation/DMA-mapping.txt:
> > >
> > > "pci_alloc_consistent returns two values: the virtual address which you
> > > can use to access it from the CPU and dma_handle which you pass to the
> > > card.
> > >
> > > The cpu return address and the DMA bus master address are both
> > > guaranteed to be aligned to the smallest PAGE_SIZE order which
> > > is greater than or equal to the requested size. This invariant
> > > exists (for example) to guarantee that if you allocate a chunk
> > > which is smaller than or equal to 64 kilobytes, the extent of the
> > > buffer you receive will not cross a 64K boundary."
> >
> > Interesting. Have you experienced any problems because of that
> > misbehavior in the GART code? AMD IOMMU currently also violates this
> > requirement. I will send a patch to fix that there too.
>
> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> wondered what problem he hit.

Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
my tree but it looks like a good fix. Care to send me a new patch against my
for-linus branch?

Thanks,
Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Jul 29, 2008, 7:24 AM

Post #18 of 43 (668 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

>
> Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
> my tree but it looks like a good fix. Care to send me a new patch against my
> for-linus branch?
>
>

New patch.
Attachments: upstream3.patch (4.43 KB)


jbarnes at virtuousgeek

Jul 29, 2008, 10:08 AM

Post #19 of 43 (668 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Tuesday, July 29, 2008 7:24 am Prarit Bhargava wrote:
> > Prarit, what's the latest here? The v3 patch I have from you doesn't
> > apply to my tree but it looks like a good fix. Care to send me a new
> > patch against my for-linus branch?
>
> New patch.

Thanks Prarit, I pushed this out to my for-linus branch.

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Jul 29, 2008, 5:43 PM

Post #20 of 43 (660 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Mon, 28 Jul 2008 15:23:35 -0700
Jesse Barnes <jbarnes [at] virtuousgeek> wrote:

> On Wednesday, July 23, 2008 4:14 pm FUJITA Tomonori wrote:
> > On Thu, 24 Jul 2008 00:10:33 +0200
> >
> > Joerg Roedel <joro [at] 8bytes> wrote:
> > > On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
> > > > pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> > > > addresses.
> > > >
> > > > >From Documentation/DMA-mapping.txt:
> > > >
> > > > "pci_alloc_consistent returns two values: the virtual address which you
> > > > can use to access it from the CPU and dma_handle which you pass to the
> > > > card.
> > > >
> > > > The cpu return address and the DMA bus master address are both
> > > > guaranteed to be aligned to the smallest PAGE_SIZE order which
> > > > is greater than or equal to the requested size. This invariant
> > > > exists (for example) to guarantee that if you allocate a chunk
> > > > which is smaller than or equal to 64 kilobytes, the extent of the
> > > > buffer you receive will not cross a 64K boundary."
> > >
> > > Interesting. Have you experienced any problems because of that
> > > misbehavior in the GART code? AMD IOMMU currently also violates this
> > > requirement. I will send a patch to fix that there too.
> >
> > IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> > wondered what problem he hit.
>
> Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
> my tree but it looks like a good fix. Care to send me a new patch against my
> for-linus branch?

I'm not sure how the following cast to 'unsigned long long' fixes
something on X86_64.

> Signed-off-by: Prarit Bhargava <prarit [at] redhat>
>
> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
> index 744126e..d3eb527 100644
> --- a/arch/x86/kernel/pci-gart_64.c
> +++ b/arch/x86/kernel/pci-gart_64.c
> @@ -85,7 +85,8 @@ AGPEXTERN __u32 *agp_gatt_table;
> static unsigned long next_bit; /* protected by iommu_bitmap_lock */
> static int need_flush; /* global flush state. set for each gart wrap */
>
> -static unsigned long alloc_iommu(struct device *dev, int size)
> +static unsigned long alloc_iommu(struct device *dev, int size,
> + unsigned long mask)
> {
> unsigned long offset, flags;
> unsigned long boundary_size;
> @@ -93,16 +94,17 @@ static unsigned long alloc_iommu(struct device *dev, int size)
>
> base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
> PAGE_SIZE) >> PAGE_SHIFT;
> - boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> + boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
> PAGE_SIZE) >> PAGE_SHIFT;

I don't think that the following code works since the size is not
always a power of 2.


> @@ -265,7 +268,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
> static dma_addr_t
> gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size, int dir)
> {
> - dma_addr_t map = dma_map_area(dev, paddr, size, dir);
> + dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Aug 6, 2008, 5:29 AM

Post #21 of 43 (617 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

FUJITA Tomonori wrote:
> On Mon, 28 Jul 2008 15:23:35 -0700
> Jesse Barnes <jbarnes [at] virtuousgeek> wrote:
>
>
>> On Wednesday, July 23, 2008 4:14 pm FUJITA Tomonori wrote:
>>
>>> On Thu, 24 Jul 2008 00:10:33 +0200
>>>
>>> Joerg Roedel <joro [at] 8bytes> wrote:
>>>
>>>> On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
>>>>
>>>>> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
>>>>> addresses.
>>>>>
>>>>> >From Documentation/DMA-mapping.txt:
>>>>>
>>>>> "pci_alloc_consistent returns two values: the virtual address which you
>>>>> can use to access it from the CPU and dma_handle which you pass to the
>>>>> card.
>>>>>
>>>>> The cpu return address and the DMA bus master address are both
>>>>> guaranteed to be aligned to the smallest PAGE_SIZE order which
>>>>> is greater than or equal to the requested size. This invariant
>>>>> exists (for example) to guarantee that if you allocate a chunk
>>>>> which is smaller than or equal to 64 kilobytes, the extent of the
>>>>> buffer you receive will not cross a 64K boundary."
>>>>>
>>>> Interesting. Have you experienced any problems because of that
>>>> misbehavior in the GART code? AMD IOMMU currently also violates this
>>>> requirement. I will send a patch to fix that there too.
>>>>
>>> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
>>> wondered what problem he hit.
>>>
>> Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
>> my tree but it looks like a good fix. Care to send me a new patch against my
>> for-linus branch?
>>
>
> I'm not sure how the following cast to 'unsigned long long' fixes
> something on X86_64.
>
>

You can write a very simple module that kmalloc's a pci_dev, sets up
some trivial values for the dev, and then calls pci_alloc_consistent.
You will panic 100% of the time because 'dma_get_seg_boundary(dev) + 1'
overflows an unsigned long.

>> Signed-off-by: Prarit Bhargava <prarit [at] redhat>
>>
>> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
>> index 744126e..d3eb527 100644
>> --- a/arch/x86/kernel/pci-gart_64.c
>> +++ b/arch/x86/kernel/pci-gart_64.c
>> @@ -85,7 +85,8 @@ AGPEXTERN __u32 *agp_gatt_table;
>> static unsigned long next_bit; /* protected by iommu_bitmap_lock */
>> static int need_flush; /* global flush state. set for each gart wrap */
>>
>> -static unsigned long alloc_iommu(struct device *dev, int size)
>> +static unsigned long alloc_iommu(struct device *dev, int size,
>> + unsigned long mask)
>> {
>> unsigned long offset, flags;
>> unsigned long boundary_size;
>> @@ -93,16 +94,17 @@ static unsigned long alloc_iommu(struct device *dev, int size)
>>
>> base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
>> PAGE_SIZE) >> PAGE_SHIFT;
>> - boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
>> + boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
>> PAGE_SIZE) >> PAGE_SHIFT;
>>
>
> I don't think that the following code works since the size is not
> always a power of 2.
>



>
>
>> @@ -265,7 +268,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
>> static dma_addr_t
>> gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size, int dir)
>> {
>> - dma_addr_t map = dma_map_area(dev, paddr, size, dir);
>> + dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
>>

Maybe I'm missing something -- what implies size has to be a power of two?

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Aug 6, 2008, 6:23 AM

Post #22 of 43 (616 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

>>
>>
>>> @@ -265,7 +268,7 @@ static dma_addr_t dma_map_area(struct device
>>> *dev, dma_addr_t phys_mem,
>>> static dma_addr_t
>>> gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size,
>>> int dir)
>>> {
>>> - dma_addr_t map = dma_map_area(dev, paddr, size, dir);
>>> + dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
>>>
>
> Maybe I'm missing something -- what implies size has to be a power of
> two?
>
> P.
>

Ixnay that last comment -- I geddit (duh.).

The implication of "size-1" is that size is always a power of two.

Jesse -- I'll have to post a follow-up patch to fix this.

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


fujita.tomonori at lab

Aug 6, 2008, 6:35 AM

Post #23 of 43 (620 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wed, 06 Aug 2008 08:29:49 -0400
Prarit Bhargava <prarit [at] redhat> wrote:

>
>
> FUJITA Tomonori wrote:
> > On Mon, 28 Jul 2008 15:23:35 -0700
> > Jesse Barnes <jbarnes [at] virtuousgeek> wrote:
> >
> >
> >> On Wednesday, July 23, 2008 4:14 pm FUJITA Tomonori wrote:
> >>
> >>> On Thu, 24 Jul 2008 00:10:33 +0200
> >>>
> >>> Joerg Roedel <joro [at] 8bytes> wrote:
> >>>
> >>>> On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
> >>>>
> >>>>> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
> >>>>> addresses.
> >>>>>
> >>>>> >From Documentation/DMA-mapping.txt:
> >>>>>
> >>>>> "pci_alloc_consistent returns two values: the virtual address which you
> >>>>> can use to access it from the CPU and dma_handle which you pass to the
> >>>>> card.
> >>>>>
> >>>>> The cpu return address and the DMA bus master address are both
> >>>>> guaranteed to be aligned to the smallest PAGE_SIZE order which
> >>>>> is greater than or equal to the requested size. This invariant
> >>>>> exists (for example) to guarantee that if you allocate a chunk
> >>>>> which is smaller than or equal to 64 kilobytes, the extent of the
> >>>>> buffer you receive will not cross a 64K boundary."
> >>>>>
> >>>> Interesting. Have you experienced any problems because of that
> >>>> misbehavior in the GART code? AMD IOMMU currently also violates this
> >>>> requirement. I will send a patch to fix that there too.
> >>>>
> >>> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
> >>> wondered what problem he hit.
> >>>
> >> Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
> >> my tree but it looks like a good fix. Care to send me a new patch against my
> >> for-linus branch?
> >>
> >
> > I'm not sure how the following cast to 'unsigned long long' fixes
> > something on X86_64.
> >
> >
>
> You can write a very simple module that kmalloc's a pci_dev, sets up
> some trivial values for the dev, and then calls pci_alloc_consistent.
> You will panic 100% of the time because 'dma_get_seg_boundary(dev) + 1'
> overflows an unsigned long.

You can't kmalloc pci_dev or setup some trivial values. You need to
use a proper value. The pci code does for us.

Calgary IOMMU has the same code. New AMD IOMMU has the same code too.


> >> Signed-off-by: Prarit Bhargava <prarit [at] redhat>
> >>
> >> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
> >> index 744126e..d3eb527 100644
> >> --- a/arch/x86/kernel/pci-gart_64.c
> >> +++ b/arch/x86/kernel/pci-gart_64.c
> >> @@ -85,7 +85,8 @@ AGPEXTERN __u32 *agp_gatt_table;
> >> static unsigned long next_bit; /* protected by iommu_bitmap_lock */
> >> static int need_flush; /* global flush state. set for each gart wrap */
> >>
> >> -static unsigned long alloc_iommu(struct device *dev, int size)
> >> +static unsigned long alloc_iommu(struct device *dev, int size,
> >> + unsigned long mask)
> >> {
> >> unsigned long offset, flags;
> >> unsigned long boundary_size;
> >> @@ -93,16 +94,17 @@ static unsigned long alloc_iommu(struct device *dev, int size)
> >>
> >> base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
> >> PAGE_SIZE) >> PAGE_SHIFT;
> >> - boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> >> + boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
> >> PAGE_SIZE) >> PAGE_SHIFT;
> >>
> >
> > I don't think that the following code works since the size is not
> > always a power of 2.
> >
>
>
>
> >
> >
> >> @@ -265,7 +268,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
> >> static dma_addr_t
> >> gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size, int dir)
> >> {
> >> - dma_addr_t map = dma_map_area(dev, paddr, size, dir);
> >> + dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
> >>
>
> Maybe I'm missing something -- what implies size has to be a power of two?

Yes, see iommu_area_alloc().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


prarit at redhat

Aug 6, 2008, 7:32 AM

Post #24 of 43 (616 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

FUJITA Tomonori wrote:
> On Wed, 06 Aug 2008 08:29:49 -0400
> Prarit Bhargava <prarit [at] redhat> wrote:
>
>
>> FUJITA Tomonori wrote:
>>
>>> On Mon, 28 Jul 2008 15:23:35 -0700
>>> Jesse Barnes <jbarnes [at] virtuousgeek> wrote:
>>>
>>>
>>>
>>>> On Wednesday, July 23, 2008 4:14 pm FUJITA Tomonori wrote:
>>>>
>>>>
>>>>> On Thu, 24 Jul 2008 00:10:33 +0200
>>>>>
>>>>> Joerg Roedel <joro [at] 8bytes> wrote:
>>>>>
>>>>>
>>>>>> On Wed, Jul 23, 2008 at 07:19:43AM -0400, Prarit Bhargava wrote:
>>>>>>
>>>>>>
>>>>>>> pci_alloc_consistent/dma_alloc_coherent does not return size aligned
>>>>>>> addresses.
>>>>>>>
>>>>>>> >From Documentation/DMA-mapping.txt:
>>>>>>>
>>>>>>> "pci_alloc_consistent returns two values: the virtual address which you
>>>>>>> can use to access it from the CPU and dma_handle which you pass to the
>>>>>>> card.
>>>>>>>
>>>>>>> The cpu return address and the DMA bus master address are both
>>>>>>> guaranteed to be aligned to the smallest PAGE_SIZE order which
>>>>>>> is greater than or equal to the requested size. This invariant
>>>>>>> exists (for example) to guarantee that if you allocate a chunk
>>>>>>> which is smaller than or equal to 64 kilobytes, the extent of the
>>>>>>> buffer you receive will not cross a 64K boundary."
>>>>>>>
>>>>>>>
>>>>>> Interesting. Have you experienced any problems because of that
>>>>>> misbehavior in the GART code? AMD IOMMU currently also violates this
>>>>>> requirement. I will send a patch to fix that there too.
>>>>>>
>>>>>>
>>>>> IIRC, only PARISC and POWER IOMMUs follow the above rule. So I also
>>>>> wondered what problem he hit.
>>>>>
>>>>>
>>>> Prarit, what's the latest here? The v3 patch I have from you doesn't apply to
>>>> my tree but it looks like a good fix. Care to send me a new patch against my
>>>> for-linus branch?
>>>>
>>>>
>>> I'm not sure how the following cast to 'unsigned long long' fixes
>>> something on X86_64.
>>>
>>>
>>>
>> You can write a very simple module that kmalloc's a pci_dev, sets up
>> some trivial values for the dev, and then calls pci_alloc_consistent.
>> You will panic 100% of the time because 'dma_get_seg_boundary(dev) + 1'
>> overflows an unsigned long.
>>
>
> You can't kmalloc pci_dev or setup some trivial values. You need to
> use a proper value. The pci code does for us.
>

Oops -- I meant struct device, not struct pci_dev.

Anwyay, Jesse -- is this true? I can no longer do something like:


static struct device junk_dev = {
.bus_id = "junk device",
.coherent_dma_mask = 0xffffffff,
.dma_mask = &junk_dev.coherent_dma_mask,
};

And then use that as the device target for dma_alloc_coherent? AFAIK,
that has always worked for me.

Anyhoo -- it is possible that dma_get_seg_boundary returns 0xffffffff.
Add one to that. You overflow.

> Calgary IOMMU has the same code. New AMD IOMMU has the same code too.
>
>

Then they don't handle the above problem and are broken when
dma_get_seg_boundary() returns 0xffffffff and will require patches.

/me hasn't tried out Calgary of AMD IOMMU.

>
>>>> Signed-off-by: Prarit Bhargava <prarit [at] redhat>
>>>>
>>>> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
>>>> index 744126e..d3eb527 100644
>>>> --- a/arch/x86/kernel/pci-gart_64.c
>>>> +++ b/arch/x86/kernel/pci-gart_64.c
>>>> @@ -85,7 +85,8 @@ AGPEXTERN __u32 *agp_gatt_table;
>>>> static unsigned long next_bit; /* protected by iommu_bitmap_lock */
>>>> static int need_flush; /* global flush state. set for each gart wrap */
>>>>
>>>> -static unsigned long alloc_iommu(struct device *dev, int size)
>>>> +static unsigned long alloc_iommu(struct device *dev, int size,
>>>> + unsigned long mask)
>>>> {
>>>> unsigned long offset, flags;
>>>> unsigned long boundary_size;
>>>> @@ -93,16 +94,17 @@ static unsigned long alloc_iommu(struct device *dev, int size)
>>>>
>>>> base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
>>>> PAGE_SIZE) >> PAGE_SHIFT;
>>>> - boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
>>>> + boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
>>>> PAGE_SIZE) >> PAGE_SHIFT;
>>>>
>>>>
>>> I don't think that the following code works since the size is not
>>> always a power of 2.
>>>
>>>
>>
>>
>>>
>>>
>>>> @@ -265,7 +268,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
>>>> static dma_addr_t
>>>> gart_map_simple(struct device *dev, phys_addr_t paddr, size_t size, int dir)
>>>> {
>>>> - dma_addr_t map = dma_map_area(dev, paddr, size, dir);
>>>> + dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
>>>>
>>>>
>> Maybe I'm missing something -- what implies size has to be a power of two?
>>
>
> Yes, see iommu_area_alloc().
>

/me looks and still doesn't see where the size passed into
gart_map_simple() must be a power of two. ... and if that was the case,
shouldn't we be failing all the time? I mean, I've seen values passed
into pci_alloc_consistent like 0x3820 -- clearly not a multiple of 2.

iommu_area_alloc() deals with pages, not actual sizes as
gart_map_simple() does.

If anything, I would make this simple fix:

dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);

should be

dma_addr_t map = dma_map_area(dev, paddr, size, dir, size);

because after my patch we round up the mask argument to get the correct
alignment to # of pages anyway.

P.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


jbarnes at virtuousgeek

Aug 7, 2008, 10:03 AM

Post #25 of 43 (607 views)
Permalink
Re: [PATCH]: PCI: GART iommu alignment fixes [v2] [In reply to]

On Wednesday, August 6, 2008 7:32 am Prarit Bhargava wrote:
> > You can't kmalloc pci_dev or setup some trivial values. You need to
> > use a proper value. The pci code does for us.
>
> Oops -- I meant struct device, not struct pci_dev.
>
> Anwyay, Jesse -- is this true? I can no longer do something like:
>
>
> static struct device junk_dev = {
> .bus_id = "junk device",
> .coherent_dma_mask = 0xffffffff,
> .dma_mask = &junk_dev.coherent_dma_mask,
> };
>
> And then use that as the device target for dma_alloc_coherent? AFAIK,
> that has always worked for me.

It gets dangerous since platforms are in control of some pci_dev and dev
fields, and if they don't get initialized you can get into trouble.

> > Calgary IOMMU has the same code. New AMD IOMMU has the same code too.
>
> Then they don't handle the above problem and are broken when
> dma_get_seg_boundary() returns 0xffffffff and will require patches.
>
> /me hasn't tried out Calgary of AMD IOMMU.

Would be good to find someone to do some testing on one of those platforms...

> >> Maybe I'm missing something -- what implies size has to be a power of
> >> two?
> >
> > Yes, see iommu_area_alloc().
>
> /me looks and still doesn't see where the size passed into
> gart_map_simple() must be a power of two. ... and if that was the case,
> shouldn't we be failing all the time? I mean, I've seen values passed
> into pci_alloc_consistent like 0x3820 -- clearly not a multiple of 2.
>
> iommu_area_alloc() deals with pages, not actual sizes as
> gart_map_simple() does.
>
> If anything, I would make this simple fix:
>
> dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);
>
> should be
>
> dma_addr_t map = dma_map_area(dev, paddr, size, dir, size);
>
> because after my patch we round up the mask argument to get the correct
> alignment to # of pages anyway.

Feel like respinning with a full changelog against my for-linus branch? Maybe
you can convince Tomonori-san this time. :)

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.