
Mailing List Archive: Xen: Devel

Load increase after memory upgrade (part2)

 

 



carsten at schiers

Dec 16, 2011, 6:56 AM

Post #26 of 66 (650 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

Well, it will do nothing but print out “SWIOTLB is 0% full”.

 
Does that help? Or do you think something went wrong with the patch…

 
BR,

Carsten.

 
 
 
From: Carsten Schiers
Sent: Thursday, December 15, 2011 15:53
To: Konrad Rzeszutek Wilk; Konrad Rzeszutek Wilk
Cc: linux [at] eikelenboom; zhenzhong.duan [at] oracle; Ian Campbell; lersek [at] redhat; xen-devel
Subject: RE: [Xen-devel] Load increase after memory upgrade (part2)

 
...

> which will require some fiddling around.

Here is the patch I used against classic XenLinux. Any chance you could run
it with your classic guests and see what numbers you get?

Sure, it might take a bit, but I'll try it with my 2.6.34 classic kernel.

 
Carsten.


konrad.wilk at oracle

Dec 16, 2011, 7:04 AM

Post #27 of 66 (646 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Fri, Dec 16, 2011 at 03:56:10PM +0100, Carsten Schiers wrote:
> Well, it will do nothing but print out “SWIOTLB is 0% full”.
>
>  
> Does that help? Or do you think something went wrong with the patch…
>

And you are using swiotlb=force on the 2.6.34 classic kernel and passing
in your budget-av card in it? Could you append the dmesg output please?


Thanks.
>  
> BR,
>
> Carsten.
>
>  
>  
>  
> From: Carsten Schiers
> Sent: Thursday, December 15, 2011 15:53
> To: Konrad Rzeszutek Wilk; Konrad Rzeszutek Wilk
> Cc: linux [at] eikelenboom; zhenzhong.duan [at] oracle; Ian Campbell; lersek [at] redhat; xen-devel
> Subject: RE: [Xen-devel] Load increase after memory upgrade (part2)
>
>  
> ...
>
> > which will require some fiddling around.
>
> Here is the patch I used against classic XenLinux. Any chance you could run
> it with your classic guests and see what numbers you get?
>
> Sure, it might take a bit, but I'll try it with my 2.6.34 classic kernel.
>
>  
> Carsten.
>



carsten at schiers

Dec 16, 2011, 7:51 AM

Post #28 of 66 (644 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

> And you are using swiotlb=force on the 2.6.34 classic kernel and passing in your budget-av card in it?

Yes, two of them with swiotlb=32,force.


> Could you append the dmesg output please?

Attached. You find a "normal" boot after the one with the patched kernel.

Carsten.
Attachments: dmesg.txt (30.1 KB)


konrad.wilk at oracle

Dec 16, 2011, 8:19 AM

Post #29 of 66 (650 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Fri, Dec 16, 2011 at 04:51:47PM +0100, Carsten Schiers wrote:
> > And you are using swiotlb=force on the 2.6.34 classic kernel and passing in your budget-av card in it?
>
> Yes, two of them with swiotlb=32,force.
>
>
> > Could you append the dmesg output please?
>
> Attached. You find a "normal" boot after the one with the patched kernel.

Uh, what happens when you run the driver, meaning capture stuff? I remember with
the pvops kernel you had about ~30K or so of bounces, but I'm not sure about the bootup?

Thanks for being willing to be a guinea pig while trying to fix this.
>
> Carsten.
>
>





carsten at schiers

Dec 17, 2011, 2:12 PM

Post #30 of 66 (649 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

OK, double checked. Both PCI cards enabled, running, working, but nothing but "SWIOTLB is 0% full". Any chance
to check that the patch is working? Does it print out something else with your setting? BR, Carsten.

-----Original Message-----
From: xen-devel-bounces [at] lists [mailto:xen-devel-bounces [at] lists] On behalf of Konrad Rzeszutek Wilk
Sent: Friday, December 16, 2011 17:19
To: Carsten Schiers
Cc: linux [at] eikelenboom; xen-devel; lersek [at] redhat; zhenzhong.duan [at] oracle; Ian Campbell
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)

On Fri, Dec 16, 2011 at 04:51:47PM +0100, Carsten Schiers wrote:
> > And you are using swiotlb=force on the 2.6.34 classic kernel and passing in your budget-av card in it?
>
> Yes, two of them with swiotlb=32,force.
>
>
> > Could you append the dmesg output please?
>
> Attached. You find a "normal" boot after the one with the patched kernel.

Uh, what happens when you run the driver, meaning capture stuff. I remember with the pvops you had about ~30K or so of bounces, but not sure about the bootup?

Thanks for being willing to be a guinea pig while trying to fix this.
>
> Carsten.
>
>





linux at eikelenboom

Dec 17, 2011, 4:19 PM

Post #31 of 66 (649 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

I have also done some experiments with the patch. In domU I also get the 0% full for my USB controllers with video grabbers; in dom0 I get 12% full, and both my Realtek 8169 ethernet controllers seem to use the bounce buffering ...
And that with an IOMMU (AMD)? It all seems kind of strange, although it is also working ...
I don't have much time right now, hoping to get back with a full report soon.

--
Sander

Saturday, December 17, 2011, 11:12:45 PM, you wrote:

> OK, double checked. Both PCI cards enabled, running, working, but nothing but "SWIOTLB is 0% full". Any chance
> to check that the patch is working? Does it print out something else with your setting? BR, Carsten.

> -----Original Message-----
> From: xen-devel-bounces [at] lists [mailto:xen-devel-bounces [at] lists] On behalf of Konrad Rzeszutek Wilk
> Sent: Friday, December 16, 2011 17:19
> To: Carsten Schiers
> Cc: linux [at] eikelenboom; xen-devel; lersek [at] redhat; zhenzhong.duan [at] oracle; Ian Campbell
> Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)

> On Fri, Dec 16, 2011 at 04:51:47PM +0100, Carsten Schiers wrote:
>> > And you are using swiotlb=force on the 2.6.34 classic kernel and passing in your budget-av card in it?
>>
>> Yes, two of them with swiotlb=32,force.
>>
>>
>> > Could you append the dmesg output please?
>>
>> Attached. You find a "normal" boot after the one with the patched kernel.

> Uh, what happens when you run the driver, meaning capture stuff. I remember with the pvops you had about ~30K or so of bounces, but not sure about the bootup?

> Thanks for being willing to be a guinea pig while trying to fix this.
>>
>> Carsten.
>>
>>







--
Best regards,
Sander mailto:linux [at] eikelenboom




konrad at darnok

Dec 19, 2011, 6:54 AM

Post #32 of 66 (649 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Sat, Dec 17, 2011 at 11:12:45PM +0100, Carsten Schiers wrote:
> OK, double checked. Both PCI cards enabled, running, working, but nothing but "SWIOTLB is 0% full". Any chance
> to check that the patch is working? Does it print out something else with your setting? BR, Carsten.

Hm, and with the pvops you got some numbers along with tons of 'bounce'.

The one thing that I neglected in this patch is the alloc_coherent
part.. which I don't think is that important, as we did show that the
alloc buffers are used.

I don't have anything concrete yet, but after the holidays should have a
better idea of what is happening. Thanks for being willing to test
this!
>
> -----Original Message-----
> From: xen-devel-bounces [at] lists [mailto:xen-devel-bounces [at] lists] On behalf of Konrad Rzeszutek Wilk
> Sent: Friday, December 16, 2011 17:19
> To: Carsten Schiers
> Cc: linux [at] eikelenboom; xen-devel; lersek [at] redhat; zhenzhong.duan [at] oracle; Ian Campbell
> Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)
>
> On Fri, Dec 16, 2011 at 04:51:47PM +0100, Carsten Schiers wrote:
> > > And you are using swiotlb=force on the 2.6.34 classic kernel and passing in your budget-av card in it?
> >
> > Yes, two of them with swiotlb=32,force.
> >
> >
> > > Could you append the dmesg output please?
> >
> > Attached. You find a "normal" boot after the one with the patched kernel.
>
> Uh, what happens when you run the driver, meaning capture stuff. I remember with the pvops you had about ~30K or so of bounces, but not sure about the bootup?
>
> Thanks for being willing to be a guinea pig while trying to fix this.
> >
> > Carsten.
> >
> >
>
>
>



konrad at darnok

Dec 19, 2011, 6:56 AM

Post #33 of 66 (647 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Sun, Dec 18, 2011 at 01:19:16AM +0100, Sander Eikelenboom wrote:
> I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
> And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
> I'm not having much time now, hoping to get back with a full report soon.

Hm, so in domU nothing, but in dom0 it reports something. Maybe the patch is
incorrect when running as a PV guest .. Will look at it in more detail after the
holidays. Thanks for being willing to try it out.



konrad.wilk at oracle

Jan 10, 2012, 1:55 PM

Post #34 of 66 (633 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Mon, Dec 19, 2011 at 10:56:09AM -0400, Konrad Rzeszutek Wilk wrote:
> On Sun, Dec 18, 2011 at 01:19:16AM +0100, Sander Eikelenboom wrote:
> > I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
> > And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
> > I'm not having much time now, hoping to get back with a full report soon.
>
> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
> when running as PV guest .. Will look in more details after the
> holidays. Thanks for being willing to try it out.

Good news is I am able to reproduce this with my 32-bit NIC with 3.2 domU:

[ 771.896140] SWIOTLB is 11% full
[ 776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
[ 776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
[ 776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0

but interestingly enough, if I boot the guest as the first one I do not get these bounce
requests. I will shortly boot up a Xen-O-Linux kernel and see if I get these same
numbers.




linux at eikelenboom

Jan 12, 2012, 2:06 PM

Post #35 of 66 (630 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

Hello Konrad,

Tuesday, January 10, 2012, 10:55:33 PM, you wrote:

> On Mon, Dec 19, 2011 at 10:56:09AM -0400, Konrad Rzeszutek Wilk wrote:
>> On Sun, Dec 18, 2011 at 01:19:16AM +0100, Sander Eikelenboom wrote:
>> > I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
>> > And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
>> > I'm not having much time now, hoping to get back with a full report soon.
>>
>> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
>> when running as PV guest .. Will look in more details after the
>> holidays. Thanks for being willing to try it out.

> Good news is I am able to reproduce this with my 32-bit NIC with 3.2 domU:

> [ 771.896140] SWIOTLB is 11% full
> [ 776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
> [ 776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
> [ 776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0

> but interestingly enough, if I boot the guest as the first one I do not get these bounce
> requests. I will shortly bootup a Xen-O-Linux kernel and see if I get these same
> numbers.


I started to experiment some more with what I encountered.

On dom0 I was seeing that my r8169 ethernet controllers were using bounce buffering with the dump-swiotlb module.
It was showing "12% full".
Checking in sysfs shows:
serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
32
serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
32

If I remember correctly, wasn't the allocation for dom0 changed to be at the top of memory instead of low .. somewhere between 2.6.32 and 3.0?
Could that change cause all devices to need bounce buffering, and could it therefore explain some people seeing more CPU usage for dom0?

I have forced my r8169 to use a 64-bit DMA mask (using use_dac=1):
serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
32
serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
64

This results in dump-swiotlb reporting:

[ 1265.616106] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
[ 1265.625043] SWIOTLB is 0% full
[ 1270.626085] 0 [r8169 0000:08:00.0] bounce: from:6(slow:0)to:0 map:0 unmap:0 sync:12
[ 1270.635024] SWIOTLB is 0% full
[ 1275.635091] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
[ 1275.644261] SWIOTLB is 0% full
[ 1280.654097] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10



So it has changed from 12% to 0%, although it still reports something about bouncing? Or am I misinterpreting stuff?


Another thing I was wondering about: couldn't the hypervisor offer a small window in 32-bit addressable memory to all domU's (or only to those using PCI passthrough) to be used for DMA?

(Oh yes, I haven't got a clue what I'm talking about ... so it probably makes no sense at all :-) )


--
Sander






JBeulich at suse

Jan 13, 2012, 12:12 AM

Post #36 of 66 (632 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

>>> On 12.01.12 at 23:06, Sander Eikelenboom <linux [at] eikelenboom> wrote:
> Another thing i was wondering about, couldn't the hypervisor offer a small
> window in 32bit addressable mem to all (or only when pci passthrough is used)
> domU's to be used for DMA ?

How would use of such a range be arbitrated/protected? You'd have to
ask for reservation (aka allocation) of a chunk anyway, which is as good
as using the existing interfaces to obtain address restricted memory
(and the hypervisor has a [rudimentary] mechanism to preserve some
low memory for DMA allocations).

Jan




konrad.wilk at oracle

Jan 13, 2012, 7:13 AM

Post #37 of 66 (633 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

> >> > I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
> >> > And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
> >> > I'm not having much time now, hoping to get back with a full report soon.
> >>
> >> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
> >> when running as PV guest .. Will look in more details after the
> >> holidays. Thanks for being willing to try it out.
>
> > Good news is I am able to reproduce this with my 32-bit NIC with 3.2 domU:
>
> > [ 771.896140] SWIOTLB is 11% full
> > [ 776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
> > [ 776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
> > [ 776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0
>
> > but interestingly enough, if I boot the guest as the first one I do not get these bounce
> > requests. I will shortly bootup a Xen-O-Linux kernel and see if I get these same
> > numbers.
>
>
> I started to expiriment some more with what i encountered.
>
> On dom0 i was seeing that my r8169 ethernet controllers where using bounce buffering with the dump-swiotlb module.
> It was showing "12% full".
> Checking in sysfs shows:
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
> 32
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
> 32
>
> If i remember correctly wasn't the allocation for dom0 changed to be to the top of memory instead of low .. somewhere between 2.6.32 and 3.0 ?

? We never actually had dom0 support in the upstream kernel until 2.6.37.. The 2.6.32<->2.6.36 you are
referring to must have been the trees that I spun up - but the implementation of SWIOTLB in them
had not really changed.

> Could that change cause the need for all devices to need bounce buffering and could it therefore explain some people seeing more cpu usage for dom0 ?

The issue I am seeing is not CPU usage in dom0, but rather the CPU usage in domU with guests.
And that the older domU's (XenOLinux) do not have this.

That I can't understand - the implementation in both cases _looks_ to do the same thing.
There was one issue I found in the upstream one, but even with that fix I still
get that "bounce" usage in domU.

Interestingly enough, I get that only if I have launched, destroyed, launched, etc., the guest multiple
times before I get this. Which leads me to believe this is not a kernel issue but that we
have simply fragmented the Xen memory so much that when it launches the guest all of the
memory is above 4GB. But that seems counter-intuitive, as by default Xen starts guests at the far end of
memory (so on my 16GB box it would stick a 4GB guest at 12GB->16GB roughly). The SWIOTLB
swizzles some memory under the 4GB mark, and this is where we get the bounce buffer effect
(as the memory below 4GB is then copied to the memory at 12GB->16GB).

But it does not explain why on the first couple of starts I did not see this with pvops.
And it does not seem to happen with the XenOLinux kernel, so there must be something else
in here.

>
> I have forced my r8169 to use 64bits dma mask (using use_dac=1)

Ah yes.
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
> 32
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
> 64
>
> This results in dump-swiotlb reporting:
>
> [ 1265.616106] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
> [ 1265.625043] SWIOTLB is 0% full
> [ 1270.626085] 0 [r8169 0000:08:00.0] bounce: from:6(slow:0)to:0 map:0 unmap:0 sync:12
> [ 1270.635024] SWIOTLB is 0% full
> [ 1275.635091] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
> [ 1275.644261] SWIOTLB is 0% full
> [ 1280.654097] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10

Which is what we expect. No need to bounce since the PCI adapter can reach memory
above the 4GB mark.
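
For reference, widening a device's DMA mask is done through the DMA API; a minimal sketch of what a driver's probe path typically does (illustrative only, with a made-up function name; this is not the actual r8169 use_dac code):

#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* sketch: ask for a 64-bit DMA mask so the device can be handed
 * addresses above 4GB and swiotlb no longer has to bounce for it */
static int example_enable_64bit_dma(struct pci_dev *pdev)
{
        if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
            !pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64)))
                return 0;               /* 64-bit streaming and coherent DMA */

        /* otherwise fall back to the traditional 32-bit mask */
        return pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
}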

>
>
>
> So it has changed from 12% to 0%, although it still reports something about bouncing ? or am i mis interpreting stuff ?

The bouncing can happen due to two cases:
- Memory is above 4GB
- Memory crosses a page-boundary (rarely happens).
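
In other words, the mapping path decides roughly as follows (a simplified sketch assuming a 32-bit DMA mask; the real swiotlb-xen code uses helpers along the lines of dma_capable() and range_straddles_page_boundary(), and swiotlb=force overrides the check entirely):

#include <linux/types.h>
#include <linux/mm.h>

/* simplified sketch of when a streaming mapping must be bounced */
static bool needs_bounce(u64 dma_mask, u64 machine_addr, size_t size)
{
        u64 last = machine_addr + size - 1;

        if (last > dma_mask)                    /* e.g. above 4GB for a 32-bit card */
                return true;
        if ((machine_addr ^ last) & PAGE_MASK)  /* buffer crosses a page boundary */
                return true;
        return false;
}
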
>
>
> Another thing i was wondering about, couldn't the hypervisor offer a small window in 32bit addressable mem to all (or only when pci passthrough is used) domU's to be used for DMA ?

It does. That is what the Xen SWIOTLB does with "swizzling" the pages in its pool.
But it can't do it for every part of memory. That is why there are DMA pools
which are used by graphics adapters, video capture devices, storage and network
drivers. They are used for small packet sizes so that the driver does not have
to allocate DMA buffers when it gets a 100-byte ping response. But for large
packets (say that ISO file you are downloading) it allocates memory on the fly
and "maps" it into the PCI space using the DMA API. That "mapping" sets up
a "physical memory" -> "guest memory" translation - and if that allocated
memory is above 4GB, part of this mapping is to copy ("bounce") the memory
under the 4GB (where the Xen SWIOTLB has allocated a pool), so that the adapter
can physically fetch/put the data. Once that is completed it is "sync"-ed
back, which is bouncing that data to the "allocated memory".
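
The "map on the fly" path described above is the streaming DMA API; a minimal sketch of the pattern (generic DMA API usage, not taken from any driver discussed in this thread):

#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* map a large, dynamically allocated buffer for a single transfer;
 * behind this call the Xen SWIOTLB bounces the data through its <4GB
 * pool if the buffer's machine pages are outside the device's reach */
static int example_send(struct device *dev, void *buf, size_t len)
{
        dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

        if (dma_mapping_error(dev, handle))
                return -EIO;

        /* ... program the hardware to DMA from 'handle' ... */

        dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
        return 0;
}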

So having a DMA pool is very good - and most drivers use it. The thing I can't
figure out is:
- why the DVB drivers do not seem to use it, even though they look to use the videobuf_dma
driver.
- why the XenOLinux kernel does not seem to have this problem (and this might be false -
perhaps it does have this problem and it just takes a couple of guest launches,
destructions, starts, etc to actually see it).
- are there any flags in the domain builder to say: "ok, this domain is going to
service 32-bit cards, hence build the memory from 0->4GB". This seems like
a good knob at first, but it probably is a bad idea (imagine using it by mistake
on every guest). And also nowadays most cards are PCIe and they can do 64-bit, so
it would not be that important in the future.
>
> (oh yes, i haven't got i clue what i'm talking about ... so it probably make no sense at all :-) )

Nonsense. You were on the correct path. Hopefully the level of detail hasn't
scared you off now :-)

>
>
> --
> Sander
>
>



linux at eikelenboom

Jan 15, 2012, 3:32 AM

Post #38 of 66 (633 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

Friday, January 13, 2012, 4:13:07 PM, you wrote:

>> >> > I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
>> >> > And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
>> >> > I'm not having much time now, hoping to get back with a full report soon.
>> >>
>> >> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
>> >> when running as PV guest .. Will look in more details after the
>> >> holidays. Thanks for being willing to try it out.
>>
>> > Good news is I am able to reproduce this with my 32-bit NIC with 3.2 domU:
>>
>> > [ 771.896140] SWIOTLB is 11% full
>> > [ 776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
>> > [ 776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
>> > [ 776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0
>>
>> > but interestingly enough, if I boot the guest as the first one I do not get these bounce
>> > requests. I will shortly bootup a Xen-O-Linux kernel and see if I get these same
>> > numbers.
>>
>>
>> I started to expiriment some more with what i encountered.
>>
>> On dom0 i was seeing that my r8169 ethernet controllers where using bounce buffering with the dump-swiotlb module.
>> It was showing "12% full".
>> Checking in sysfs shows:
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
>> 32
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
>> 32
>>
>> If i remember correctly wasn't the allocation for dom0 changed to be to the top of memory instead of low .. somewhere between 2.6.32 and 3.0 ?

> ? We never actually had dom0 support in the upstream kernel until 2.6.37.. The 2.6.32<->2.6.36 you are
> referring to must have been the trees that I spun up - but the implementation of SWIOTLB in them
> had not really changed.

>> Could that change cause the need for all devices to need bounce buffering and could it therefore explain some people seeing more cpu usage for dom0 ?

> The issue I am seeing is not CPU usage in dom0, but rather the CPU usage in domU with guests.
> And that the older domU's (XenOLinux) do not have this.

> That I can't understand - the implementation in both cases _looks_ to do the same thing.
> There was one issue I found in the upstream one, but even with that fix I still
> get that "bounce" usage in domU.

> Interestingly enough, I get that only if I have launched, destroyed, launched, etc, the guest multiple
> times before I get this. Which leads me to believe this is not a kernel issue but that we
> are simply fragmented the Xen memory so much, so that when it launches the guest all of the
> memory is above 4GB. But that seems counter-intuive as by default Xen starts guests at the far end of
> memory (so on my 16GB box it would stick a 4GB guest at 12GB->16GB roughly). The SWIOTLB
> swizzles some memory under the 4GB , and this is where we get the bounce buffer effect
> (as the memory from 4GB is then copied to the memory 12GB->16GB).

> But it does not explain why on the first couple of starts I did not see this with pvops.
> And it does not seem to happen with the XenOLinux kernel, so there must be something else
> in here.

>>
>> I have forced my r8169 to use 64bits dma mask (using use_dac=1)

> Ah yes.
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
>> 32
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
>> 64
>>
>> This results in dump-swiotlb reporting:
>>
>> [ 1265.616106] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
>> [ 1265.625043] SWIOTLB is 0% full
>> [ 1270.626085] 0 [r8169 0000:08:00.0] bounce: from:6(slow:0)to:0 map:0 unmap:0 sync:12
>> [ 1270.635024] SWIOTLB is 0% full
>> [ 1275.635091] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
>> [ 1275.644261] SWIOTLB is 0% full
>> [ 1280.654097] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10

> Which is what we expect. No need to bounce since the PCI adapter can reach memory
> above the 4GB mark.

>>
>>
>>
>> So it has changed from 12% to 0%, although it still reports something about bouncing ? or am i mis interpreting stuff ?

> The bouncing can happen due to two cases:
> - Memory is above 4GB
> - Memory crosses a page-boundary (rarely happens).
>>
>>
>> Another thing i was wondering about, couldn't the hypervisor offer a small window in 32bit addressable mem to all (or only when pci passthrough is used) domU's to be used for DMA ?

> It does. That is what the Xen SWIOTLB does with "swizzling" the pages in its pool.
> But it can't do it for every part of memory. That is why there are DMA pools
> which are used by graphics adapters, video capture devices,storage and network
> drivers. They are used for small packet sizes so that the driver does not have
> to allocate DMA buffers when it gets a 100bytes ping response. But for large
> packets (say that ISO file you are downloading) it allocates memory on the fly
> and "maps" it into the PCI space using the DMA API. That "mapping" sets up
> an "physical memory" -> "guest memory" translation - and if that allocated
> memory is above 4GB, part of this mapping is to copy ("bounce") the memory
> under the 4GB (where XenSWIOTLB has allocated a pool), so that the adapter
> can physically fetch/put the data. Once that is completed it is "sync"-ed
> back, which is bouncing that data to the "allocated memory".


> So having a DMA pool is very good - and most drivers use it. The thing I can't
> figure out is:
> - why the DVB do not seem to use it, even thought they look to use the videobuf_dma
> driver.
> - why the XenOLinux does not seem to have this problem (and this might be false -
> perhaps it does have this problem and it just takes a couple of guest launches,
> destructions, starts, etc to actually see it).
> - are there any flags in the domain builder to say: "ok, this domain is going to
> service 32-bit cards, hence build the memory from 0->4GB". This seems like
> a good know at first, but it probably is a bad idea (imagine using it by mistake
> on every guest). And also nowadays most cards are PCIe and they can do 64-bit, so
> it would not be that important in the future.
>>
>> (oh yes, i haven't got i clue what i'm talking about ... so it probably make no sense at all :-) )

> Nonsense. You were on the correct path . Hopefully the level of details hasn't
> scared you off now :-)

Well, it only raises some more questions :-)
The thing is, PCI passthrough, and especially the DMA part of it, all works behind the scenes without giving much output about the way it is actually working.

The thing I was wondering about is whether my AMD IOMMU is actually doing something for PV guests.
When booting with iommu=off (the machine has 8GB mem, dom0 limited to 1024M) and just starting one domU with iommu=soft, with PCI passthrough and the USB PCI cards with USB video grabbers attached to it, I would expect to find some bounce buffering going on.

(HV_START_LOW 18446603336221196288)
(FEATURES '!writable_page_tables|pae_pgdir_above_4gb')
(VIRT_BASE 18446744071562067968)
(GUEST_VERSION 2.6)
(PADDR_OFFSET 0)
(GUEST_OS linux)
(HYPERCALL_PAGE 18446744071578849280)
(LOADER generic)
(SUSPEND_CANCEL 1)
(PAE_MODE yes)
(ENTRY 18446744071594476032)
(XEN_VERSION xen-3.0)

Still i only see:

[ 47.449072] Starting SWIOTLB debug thread.
[ 47.449090] swiotlb_start_thread: Go!
[ 47.449262] xen_swiotlb_start_thread: Go!
[ 52.449158] 0 [ehci_hcd 0000:0a:00.3] bounce: from:432(slow:0)to:1329 map:1756 unmap:1781 sync:0
[ 52.449180] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:16 map:23 unmap:0 sync:0
[ 52.449187] 2 [ohci_hcd 0000:0a:00.4] bounce: from:0(slow:0)to:4 map:5 unmap:0 sync:0
[ 52.449226] SWIOTLB is 0% full
[ 57.449180] 0 ehci_hcd 0000:0a:00.3 alloc coherent: 35, free: 0
[ 57.449219] 1 ohci_hcd 0000:0a:00.6 alloc coherent: 1, free: 0
[ 57.449265] SWIOTLB is 0% full
[ 62.449176] SWIOTLB is 0% full
[ 67.449336] SWIOTLB is 0% full
[ 72.449279] SWIOTLB is 0% full
[ 77.449121] SWIOTLB is 0% full
[ 82.449236] SWIOTLB is 0% full
[ 87.449242] SWIOTLB is 0% full
[ 92.449241] SWIOTLB is 0% full
[ 172.449102] 0 [ehci_hcd 0000:0a:00.7] bounce: from:3839(slow:0)to:664 map:4486 unmap:4617 sync:0
[ 172.449123] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:82 map:111 unmap:0 sync:0
[ 172.449130] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:32 map:36 unmap:0 sync:0
[ 172.449170] SWIOTLB is 0% full
[ 177.449109] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5348(slow:0)to:524 map:5834 unmap:5952 sync:0
[ 177.449131] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:76 map:112 unmap:0 sync:0
[ 177.449138] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:4 map:6 unmap:0 sync:0
[ 177.449178] SWIOTLB is 0% full
[ 182.449143] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5349(slow:0)to:563 map:5899 unmap:5949 sync:0
[ 182.449157] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:27 map:35 unmap:0 sync:0
[ 182.449164] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:10 map:15 unmap:0 sync:0
[ 182.449204] SWIOTLB is 0% full
[ 187.449112] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5375(slow:0)to:592 map:5941 unmap:6022 sync:0
[ 187.449126] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:46 map:69 unmap:0 sync:0
[ 187.449133] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:9 map:12 unmap:0 sync:0
[ 187.449173] SWIOTLB is 0% full
[ 192.449183] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5360(slow:0)to:556 map:5890 unmap:5978 sync:0
[ 192.449226] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:52 map:74 unmap:0 sync:0
[ 192.449234] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:10 map:14 unmap:0 sync:0
[ 192.449275] SWIOTLB is 0% full

And the devices do work ... so how does that work ...

Thx for your explanation so far !

--
Sander







>>
>>
>> --
>> Sander
>>
>>



--
Best regards,
Sander mailto:linux [at] eikelenboom




konrad.wilk at oracle

Jan 17, 2012, 1:02 PM

Post #39 of 66 (631 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

> The thing i was wondering about is if my AMD IOMMU is actually doing something for PV guests.
> When booting with iommu=off machine has 8GB mem, dom0 limited to 1024M and just starting one domU with iommu=soft, with pci-passthrough and the USB pci-cards with USB videograbbers attached to it, i would expect to find some bounce buffering going.
>
> (HV_START_LOW 18446603336221196288)
> (FEATURES '!writable_page_tables|pae_pgdir_above_4gb')
> (VIRT_BASE 18446744071562067968)
> (GUEST_VERSION 2.6)
> (PADDR_OFFSET 0)
> (GUEST_OS linux)
> (HYPERCALL_PAGE 18446744071578849280)
> (LOADER generic)
> (SUSPEND_CANCEL 1)
> (PAE_MODE yes)
> (ENTRY 18446744071594476032)
> (XEN_VERSION xen-3.0)
>
> Still i only see:
>
> [ 47.449072] Starting SWIOTLB debug thread.
> [ 47.449090] swiotlb_start_thread: Go!
> [ 47.449262] xen_swiotlb_start_thread: Go!
> [ 52.449158] 0 [ehci_hcd 0000:0a:00.3] bounce: from:432(slow:0)to:1329 map:1756 unmap:1781 sync:0

There is bouncing there.
..
> [ 172.449102] 0 [ehci_hcd 0000:0a:00.7] bounce: from:3839(slow:0)to:664 map:4486 unmap:4617 sync:0

And there.. 3839 of them.
> [ 172.449123] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:82 map:111 unmap:0 sync:0
> [ 172.449130] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:32 map:36 unmap:0 sync:0
> [ 172.449170] SWIOTLB is 0% full
> [ 177.449109] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5348(slow:0)to:524 map:5834 unmap:5952 sync:0

And 5348 here!

So bounce-buffering is definitely happening with this guest.
.. snip..
>
> And the devices do work ... so how does that work ...

Most (all?) drivers are written to work with bounce-buffering.
That has never been a problem.

The issue as I understand it is that the DVB drivers allocate their buffers
from 0->4GB most (all?) of the time, so they never have to do bounce-buffering.

While the pv-ops one ends up quite frequently doing the bounce-buffering, which
implies that the DVB drivers end up allocating their buffers above the 4GB mark.
This means we end up spending some CPU time (in the guest) copying the memory
from >4GB to the 0-4GB region (and vice-versa).
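
For contrast, "allocating the buffers from 0->4GB" corresponds in DMA API terms to a coherent allocation done once up front; a hedged sketch (generic API usage with illustrative names, not the DVB drivers' actual code):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

/* allocate a capture buffer the device can always reach; under Xen the
 * coherent path returns machine-contiguous memory within the DMA mask,
 * so no per-frame bouncing is needed afterwards */
static void *example_alloc_capture_buf(struct device *dev, size_t size,
                                       dma_addr_t *bus_addr)
{
        return dma_alloc_coherent(dev, size, bus_addr, GFP_KERNEL);
}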

And I am not clear why this is happening. Hence my thought
was to run a Xen-O-Linux kernel v2.6.3X and a PVOPS v2.6.3X (where X is the
same) with the same PCI device (and the test would entail rebooting the
box in between the launches) to confirm that the Xen-O-Linux is doing something
that the PVOPS is not.

So far, I haven't had much luck compiling a Xen-O-Linux v2.6.38 kernel,
so :-(

>
> Thx for your explanation so far !

Sure thing.



pasik at iki

Jan 18, 2012, 3:28 AM

Post #40 of 66 (628 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Tue, Jan 17, 2012 at 04:02:25PM -0500, Konrad Rzeszutek Wilk wrote:
> >
> > And the devices do work ... so how does that work ...
>
> Most (all?) drivers are written to work with bounce-buffering.
> That has never been a problem.
>
> The issue as I understand is that the DVB drivers allocate their buffers
> from 0->4GB most (all the time?) so they never have to do bounce-buffering.
>
> While the pv-ops one ends up quite frequently doing the bounce-buffering, which
> implies that the DVB drivers end up allocating their buffers above the 4GB.
> This means we end up spending some CPU time (in the guest) copying the memory
> from >4GB to 0-4GB region (And vice-versa).
>
> And I am not clear why this is happening. Hence my thought
> was to run an Xen-O-Linux kernel v2.6.3X and a PVOPS v2.6.3X (where X is the
> same) with the same PCI device (and the test would entail rebooting the
> box in between the launches) to confirm that the Xen-O-Linux is doing something
> that the PVOPS is not.
>
> So far, I've haven't had much luck compiling a Xen-O-Linux v2.6.38 kernel
> so :-(
>

Did you try downloading a binary rpm (or src.rpm) from OpenSuse?
I think they have a 2.6.38 xenlinux kernel available.

-- Pasi




JBeulich at suse

Jan 18, 2012, 3:35 AM

Post #41 of 66 (629 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

>>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> The issue as I understand is that the DVB drivers allocate their buffers
> from 0->4GB most (all the time?) so they never have to do bounce-buffering.
>
> While the pv-ops one ends up quite frequently doing the bounce-buffering,
> which
> implies that the DVB drivers end up allocating their buffers above the 4GB.
> This means we end up spending some CPU time (in the guest) copying the
> memory
> from >4GB to 0-4GB region (And vice-versa).

This reminds me of something (not sure what XenoLinux you use for
comparison) - how are they allocating that memory? Not vmalloc_32()
by chance (I remember having seen numerous uses under - iirc -
drivers/media/)?

Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
what their (driver) callers might expect in a PV guest (including the
contiguity assumption for the latter, recalling that you earlier said
you were able to see the problem after several guest starts), and I
had put into our kernels an adjustment to make vmalloc_32() actually
behave as expected.
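
The call-site pattern Jan is pointing at looks roughly like this (an illustrative fragment, not the actual drivers/media code): vmalloc_32() is meant to hand back memory a 32-bit device can address, but in a PV guest that constraint only applies to pseudo-physical frames, while the backing machine frames can still sit above 4GB, so a later dma_map_sg()/dma_map_page() ends up bouncing.

#include <linux/vmalloc.h>
#include <linux/mm.h>

/* looks 32-bit DMA safe on bare metal; in a PV domU it does not
 * constrain the machine addresses the device will actually see */
static void *example_alloc_frame_buffer(unsigned long nr_pages)
{
        return vmalloc_32(nr_pages << PAGE_SHIFT);
}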

Jan




JBeulich at suse

Jan 18, 2012, 3:39 AM

Post #42 of 66 (629 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

>>> On 18.01.12 at 12:28, Pasi Kärkkäinen<pasik [at] iki> wrote:
> On Tue, Jan 17, 2012 at 04:02:25PM -0500, Konrad Rzeszutek Wilk wrote:
>> >
>> > And the devices do work ... so how does that work ...
>>
>> Most (all?) drivers are written to work with bounce-buffering.
>> That has never been a problem.
>>
>> The issue as I understand is that the DVB drivers allocate their buffers
>> from 0->4GB most (all the time?) so they never have to do bounce-buffering.
>>
>> While the pv-ops one ends up quite frequently doing the bounce-buffering,
> which
>> implies that the DVB drivers end up allocating their buffers above the 4GB.
>> This means we end up spending some CPU time (in the guest) copying the
> memory
>> from >4GB to 0-4GB region (And vice-versa).
>>
>> And I am not clear why this is happening. Hence my thought
>> was to run an Xen-O-Linux kernel v2.6.3X and a PVOPS v2.6.3X (where X is the
>> same) with the same PCI device (and the test would entail rebooting the
>> box in between the launches) to confirm that the Xen-O-Linux is doing
> something
>> that the PVOPS is not.
>>
>> So far, I've haven't had much luck compiling a Xen-O-Linux v2.6.38 kernel
>> so :-(
>>
>
> Did you try downloading a binary rpm (or src.rpm) from OpenSuse?
> I think they have 2.6.38 xenlinux kernel available.

openSUSE 11.4 is using 2.6.37; 12.1 is on 3.1 (and SLE is on 3.0).
Pulling out (consistent) patches at 2.6.38 level might be a little
involved.

Jan



konrad at darnok

Jan 18, 2012, 6:29 AM

Post #43 of 66 (632 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > The issue as I understand is that the DVB drivers allocate their buffers
> > from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> >
> > While the pv-ops one ends up quite frequently doing the bounce-buffering,
> > which
> > implies that the DVB drivers end up allocating their buffers above the 4GB.
> > This means we end up spending some CPU time (in the guest) copying the
> > memory
> > from >4GB to 0-4GB region (And vice-versa).
>
> This reminds me of something (not sure what XenoLinux you use for
> comparison) - how are they allocating that memory? Not vmalloc_32()

I was using the 2.6.18, then the one I saw on Google for Gentoo, and now
I am going to look at the 2.6.38 from OpenSuSE.

> by chance (I remember having seen numerous uses under - iirc -
> drivers/media/)?
>
> Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> what their (driver) callers might expect in a PV guest (including the
> contiguity assumption for the latter, recalling that you earlier said
> you were able to see the problem after several guest starts), and I
> had put into our kernels an adjustment to make vmalloc_32() actually
> behave as expected.

Aaah.. The plot thickens! Let me look in the sources! Thanks for the
pointer.



konrad.wilk at oracle

Jan 23, 2012, 2:32 PM

Post #44 of 66 (619 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > > The issue as I understand is that the DVB drivers allocate their buffers
> > > from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> > >
> > > While the pv-ops one ends up quite frequently doing the bounce-buffering,
> > > which
> > > implies that the DVB drivers end up allocating their buffers above the 4GB.
> > > This means we end up spending some CPU time (in the guest) copying the
> > > memory
> > > from >4GB to 0-4GB region (And vice-versa).
> >
> > This reminds me of something (not sure what XenoLinux you use for
> > comparison) - how are they allocating that memory? Not vmalloc_32()
>
> I was using the 2.6.18, then the one I saw on Google for Gentoo, and now
> I am going to look at the 2.6.38 from OpenSuSE.
>
> > by chance (I remember having seen numerous uses under - iirc -
> > drivers/media/)?
> >
> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> > what their (driver) callers might expect in a PV guest (including the
> > contiguity assumption for the latter, recalling that you earlier said
> > you were able to see the problem after several guest starts), and I
> > had put into our kernels an adjustment to make vmalloc_32() actually
> > behave as expected.
>
> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
> pointer.

Jan's hints led me to videobuf-dma-sg.c, which does indeed do vmalloc_32
and then performs PCI DMA operations on the allocated vmalloc_32
area.

So I cobbled up the attached patch (hadn't actually tested it and sadly
won't until next week) which removes the call to vmalloc_32 and instead
sets up a DMA-allocated set of pages.
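
The attachment itself is not reproduced in this archive, so purely as an illustration of the idea (not Konrad's actual change): each buffer page would come from the DMA API instead of vmalloc_32(), so its machine frame is already device-addressable.

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/errno.h>
#include <linux/mm.h>

/* sketch: allocate nr_pages single pages through the DMA API; bus[]
 * can be handed to the device directly, leaving nothing to bounce */
static int example_alloc_dma_pages(struct device *dev, int nr_pages,
                                   void **cpu, dma_addr_t *bus)
{
        int i;

        for (i = 0; i < nr_pages; i++) {
                cpu[i] = dma_alloc_coherent(dev, PAGE_SIZE, &bus[i],
                                            GFP_KERNEL);
                if (!cpu[i])
                        return -ENOMEM; /* error unwinding elided in this sketch */
        }
        return 0;
}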

If that fixes it for you that is awesome, but if it breaks please
send me your logs.

Cheers,
Konrad
Attachments: vmalloc (3.64 KB)


JBeulich at suse

Jan 24, 2012, 12:58 AM

Post #45 of 66 (616 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

>>> On 23.01.12 at 23:32, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
>> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
>> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
>> > > The issue as I understand is that the DVB drivers allocate their buffers
>> > > from 0->4GB most (all the time?) so they never have to do bounce-buffering.
>> > >
>> > > While the pv-ops one ends up quite frequently doing the bounce-buffering,
>> > > which
>> > > implies that the DVB drivers end up allocating their buffers above the
> 4GB.
>> > > This means we end up spending some CPU time (in the guest) copying the
>> > > memory
>> > > from >4GB to 0-4GB region (And vice-versa).
>> >
>> > This reminds me of something (not sure what XenoLinux you use for
>> > comparison) - how are they allocating that memory? Not vmalloc_32()
>>
>> I was using the 2.6.18, then the one I saw on Google for Gentoo, and now
>> I am going to look at the 2.6.38 from OpenSuSE.
>>
>> > by chance (I remember having seen numerous uses under - iirc -
>> > drivers/media/)?
>> >
>> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
>> > what their (driver) callers might expect in a PV guest (including the
>> > contiguity assumption for the latter, recalling that you earlier said
>> > you were able to see the problem after several guest starts), and I
>> > had put into our kernels an adjustment to make vmalloc_32() actually
>> > behave as expected.
>>
>> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
>> pointer.
>
> Jan hints lead me to the videobuf-dma-sg.c which does indeed to vmalloc_32
> and then performs PCI DMA operations on the allocted vmalloc_32
> area.
>
> So I cobbled up the attached patch (hadn't actually tested it and sadly
> won't until next week) which removes the call to vmalloc_32 and instead
> sets up DMA allocated set of pages.

What a big patch (which would need re-doing for every vmalloc_32()
caller)! Fixing vmalloc_32() would be much less intrusive (reproducing
our 3.2 version of the affected function below, but clearly that's not
pv-ops ready).

Jan

static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
                                 pgprot_t prot, int node, void *caller)
{
        const int order = 0;
        struct page **pages;
        unsigned int nr_pages, array_size, i;
        gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
#ifdef CONFIG_XEN
        gfp_t dma_mask = gfp_mask & (__GFP_DMA | __GFP_DMA32);

        BUILD_BUG_ON((__GFP_DMA | __GFP_DMA32) != (__GFP_DMA + __GFP_DMA32));
        if (dma_mask == (__GFP_DMA | __GFP_DMA32))
                gfp_mask &= ~(__GFP_DMA | __GFP_DMA32);
#endif

        nr_pages = (area->size - PAGE_SIZE) >> PAGE_SHIFT;
        array_size = (nr_pages * sizeof(struct page *));

        area->nr_pages = nr_pages;
        /* Please note that the recursion is strictly bounded. */
        if (array_size > PAGE_SIZE) {
                pages = __vmalloc_node(array_size, 1, nested_gfp|__GFP_HIGHMEM,
                                       PAGE_KERNEL, node, caller);
                area->flags |= VM_VPAGES;
        } else {
                pages = kmalloc_node(array_size, nested_gfp, node);
        }
        area->pages = pages;
        area->caller = caller;
        if (!area->pages) {
                remove_vm_area(area->addr);
                kfree(area);
                return NULL;
        }

        for (i = 0; i < area->nr_pages; i++) {
                struct page *page;
                gfp_t tmp_mask = gfp_mask | __GFP_NOWARN;

                if (node < 0)
                        page = alloc_page(tmp_mask);
                else
                        page = alloc_pages_node(node, tmp_mask, order);

                if (unlikely(!page)) {
                        /* Successfully allocated i pages, free them in __vunmap() */
                        area->nr_pages = i;
                        goto fail;
                }
                area->pages[i] = page;
#ifdef CONFIG_XEN
                if (dma_mask) {
                        if (xen_limit_pages_to_max_mfn(page, 0, 32)) {
                                area->nr_pages = i + 1;
                                goto fail;
                        }
                        if (gfp_mask & __GFP_ZERO)
                                clear_highpage(page);
                }
#endif
        }

        if (map_vm_area(area, prot, &pages))
                goto fail;
        return area->addr;

fail:
        warn_alloc_failed(gfp_mask, order,
                          "vmalloc: allocation failure, allocated %ld of %ld bytes\n",
                          (area->nr_pages*PAGE_SIZE), area->size);
        vfree(area->addr);
        return NULL;
}

...

#if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32)
#define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
#elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
#define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL
#elif defined(CONFIG_XEN)
#define GFP_VMALLOC32 __GFP_DMA | __GFP_DMA32 | GFP_KERNEL
#else
#define GFP_VMALLOC32 GFP_KERNEL
#endif
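
For context, vmalloc_32() consumes that GFP_VMALLOC32 definition roughly like this in kernels of that vintage (quoted from memory, so treat it as an approximation rather than an exact excerpt):

void *vmalloc_32(unsigned long size)
{
        return __vmalloc_node(size, 1, GFP_VMALLOC32, PAGE_KERNEL,
                              -1, __builtin_return_address(0));
}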




konrad.wilk at oracle

Jan 24, 2012, 6:17 AM

Post #46 of 66 (617 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Tue, Jan 24, 2012 at 08:58:22AM +0000, Jan Beulich wrote:
> >>> On 23.01.12 at 23:32, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
> >> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> >> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> >> > > The issue as I understand is that the DVB drivers allocate their buffers
> >> > > from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> >> > >
> >> > > While the pv-ops one ends up quite frequently doing the bounce-buffering,
> >> > > which
> >> > > implies that the DVB drivers end up allocating their buffers above the
> > 4GB.
> >> > > This means we end up spending some CPU time (in the guest) copying the
> >> > > memory
> >> > > from >4GB to 0-4GB region (And vice-versa).
> >> >
> >> > This reminds me of something (not sure what XenoLinux you use for
> >> > comparison) - how are they allocating that memory? Not vmalloc_32()
> >>
> >> I was using the 2.6.18, then the one I saw on Google for Gentoo, and now
> >> I am going to look at the 2.6.38 from OpenSuSE.
> >>
> >> > by chance (I remember having seen numerous uses under - iirc -
> >> > drivers/media/)?
> >> >
> >> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> >> > what their (driver) callers might expect in a PV guest (including the
> >> > contiguity assumption for the latter, recalling that you earlier said
> >> > you were able to see the problem after several guest starts), and I
> >> > had put into our kernels an adjustment to make vmalloc_32() actually
> >> > behave as expected.
> >>
> >> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
> >> pointer.
> >
> > Jan hints lead me to the videobuf-dma-sg.c which does indeed to vmalloc_32
> > and then performs PCI DMA operations on the allocted vmalloc_32
> > area.
> >
> > So I cobbled up the attached patch (hadn't actually tested it and sadly
> > won't until next week) which removes the call to vmalloc_32 and instead
> > sets up DMA allocated set of pages.
>
> What a big patch (which would need re-doing for every vmalloc_32()
> caller)! Fixing vmalloc_32() would be much less intrusive (reproducing
> our 3.2 version of the affected function below, but clearly that's not
> pv-ops ready).

I just want to get to the bottom of this before attempting a proper fix.

>
> Jan
>
> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> pgprot_t prot, int node, void *caller)
> {
> const int order = 0;
> struct page **pages;
> unsigned int nr_pages, array_size, i;
> gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
> #ifdef CONFIG_XEN
> gfp_t dma_mask = gfp_mask & (__GFP_DMA | __GFP_DMA32);
>
> BUILD_BUG_ON((__GFP_DMA | __GFP_DMA32) != (__GFP_DMA + __GFP_DMA32));
> if (dma_mask == (__GFP_DMA | __GFP_DMA32))
> gfp_mask &= ~(__GFP_DMA | __GFP_DMA32);
> #endif
>
> nr_pages = (area->size - PAGE_SIZE) >> PAGE_SHIFT;
> array_size = (nr_pages * sizeof(struct page *));
>
> area->nr_pages = nr_pages;
> /* Please note that the recursion is strictly bounded. */
> if (array_size > PAGE_SIZE) {
> pages = __vmalloc_node(array_size, 1, nested_gfp|__GFP_HIGHMEM,
> PAGE_KERNEL, node, caller);
> area->flags |= VM_VPAGES;
> } else {
> pages = kmalloc_node(array_size, nested_gfp, node);
> }
> area->pages = pages;
> area->caller = caller;
> if (!area->pages) {
> remove_vm_area(area->addr);
> kfree(area);
> return NULL;
> }
>
> for (i = 0; i < area->nr_pages; i++) {
> struct page *page;
> gfp_t tmp_mask = gfp_mask | __GFP_NOWARN;
>
> if (node < 0)
> page = alloc_page(tmp_mask);
> else
> page = alloc_pages_node(node, tmp_mask, order);
>
> if (unlikely(!page)) {
> /* Successfully allocated i pages, free them in __vunmap() */
> area->nr_pages = i;
> goto fail;
> }
> area->pages[i] = page;
> #ifdef CONFIG_XEN
> if (dma_mask) {
> if (xen_limit_pages_to_max_mfn(page, 0, 32)) {
> area->nr_pages = i + 1;
> goto fail;
> }
> if (gfp_mask & __GFP_ZERO)
> clear_highpage(page);
> }
> #endif
> }
>
> if (map_vm_area(area, prot, &pages))
> goto fail;
> return area->addr;
>
> fail:
> warn_alloc_failed(gfp_mask, order,
> "vmalloc: allocation failure, allocated %ld of %ld bytes\n",
> (area->nr_pages*PAGE_SIZE), area->size);
> vfree(area->addr);
> return NULL;
> }
>
> ...
>
> #if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32)
> #define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL
> #elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
> #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL
> #elif defined(CONFIG_XEN)
> #define GFP_VMALLOC32 __GFP_DMA | __GFP_DMA32 | GFP_KERNEL
> #else
> #define GFP_VMALLOC32 GFP_KERNEL
> #endif



carsten at schiers

Jan 24, 2012, 1:32 PM

Post #47 of 66 (617 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

Konrad,

I implemented the patch in a 3.1.2 kernel, but the patched function doesn't seem to be called (I set debug=1 for the module).
I think it's only for video capturing devices.

But I grepped around and found a vmalloc_32 in drivers/media/common/saa7146_core.c, line 182, function saa7146_vmalloc_build_pgtable,
which is included in module saa7146.ko. This would be the DVB chip. Maybe you can rework the patch so that we can just test what
you intended to test.

Consequently, the patch you did so far doesn't change the load.

Carsten.




-----Original Message-----
From: xen-devel-bounces [at] lists [mailto:xen-devel-bounces [at] lists] On behalf of Konrad Rzeszutek Wilk
Sent: Monday, January 23, 2012 23:32
To: Konrad Rzeszutek Wilk
Cc: Sander Eikelenboom; xen-devel; Jan Beulich
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)

On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > > The issue as I understand is that the DVB drivers allocate their
> > > buffers from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> > >
> > > While the pv-ops one ends up quite frequently doing the
> > > bounce-buffering, which implies that the DVB drivers end up
> > > allocating their buffers above the 4GB.
> > > This means we end up spending some CPU time (in the guest) copying
> > > the memory from >4GB to 0-4GB region (And vice-versa).
> >
> > This reminds me of something (not sure what XenoLinux you use for
> > comparison) - how are they allocating that memory? Not vmalloc_32()
>
> I was using the 2.6.18, then the one I saw on Google for Gentoo, and
> now I am going to look at the 2.6.38 from OpenSuSE.
>
> > by chance (I remember having seen numerous uses under - iirc -
> > drivers/media/)?
> >
> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> > what their (driver) callers might expect in a PV guest (including
> > the contiguity assumption for the latter, recalling that you earlier
> > said you were able to see the problem after several guest starts),
> > and I had put into our kernels an adjustment to make vmalloc_32()
> > actually behave as expected.
>
> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
> pointer.

Jan's hints led me to videobuf-dma-sg.c, which does indeed use vmalloc_32 and then performs PCI DMA operations on the allocated vmalloc_32 area.

So I cobbled up the attached patch (hadn't actually tested it and sadly won't until next week) which removes the call to vmalloc_32 and instead sets up a DMA-allocated set of pages.

If that fixes it for you, that is awesome, but if it breaks, please send me your logs.

Cheers,
Konrad
_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel




carsten at schiers

Jan 25, 2012, 4:02 AM

Post #48 of 66 (617 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

I can now confirm that saa7146_vmalloc_build_pgtable and vmalloc_to_sg are called once per
PCI card and will allocate 329 pages. Sorry, but I am not in a position to modify your patch
so that it patches these functions in the right way, but I am happy to test...
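
One way to check where those 329 pages actually land in machine address space (purely illustrative, not necessarily how the numbers above were obtained) is to walk the buffer and count the frames at or above 4 GB:

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <asm/xen/page.h>

/* Debug aid: report how many backing pages of a vmalloc_32() buffer would
 * need bounce buffering because their machine frame sits at or above 4 GB. */
static void count_high_mfns(void *buf, int nr_pages)
{
	int i, high = 0;

	for (i = 0; i < nr_pages; i++) {
		struct page *pg = vmalloc_to_page((char *)buf + i * PAGE_SIZE);
		unsigned long mfn = pfn_to_mfn(page_to_pfn(pg));

		if (mfn >= (1UL << (32 - PAGE_SHIFT)))
			high++;
	}
	printk(KERN_INFO "buffer: %d of %d pages above 4 GB\n", high, nr_pages);
}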

 
BR, Carsten.
 
-----Original Message-----
To: Konrad Rzeszutek Wilk <konrad [at] darnok>;
Cc: Sander Eikelenboom <linux [at] eikelenboom>; xen-devel <xen-devel [at] lists>; Jan Beulich <JBeulich [at] suse>;
From: Konrad Rzeszutek Wilk <konrad.wilk [at] oracle>
Sent: Mon 23.01.2012 23:42
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)
Attachment: vmalloc
On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > > The issue as I understand is that the DVB drivers allocate their buffers
> > > from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> > >
> > > While the pv-ops one ends up quite frequently doing the bounce-buffering,
> > > which
> > > implies that the DVB drivers end up allocating their buffers above the 4GB.
> > > This means we end up spending some CPU time (in the guest) copying the
> > > memory
> > > from >4GB to 0-4GB region (And vice-versa).
> >
> > This reminds me of something (not sure what XenoLinux you use for
> > comparison) - how are they allocating that memory? Not vmalloc_32()
>
> I was using the 2.6.18, then the one I saw on Google for Gentoo, and now
> I am going to look at the 2.6.38 from OpenSuSE.
>
> > by chance (I remember having seen numerous uses under - iirc -
> > drivers/media/)?
> >
> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> > what their (driver) callers might expect in a PV guest (including the
> > contiguity assumption for the latter, recalling that you earlier said
> > you were able to see the problem after several guest starts), and I
> > had put into our kernels an adjustment to make vmalloc_32() actually
> > behave as expected.
>
> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
> pointer.

Jan's hints led me to videobuf-dma-sg.c, which does indeed use vmalloc_32
and then performs PCI DMA operations on the allocated vmalloc_32 area.

So I cobbled up the attached patch (hadn't actually tested it and sadly
won't until next week) which removes the call to vmalloc_32 and instead
sets up a DMA-allocated set of pages.

If that fixes it for you, that is awesome, but if it breaks, please
send me your logs.

Cheers,
Konrad
_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


carsten at schiers

Jan 25, 2012, 11:06 AM

Post #49 of 66 (617 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

Some news: in order to prepare a clean setup, I upgraded to the 3.2.1 kernel. I noticed that the load increase is
reduced a bit, but noticeably. It's only a simple test, running the DomU for two minutes, but the idle load is approximately:

- 2.6.32 pvops 12-13%
- 3.2.1 pvops 10-11%
- 2.6.34 XenoLinux 7-8%

BR, Carsten.


-----Original Message-----
From: xen-devel-bounces [at] lists [mailto:xen-devel-bounces [at] lists] On behalf of Konrad Rzeszutek Wilk
Sent: Monday, 23 January 2012 23:32
To: Konrad Rzeszutek Wilk
Cc: Sander Eikelenboom; xen-devel; Jan Beulich
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)

On Wed, Jan 18, 2012 at 10:29:23AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 18, 2012 at 11:35:35AM +0000, Jan Beulich wrote:
> > >>> On 17.01.12 at 22:02, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > > The issue as I understand is that the DVB drivers allocate their
> > > buffers from 0->4GB most (all the time?) so they never have to do bounce-buffering.
> > >
> > > While the pv-ops one ends up quite frequently doing the
> > > bounce-buffering, which implies that the DVB drivers end up
> > > allocating their buffers above the 4GB.
> > > This means we end up spending some CPU time (in the guest) copying
> > > the memory from >4GB to 0-4GB region (And vice-versa).
> >
> > This reminds me of something (not sure what XenoLinux you use for
> > comparison) - how are they allocating that memory? Not vmalloc_32()
>
> I was using the 2.6.18, then the one I saw on Google for Gentoo, and
> now I am going to look at the 2.6.38 from OpenSuSE.
>
> > by chance (I remember having seen numerous uses under - iirc -
> > drivers/media/)?
> >
> > Obviously, vmalloc_32() and any GFP_DMA32 allocations do *not* do
> > what their (driver) callers might expect in a PV guest (including
> > the contiguity assumption for the latter, recalling that you earlier
> > said you were able to see the problem after several guest starts),
> > and I had put into our kernels an adjustment to make vmalloc_32()
> > actually behave as expected.
>
> Aaah.. The plot thickens! Let me look in the sources! Thanks for the
> pointer.

Jan's hints led me to videobuf-dma-sg.c, which does indeed use vmalloc_32 and then performs PCI DMA operations on the allocated vmalloc_32 area.

So I cobbled up the attached patch (hadn't actually tested it and sadly won't until next week) which removes the call to vmalloc_32 and instead sets up a DMA-allocated set of pages.

If that fixes it for you, that is awesome, but if it breaks, please send me your logs.

Cheers,
Konrad
_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel




konrad.wilk at oracle

Jan 25, 2012, 1:02 PM

Post #50 of 66 (620 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Wed, Jan 25, 2012 at 08:06:12PM +0100, Carsten Schiers wrote:
> Some news: in order to prepare a clean setup, I upgraded to the 3.2.1 kernel. I noticed that the load increase is
> reduced a bit, but noticeably. It's only a simple test, running the DomU for two minutes, but the idle load is approximately:
>
> - 2.6.32 pvops 12-13%
> - 3.2.1 pvops 10-11%

Yeah. I think this is due to the fix I added in xen-swiotlb to not always
do the bounce copying.

> - 2.6.34 XenoLinux 7-8%
>
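
A minimal sketch of the check being referred to, assuming it boils down to a dma_capable() test before falling back to the bounce path; bounce_map() is a made-up placeholder and this is not the actual xen-swiotlb source:

#include <linux/dma-mapping.h>
#include <linux/swiotlb.h>

/* Hypothetical stand-in for the real bounce-buffer slow path. */
static dma_addr_t bounce_map(struct device *dev, phys_addr_t phys, size_t size);

static dma_addr_t xen_map_page_sketch(struct device *dev, phys_addr_t phys,
				      dma_addr_t dev_addr, size_t size)
{
	/* Fast path: the device can reach the machine address directly, no copy. */
	if (dma_capable(dev, dev_addr, size) && !swiotlb_force)
		return dev_addr;

	/* Slow path: copy through a bounce buffer below the device's DMA mask. */
	return bounce_map(dev, phys, size);
}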

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel
