
Mailing List Archive: Xen: Devel

Load increase after memory upgrade (part2)

 

 



carsten at schiers

Nov 24, 2011, 4:28 AM

Post #1 of 66
Load increase after memory upgrade (part2)

Hello again, I would like to come back to that thing...sorry that I did not have the time up to now.

 
We (now) speak about

 
* Xen 4.1.2
* Dom0 is Jeremy's 2.6.32.46 64 bit
* DomU in question is now 3.1.2 64 bit
* Same thing if DomU is also 2.6.32.46
* DomU owns two PCI cards (DVB-C) that do DMA
* Machine has 8GB, Dom0 pinned at 512MB

 
As compared to the 2.6.34 kernel with backported patches, the load on the DomU is at least twice as high. It
will be "close to normal" if I reduce the memory used to 4GB.

 
As you can see from the attachment, you once had an idea. So should we try to find something...?

 
Carsten.
 
-----Original Message-----
To: konrad.wilk <konrad.wilk [at] oracle>;
CC: linux <linux [at] eikelenboom>; xen-devel <xen-devel [at] lists>;
From: Carsten Schiers <carsten [at] schiers>
Sent: Wed 29.06.2011 23:17
Subject: AW: Re: Re: Re: AW: Re: [Xen-devel] AW: Load increase after memory upgrade?
> Let's first do the c) experiment as that will likely explain your load average increase.
...
> >c). If you want to see if the fault here lies in the bounce buffer being used more
> >often in the DomU b/c you have 8GB of memory now and you end up using more pages
> >past 4GB (in DomU), I can cook up a patch to figure this out. But an easier way is
> >to just do (on the Xen hypervisor line): mem=4G and that will make it think you only have
> >4GB of physical RAM. If the load comes back to the normal "amount" then the likely
> >culprit is that and we can think on how to fix this.

You are on the right track. Load was going down to "normal" 10% when reducing
Xen to 4GB by the parameter. Load seems to be still a little, little bit lower
with Xenified Kernel (8-9%), but this is drastically lower than the 20% we had
before.


konrad at darnok

Nov 25, 2011, 10:42 AM

Post #2 of 66
Re: Load increase after memory upgrade (part2)

On Thu, Nov 24, 2011 at 01:28:44PM +0100, Carsten Schiers wrote:
> Hello again, I would like to come back to that thing...sorry that I did not have the time up to now.
>
>
> We (now) speak about
>
> * Xen 4.1.2
> * Dom0 is Jeremy's 2.6.32.46 64 bit
> * DomU in question is now 3.1.2 64 bit
> * Same thing if DomU is also 2.6.32.46
> * DomU owns two PCI cards (DVB-C) that do DMA
> * Machine has 8GB, Dom0 pinned at 512MB
>
> As compared to the 2.6.34 kernel with backported patches, the load on the DomU is at least twice as high. It
> will be "close to normal" if I reduce the memory used to 4GB.

That is in the dom0 or just in general on the machine?
>
>
> As you can see from the attachment, you once had an idea. So should we try to find something...?

I think that was to instrument swiotlb to give an idea of how
often it is called and basically have a matrix of its load. And
from there figure out if the issue is that:

1). The drivers allocate/bounce/deallocate buffers on every interrupt
(bad, the driver should be using some form of DMA pool, and most of the
ivtv ones do that)

2). The buffers allocated to the drivers are above the 4GB mark and we end
up bouncing them needlessly. That can happen if the dom0 has most of
the precious memory under 4GB. However, that is usually not the case,
as the domain is usually allocated from the top of the memory. The
fix for that was to set dom0_mem=max:XX ... but with Dom0 kernels
before 3.1, the parameter would be ignored, so you had to use
'mem=XX' on the Linux command line as well (see the boot-entry sketch after this list).

3). Where did you get the load values? Was it dom0? or domU?
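For illustration, a minimal sketch of the boot entry item 2) describes, assuming a GRUB-legacy menu.lst and placeholder kernel/initrd paths (adjust to your installation):

    # dom0_mem=max:512M on the Xen line caps dom0 at 512MB.
    # Pre-3.1 dom0 kernels ignore it, so mem=512M is repeated on the Linux line.
    # For the earlier experiment, mem=4G on the Xen line instead makes Xen believe
    # the machine only has 4GB of physical RAM.
    title  Xen 4.1.2, dom0 Linux 2.6.32.46
    root   (hd0,0)
    kernel /boot/xen-4.1.2.gz dom0_mem=max:512M
    module /boot/vmlinuz-2.6.32.46-xen root=/dev/sda1 ro mem=512M
    module /boot/initrd-2.6.32.46-xen.img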



>
>
> Carsten.
>
> -----Original Message-----
> To: konrad.wilk <konrad.wilk [at] oracle>;
> CC: linux <linux [at] eikelenboom>; xen-devel <xen-devel [at] lists>;
> From: Carsten Schiers <carsten [at] schiers>
> Sent: Wed 29.06.2011 23:17
> Subject: AW: Re: Re: Re: AW: Re: [Xen-devel] AW: Load increase after memory upgrade?
> > Let's first do the c) experiment as that will likely explain your load average increase.
> ...
> > >c). If you want to see if the fault here lies in the bounce buffer being used more
> > >often in the DomU b/c you have 8GB of memory now and you end up using more pages
> > >past 4GB (in DomU), I can cook up a patch to figure this out. But an easier way is
> > >to just do (on the Xen hypervisor line): mem=4G and that will make it think you only have
> > >4GB of physical RAM. If the load comes back to the normal "amount" then the likely
> > >culprit is that and we can think on how to fix this.
>
> You are on the right track. Load was going down to "normal" 10% when reducing
> Xen to 4GB by the parameter. Load seems to be still a little, little bit lower
> with Xenified Kernel (8-9%), but this is drastically lower than the 20% we had
> before.



carsten at schiers

Nov 25, 2011, 2:11 PM

Post #3 of 66
Re: Load increase after memory upgrade (part2)

I got the values in DomU. I will have

- approx. 5% load in DomU with 2.6.34 Xenified Kernel
- approx. 15% load in DomU with 2.6.32.46 Jeremy or 3.1.2 Kernel with one card attached
- approx. 30% load in DomU with 2.6.32.46 Jeremy or 3.1.2 Kernel with two cards attached

I looked through my old mails from you and you explained already the necessity of double
bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
Xenified kernel not have this kind of issue?

The driver in question is nearly identical between the two kernel versions. It is in
drivers/media/dvb/ttpci by the way, and if I understood the code right, the allocation in
question is:

        /* allocate and init buffers */
        av7110->debi_virt = pci_alloc_consistent(pdev, 8192, &av7110->debi_bus);
        if (!av7110->debi_virt)
                goto err_saa71466_vfree_4;

isn't it? I think the cards are constantly transferring the stream received through DMA.
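For illustration, a minimal sketch (not the actual av7110 code) of the two DMA styles being discussed: a coherent buffer allocated once at init, which needs no bouncing afterwards, versus a streaming mapping per transfer, which swiotlb may have to bounce. The my_* names and MY_BUF_SIZE are made up; the DMA calls themselves are standard kernel APIs.

    /* Sketch only; my_state/my_init/my_rx are hypothetical names. */
    #include <linux/pci.h>
    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    #define MY_BUF_SIZE 8192

    struct my_state {
            void       *debi_virt;  /* CPU address of the coherent buffer */
            dma_addr_t  debi_bus;   /* bus address the card DMAs to/from  */
    };

    /* Style 1: one coherent allocation at init time (as in the snippet above).
     * The buffer is already DMA-addressable, so no bouncing is needed later. */
    static int my_init(struct pci_dev *pdev, struct my_state *s)
    {
            s->debi_virt = pci_alloc_consistent(pdev, MY_BUF_SIZE, &s->debi_bus);
            if (!s->debi_virt)
                    return -ENOMEM;
            return 0;
    }

    /* Style 2: a streaming mapping per transfer. If buf is not addressable
     * by the device (e.g. above 4GB), swiotlb copies it through a bounce
     * buffer on every map/sync/unmap - the overhead being chased here. */
    static int my_rx(struct pci_dev *pdev, void *buf)
    {
            dma_addr_t bus = dma_map_single(&pdev->dev, buf, MY_BUF_SIZE,
                                            DMA_FROM_DEVICE);
            if (dma_mapping_error(&pdev->dev, bus))
                    return -EIO;
            /* ... device DMAs into buf here ... */
            dma_unmap_single(&pdev->dev, bus, MY_BUF_SIZE, DMA_FROM_DEVICE);
            return 0;
    }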

I have set dom0_mem=512M by the way, shall I change that in some way?

I can try out some things, if you want me to. But I have no idea what to do and where to
start, so I rely on your help...

Carsten.



carsten at schiers

Nov 26, 2011, 1:14 AM

Post #4 of 66
Re: Load increase after memory upgrade (part2)

To add (read from some munin statistics I made over time):

- with load I mean the %CPU of xentop
- there is no change in CPU usage of the DomU or Dom0
- xenpm shows the core dedicated to that DomU is doing more work

Also I need to say that the reduction to 4GB was performed by the Xen parameter.

Carsten.




konrad at darnok

Nov 28, 2011, 7:28 AM

Post #5 of 66
Re: Load increase after memory upgrade (part2)

On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:
> I got the values in DomU. I will have
>
> - approx. 5% load in DomU with 2.6.34 Xenified Kernel
> - approx. 15% load in DomU with 2.6.32.46 Jeremy or 3.1.2 Kernel with one card attached
> - approx. 30% load in DomU with 2.6.32.46 Jeremy or 3.1.2 Kernel with two cards attached

HA!

I just wonder if the issue is that the reporting of CPU time spent is wrong.
Laszlo Ersek and Zhenzhong Duan have both reported a bug in the pvops
code when it came to accounting of CPU time.

>
> I looked through my old mails from you and you explained already the necessity of double
> bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
> Xenified kernel not have this kind of issue?

That is a puzzle. It should not. The code is very much the same - both
use the generic SWIOTLB which has not changed for years.
>
> The driver in question is nearly identical between the two kernel versions. It is in
> drivers/media/dvb/ttpci by the way, and if I understood the code right, the allocation in
> question is:
>
> /* allocate and init buffers */
> av7110->debi_virt = pci_alloc_consistent(pdev, 8192, &av7110->debi_bus);

Good. So it allocates it during init and uses it.
> if (!av7110->debi_virt)
> goto err_saa71466_vfree_4;
>
> isn't it? I think the cards are constantly transferring the stream received through DMA.

Yeah, and that memory is set aside for the life of the driver. So there
should be no bounce buffering happening (as it allocated the memory
below the 4GB mark).
>
> I have set dom0_mem=512M by the way, shall I change that in some way?

Does the reporting (CPU usage of DomU) change in any way with that?
>
> I can try out some things, if you want me to. But I have no idea what to do and where to
> start, so I rely on your help...
>
> Carsten.


konrad at darnok

Nov 28, 2011, 7:30 AM

Post #6 of 66
Re: Load increase after memory upgrade (part2)

On Sat, Nov 26, 2011 at 10:14:08AM +0100, Carsten Schiers wrote:
> To add (read from some munin statistics I made over time):
>
> - with load I mean the %CPU of xentop
> - there is no change in CPU usage of the DomU or Dom0

Uhh, which metric are you using for that? CPU usage...? Is this if you
change the DomU or the amount of memory the guest has? This is not
the load number (xentop value)?

> - xenpm shows the core dedicated to that DomU is doing more work
>
> Also I need to say that the reduction to 4GB was performed by the Xen parameter.
>
> Carsten.


Ian.Campbell at citrix

Nov 28, 2011, 7:40 AM

Post #7 of 66
Re: Load increase after memory upgrade (part2)

On Mon, 2011-11-28 at 15:28 +0000, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:

> > I looked through my old mails from you and you explained already the necessity of double
> > bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
> > Xenified kernel not have this kind of issue?
>
> That is a puzzle. It should not. The code is very much the same - both
> use the generic SWIOTLB which has not changed for years.

The swiotlb-xen used by classic-xen kernels (which I assume is what
Carsten means by "Xenified") isn't exactly the same as the stuff in
mainline Linux, it's been heavily refactored for one thing. It's not
impossible that mainline is bouncing something it doesn't really need
to.

It's also possible that the dma mask of the device is different/wrong in
mainline leading to such additional bouncing.
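For illustration, a sketch of what that would look like at the driver level, assuming a hypothetical my_probe(); whether the budget_av/saa7146 driver sets its masks exactly like this is an assumption, not something shown in this thread:

    /* Sketch only. The streaming mask (dma_mask) is what swiotlb consults
     * when deciding whether a buffer must be bounced; the coherent mask is
     * used by pci_alloc_consistent()/dma_alloc_coherent(). A narrower or
     * wrongly-set mask in one kernel means more bouncing in that kernel. */
    #include <linux/pci.h>
    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    static int my_probe(struct pci_dev *pdev)
    {
            if (pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))
                    return -EIO;    /* treat it as a 32-bit-only DMA device */
            if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)))
                    return -EIO;
            return 0;
    }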

I guess it's also possible that the classic-Xen kernels are playing fast
and loose by not bouncing something they should (although if so they
appear to be getting away with it...) or that there is some difference
which really means mainline needs to bounce while classic-Xen doesn't.

Ian.





carsten at schiers

Nov 28, 2011, 7:52 AM

Post #8 of 66
Re: Load increase after memory upgrade (part2)

Hi,

 
let me try to explain a bit more. Here you see the output of my xentop munin graph for a
week. Only take a look at the bluish buckle. Notice the small step in front? So it's the CPU
permille used by the DomU that owns the cards. The small buckle is when I only put in
one PCI card. Afterwards the load is constantly noticeably higher. See that Dom0 (green) is
not impacted. I am back to the Xenified kernel, as you can see.

In the next picture you see the output of xenpm visualized. So this might be an indicator that
really something happens. It's only the core that I dedicated to that DomU. I have a three-core
AMD CPU by the way:

In CPU usage of the Dom0, there is nothing to see:

In CPU usage of the DomU, there is also not much to see, possibly a very slight change of
mix:

There is a slight increase in sleeping jobs at the time slot in question, I guess nothing we can
directly map to the issue:

If you need other charts, I can try to produce them.

 
BR,
Carsten.

 


konrad.wilk at oracle

Nov 28, 2011, 8:45 AM

Post #9 of 66
Re: Load increase after memory upgrade (part2)

On Mon, Nov 28, 2011 at 03:40:13PM +0000, Ian Campbell wrote:
> On Mon, 2011-11-28 at 15:28 +0000, Konrad Rzeszutek Wilk wrote:
> > On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:
>
> > > I looked through my old mails from you and you explained already the necessity of double
> > > bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
> > > Xenified kernel not have this kind of issue?
> >
> > That is a puzzle. It should not. The code is very much the same - both
> > use the generic SWIOTLB which has not changed for years.
>
> The swiotlb-xen used by classic-xen kernels (which I assume is what
> Carsten means by "Xenified") isn't exactly the same as the stuff in
> mainline Linux, it's been heavily refactored for one thing. It's not
> impossible that mainline is bouncing something it doesn't really need
> to.

The usage, at least with 'pci_alloc_coherent' is that there is no bouncing
being done. The alloc_coherent will allocate a nice page, underneath the 4GB
mark and give it to the driver. The driver can use it as it wishes and there
is no need to bounce buffer.

But I can't find the implementation of that in the classic Xen-SWIOTLB. It looks
as if it is using map_single, which would be taking the memory out of the
pool for a very long time, instead of allocating memory and "swizzling" the MFNs.
[Note, I looked at the 2.6.18 hg tree for classic; the 2.6.34 one is probably much
improved, so let me check that.]

Carsten, let me prep up a patch that will print some diagnostic information
during the runtime - to see how often it does the bounce, the usage, etc..

>
> It's also possible that the dma mask of the device is different/wrong in
> mainline leading to such additional bouncing.

If one were to use map_page and such - yes. But the alloc_coherent bypasses
that and ends up allocating it right under the 4GB (or rather it allocates
based on the dev->coherent_mask and swizzles the MFNs as required).

>
> I guess it's also possible that the classic-Xen kernels are playing fast
> and loose by not bouncing something they should (although if so they
> appear to be getting away with it...) or that there is some difference
> which really means mainline needs to bounce while classic-Xen doesn't.

<nods> Could be very well.
>
> Ian.
>



lersek at redhat

Nov 28, 2011, 8:58 AM

Post #10 of 66
Re: Load increase after memory upgrade (part2)

On 11/28/11 16:40, Ian Campbell wrote:
> On Mon, 2011-11-28 at 15:28 +0000, Konrad Rzeszutek Wilk wrote:
>> On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:
>
>>> I looked through my old mails from you and you explained already the necessity of double
>>> bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
>>> Xenified kernel not have this kind of issue?
>>
>> That is a puzzle. It should not. The code is very much the same - both
>> use the generic SWIOTLB which has not changed for years.
>
> The swiotlb-xen used by classic-xen kernels (which I assume is what
> Carsten means by "Xenified") isn't exactly the same as the stuff in
> mainline Linux, it's been heavily refactored for one thing. It's not
> impossible that mainline is bouncing something it doesn't really need
> to.

Please excuse me if I'm completely mistaken; my only point of reference
is that we recently had to backport
<http://xenbits.xensource.com/hg/linux-2.6.18-xen.hg/rev/940>.

> It's also possible that the dma mask of the device is different/wrong in
> mainline leading to such additional bouncing.

dma_alloc_coherent() -- which I guess is the precursor of
pci_alloc_consistent() -- asks xen_create_contiguous_region() to back
the vaddr range with frames machine-addressable inside the device's dma
mask. xen_create_contiguous_region() seems to land in a XENMEM_exchange
hypercall (among others). Perhaps this extra layer of indirection allows
the driver to use low pages directly, without bounce buffers.

> I guess it's also possible that the classic-Xen kernels are playing fast
> and loose by not bouncing something they should (although if so they
> appear to be getting away with it...) or that there is some difference
> which really means mainline needs to bounce while classic-Xen doesn't.

I'm sorry if what I just posted is painfully stupid. I'm taking the risk
for the 1% chance that it could be helpful.

Wrt. the idle time accounting problem, after Niall's two pings, I'm also
waiting for a verdict, and/or for myself finding the time and fishing
out the current patches.

Laszlo



JBeulich at suse

Nov 29, 2011, 12:31 AM

Post #11 of 66
Re: Load increase after memory upgrade (part2)

>>> On 28.11.11 at 17:45, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> But I can't find the implementation of that in the classic Xen-SWIOTLB.

linux-2.6.18-xen.hg/arch/i386/kernel/pci-dma-xen.c:dma_alloc_coherent().

Jan




carsten at schiers

Nov 29, 2011, 1:31 AM

Post #12 of 66
Re: Load increase after memory upgrade (part2)

I attached the actually used 2.6.34 file here, if that helps. BR, C.
 
-----Original Message-----
To: Konrad Rzeszutek Wilk <konrad.wilk [at] oracle>;
CC: Ian Campbell <Ian.Campbell [at] citrix>; Konrad Rzeszutek Wilk <konrad [at] darnok>; xen-devel <xen-devel [at] lists>; zhenzhong.duan [at] oracle; lersek [at] redhat; Carsten Schiers <carsten [at] schiers>;
From: Jan Beulich <JBeulich [at] suse>
Sent: Tue 29.11.2011 09:52
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)
>>> On 28.11.11 at 17:45, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> But I can't find the implementation of that in the classic Xen-SWIOTLB.

linux-2.6.18-xen.hg/arch/i386/kernel/pci-dma-xen.c:dma_alloc_coherent().

Jan
Attachments: pci-dma-xen.c (9.38 KB)


carsten at schiers

Nov 29, 2011, 1:37 AM

Post #13 of 66
Re: Load increase after memory upgrade (part2)

> The swiotlb-xen used by classic-xen kernels (which I assume is what
> Carsten means by "Xenified") isn't exactly the same as the stuff in
> mainline Linux, it's been heavily refactored for one thing. It's not
> impossible that mainline is bouncing something it doesn't really need
> to.

Yes, it's a 2.6.34 kernel with Andrew Lyon's backported patches found here:

 
  http://code.google.com/p/gentoo-xen-kernel/downloads/list

 
GrC.


carsten at schiers

Nov 29, 2011, 1:42 AM

Post #14 of 66
Re: Load increase after memory upgrade (part2)

 

>   - with load I mean the %CPU of xentop
>   - there is no change in CPU usage of the DomU or Dom0

> Uhh, which metric are you using for that? CPU usage...? Is this if you
> change the DomU or the amount of memory the guest has? This is not
> the load number (xentop value)?

I had a quick look into the munin plugin. It reads the Time (in seconds) column from the output of "xm li" and normalizes it.
But the effect is also visible in the CPU(%) column of xentop if the DomU is under higher load.
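For illustration, the normalization amounts to the usual rate calculation, assuming the plugin samples the cumulative Time(s) column of "xm list" at a fixed interval:

    %CPU  ≈  (Time_now - Time_previous) / sample_interval * 100

(multiply by 1000 instead of 100 for the permille figure shown in the munin graph).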

 
BR,C.


carsten at schiers

Nov 29, 2011, 1:46 AM

Post #15 of 66
Re: Load increase after memory upgrade (part2)

> Carsten, let me prep up a patch that will print some diagnostic information
> during the runtime - to see how often it does the bounce, the usage, etc..

 
Jup, looking forward to implementing it. I can include them into any kernel. 2.6.18 would be
a bit difficult though, as the driver pack isn't compatible any longer... so I'd prefer 2.6.34 Xenified
vs. 3.1.2 pvops.

 
BR,C.


Ian.Campbell at citrix

Nov 29, 2011, 2:23 AM

Post #16 of 66
Re: Load increase after memory upgrade (part2)

On Mon, 2011-11-28 at 16:45 +0000, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 28, 2011 at 03:40:13PM +0000, Ian Campbell wrote:
> > On Mon, 2011-11-28 at 15:28 +0000, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:
> >
> > > > I looked through my old mails from you and you explained already the necessity of double
> > > > bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
> > > > Xenified kernel not have this kind of issue?
> > >
> > > That is a puzzle. It should not. The code is very much the same - both
> > > use the generic SWIOTLB which has not changed for years.
> >
> > The swiotlb-xen used by classic-xen kernels (which I assume is what
> > Carsten means by "Xenified") isn't exactly the same as the stuff in
> > mainline Linux, it's been heavily refactored for one thing. It's not
> > impossible that mainline is bouncing something it doesn't really need
> > to.
>
> The usage, at least with 'pci_alloc_coherent' is that there is no bouncing
> being done. The alloc_coherent will allocate a nice page, underneath the 4GB
> mark and give it to the driver. The driver can use it as it wishes and there
> is no need to bounce buffer.

Oh, I didn't realise dma_alloc_coherent was part of swiotlb now. Only a
subset of swiotlb is in use then, all the bouncing stuff _should_ be
idle/unused -- but has that been confirmed?

>
> But I can't find the implementation of that in the classic Xen-SWIOTLB. It looks
> as if it is using map_single which would be taking the memory out of the
> pool for a very long time, instead of allocating memory and "swizzling" the MFNs.
> [.Note, I looked at the 2.6.18 hg tree for classic, the 2.6.34 is probably
> improved much better so let me check that]
>
> Carsten, let me prep up a patch that will print some diagnostic information
> during the runtime - to see how often it does the bounce, the usage, etc..
>
> >
> > It's also possible that the dma mask of the device is different/wrong in
> > mainline leading to such additional bouncing.
>
> If one were to use map_page and such - yes. But the alloc_coherent bypasses
> that and ends up allocating it right under the 4GB (or rather it allocates
> based on the dev->coherent_mask and swizzles the MFNs as required).
>
> >
> > I guess it's also possible that the classic-Xen kernels are playing fast
> > and loose by not bouncing something they should (although if so they
> > appear to be getting away with it...) or that there is some difference
> > which really means mainline needs to bounce while classic-Xen doesn't.
>
> <nods> Could be very well.
> >
> > Ian.
> >





konrad.wilk at oracle

Nov 29, 2011, 7:33 AM

Post #17 of 66
Re: Load increase after memory upgrade (part2)

On Tue, Nov 29, 2011 at 10:23:18AM +0000, Ian Campbell wrote:
> On Mon, 2011-11-28 at 16:45 +0000, Konrad Rzeszutek Wilk wrote:
> > On Mon, Nov 28, 2011 at 03:40:13PM +0000, Ian Campbell wrote:
> > > On Mon, 2011-11-28 at 15:28 +0000, Konrad Rzeszutek Wilk wrote:
> > > > On Fri, Nov 25, 2011 at 11:11:55PM +0100, Carsten Schiers wrote:
> > >
> > > > > I looked through my old mails from you and you explained already the necessity of double
> > > > > bounce buffering (PCI->below 4GB->above 4GB). What I don't understand is: why does the
> > > > > Xenified kernel not have this kind of issue?
> > > >
> > > > That is a puzzle. It should not. The code is very much the same - both
> > > > use the generic SWIOTLB which has not changed for years.
> > >
> > > The swiotlb-xen used by classic-xen kernels (which I assume is what
> > > Carsten means by "Xenified") isn't exactly the same as the stuff in
> > > mainline Linux, it's been heavily refactored for one thing. It's not
> > > impossible that mainline is bouncing something it doesn't really need
> > > to.
> >
> > The usage, at least with 'pci_alloc_coherent' is that there is no bouncing
> > being done. The alloc_coherent will allocate a nice page, underneath the 4GB
> > mark and give it to the driver. The driver can use it as it wishes and there
> > is no need to bounce buffer.
>
> Oh, I didn't realise dma_alloc_coherent was part of swiotlb now. Only a
> subset of swiotlb is in use then, all the bouncing stuff _should_ be
> idle/unused -- but has that been confirmed?

Nope. I hope that the diagnostic patch I have in mind will prove/disprove that.
Now I just need to find a moment to write it :-)



konrad at darnok

Dec 2, 2011, 7:23 AM

Post #18 of 66
Re: Load increase after memory upgrade (part2)

> > > > > That is a puzzle. It should not. The code is very much the same - both
> > > > > use the generic SWIOTLB which has not changed for years.
> > > >
> > > > The swiotlb-xen used by classic-xen kernels (which I assume is what
> > > > Carsten means by "Xenified") isn't exactly the same as the stuff in
> > > > mainline Linux, it's been heavily refactored for one thing. It's not
> > > > impossible that mainline is bouncing something it doesn't really need
> > > > to.
> > >
> > > The usage, at least with 'pci_alloc_coherent' is that there is no bouncing
> > > being done. The alloc_coherent will allocate a nice page, underneath the 4GB
> > > mark and give it to the driver. The driver can use it as it wishes and there
> > > is no need to bounce buffer.
> >
> > Oh, I didn't realise dma_alloc_coherent was part of swiotlb now. Only a
> > subset of swiotlb is in use then, all the bouncing stuff _should_ be
> > idle/unused -- but has that been confirmed?
>
> Nope. I hope that the diagnostic patch I have in mind will prove/disprove that.
> Now I just need to find a moment to write it :-)

Done!

Carsten, can you please patch your kernel with this hacky patch and
when you have booted the new kernel, just do

modprobe dump_swiotlb

it should give an idea of how many bounces are happening, coherent
allocations, syncs, and so on, along with the last driver that
did those operations.
Attachments: swiotlb-debug.patch (10.3 KB)


carsten at schiers

Dec 4, 2011, 3:59 AM

Post #19 of 66
Re: Load increase after memory upgrade (part2)

Thank you, Konrad.

I applied the patch to 3.1.2. In order to have a clear picture, I only enabled one PCI card.
The result is:

[ 28.028032] Starting SWIOTLB debug thread.
[ 28.028076] swiotlb_start_thread: Go!
[ 28.028622] xen_swiotlb_start_thread: Go!
[ 33.028153] 0 [budget_av 0000:00:00.0] bounce: from:555352(slow:0)to:0 map:329 unmap:0 sync:555352
[ 33.028294] SWIOTLB is 2% full
[ 38.028178] 0 budget_av 0000:00:00.0 alloc coherent: 4, free: 0
[ 38.028230] 0 [budget_av 0000:00:00.0] bounce: from:127981(slow:0)to:0 map:0 unmap:0 sync:127981
[ 38.028352] SWIOTLB is 2% full
[ 43.028170] 0 [budget_av 0000:00:00.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 43.028310] SWIOTLB is 2% full
[ 48.028199] 0 [budget_av 0000:00:00.0] bounce: from:127981(slow:0)to:0 map:0 unmap:0 sync:127981
[ 48.028334] SWIOTLB is 2% full
[ 53.028170] 0 [budget_av 0000:00:00.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 53.028309] SWIOTLB is 2% full
[ 58.028138] 0 [budget_av 0000:00:00.0] bounce: from:126994(slow:0)to:0 map:0 unmap:0 sync:126994
[ 58.028195] SWIOTLB is 2% full
[ 63.028170] 0 [budget_av 0000:00:00.0] bounce: from:121401(slow:0)to:0 map:0 unmap:0 sync:121401
[ 63.029560] SWIOTLB is 2% full
[ 68.028193] 0 [budget_av 0000:00:00.0] bounce: from:127981(slow:0)to:0 map:0 unmap:0 sync:127981
[ 68.028329] SWIOTLB is 2% full
[ 73.028104] 0 [budget_av 0000:00:00.0] bounce: from:122717(slow:0)to:0 map:0 unmap:0 sync:122717
[ 73.028244] SWIOTLB is 2% full
[ 78.028191] 0 [budget_av 0000:00:00.0] bounce: from:127981(slow:0)to:0 map:0 unmap:0 sync:127981
[ 78.028331] SWIOTLB is 2% full
[ 83.028112] 0 [budget_av 0000:00:00.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 83.028171] SWIOTLB is 2% full

Was that long enough? I hope this helps.

Carsten.
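For scale, taking the counters above at face value:

    128,310 bounce syncs / 5 s  ≈  25,700 bounced buffers per second (one card)

which would be consistent with the swiotlb copying showing up as extra %CPU in xentop.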



carsten at schiers

Dec 4, 2011, 4:09 AM

Post #20 of 66
Re: Load increase after memory upgrade (part2)

Here with two cards enabled and creating a bit of "work" by watching TV with one of them:

[ 23.842720] Starting SWIOTLB debug thread.
[ 23.842750] swiotlb_start_thread: Go!
[ 23.842838] xen_swiotlb_start_thread: Go!
[ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
[ 28.841592] SWIOTLB is 4% full
[ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
[ 33.840283] SWIOTLB is 4% full
[ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
[ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 38.840361] SWIOTLB is 4% full
[ 43.840182] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 43.840323] SWIOTLB is 4% full
[ 48.840094] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
[ 48.840154] SWIOTLB is 4% full
[ 53.840160] 0 [budget_av 0000:00:01.0] bounce: from:119756(slow:0)to:0 map:0 unmap:0 sync:119756
[ 53.840301] SWIOTLB is 4% full
[ 58.840202] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 58.840339] SWIOTLB is 4% full
[ 63.840626] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
[ 63.840686] SWIOTLB is 4% full
[ 68.840122] 0 [budget_av 0000:00:01.0] bounce: from:127323(slow:0)to:0 map:0 unmap:0 sync:127323
[ 68.840180] SWIOTLB is 4% full
[ 73.840647] 0 [budget_av 0000:00:01.0] bounce: from:211547(slow:0)to:0 map:0 unmap:0 sync:211547
[ 73.840784] SWIOTLB is 4% full
[ 78.840204] 0 [budget_av 0000:00:01.0] bounce: from:255962(slow:0)to:0 map:0 unmap:0 sync:255962
[ 78.840344] SWIOTLB is 4% full
[ 83.840114] 0 [budget_av 0000:00:01.0] bounce: from:255304(slow:0)to:0 map:0 unmap:0 sync:255304
[ 83.840178] SWIOTLB is 4% full
[ 88.840158] 0 [budget_av 0000:00:01.0] bounce: from:256620(slow:0)to:0 map:0 unmap:0 sync:256620
[ 88.840302] SWIOTLB is 4% full
[ 93.840185] 0 [budget_av 0000:00:00.0] bounce: from:250040(slow:0)to:0 map:0 unmap:0 sync:250040
[ 93.840319] SWIOTLB is 4% full
[ 98.840181] 0 [budget_av 0000:00:00.0] bounce: from:255962(slow:0)to:0 map:0 unmap:0 sync:255962
[ 98.841563] SWIOTLB is 4% full
[ 103.841221] 0 [budget_av 0000:00:00.0] bounce: from:255962(slow:0)to:0 map:0 unmap:0 sync:255962
[ 103.841361] SWIOTLB is 4% full
[ 108.840247] 0 [budget_av 0000:00:00.0] bounce: from:255962(slow:0)to:0 map:0 unmap:0 sync:255962
[ 108.840389] SWIOTLB is 4% full
[ 113.840157] 0 [budget_av 0000:00:00.0] bounce: from:261555(slow:0)to:0 map:0 unmap:0 sync:261555
[ 113.840298] SWIOTLB is 4% full
[ 118.840119] 0 [budget_av 0000:00:00.0] bounce: from:295442(slow:0)to:0 map:0 unmap:0 sync:295442
[ 118.840259] SWIOTLB is 4% full
[ 123.841025] 0 [budget_av 0000:00:00.0] bounce: from:295113(slow:0)to:0 map:0 unmap:0 sync:295113
[ 123.841164] SWIOTLB is 4% full
[ 128.840175] 0 [budget_av 0000:00:00.0] bounce: from:294784(slow:0)to:0 map:0 unmap:0 sync:294784
[ 128.840310] SWIOTLB is 4% full
[ 133.840194] 0 [budget_av 0000:00:00.0] bounce: from:293797(slow:0)to:0 map:0 unmap:0 sync:293797
[ 133.840330] SWIOTLB is 4% full
[ 138.840498] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 138.840637] SWIOTLB is 4% full
[ 143.840173] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 143.840313] SWIOTLB is 4% full
[ 148.840215] 0 [budget_av 0000:00:00.0] bounce: from:341831(slow:0)to:0 map:0 unmap:0 sync:341831
[ 148.840355] SWIOTLB is 4% full
[ 153.840205] 0 [budget_av 0000:00:01.0] bounce: from:329658(slow:0)to:0 map:0 unmap:0 sync:329658
[ 153.840341] SWIOTLB is 4% full
[ 158.840137] 0 [budget_av 0000:00:00.0] bounce: from:342160(slow:0)to:0 map:0 unmap:0 sync:342160
[ 158.840277] SWIOTLB is 4% full
[ 163.841288] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 163.841424] SWIOTLB is 4% full
[ 168.840198] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 168.840339] SWIOTLB is 4% full
[ 173.840167] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 173.840304] SWIOTLB is 4% full
[ 178.840184] 0 [budget_av 0000:00:00.0] bounce: from:328013(slow:0)to:0 map:0 unmap:0 sync:328013
[ 178.840324] SWIOTLB is 4% full
[ 183.840129] 0 [budget_av 0000:00:00.0] bounce: from:341831(slow:0)to:0 map:0 unmap:0 sync:341831
[ 183.840269] SWIOTLB is 4% full
[ 188.840123] 0 [budget_av 0000:00:01.0] bounce: from:340515(slow:0)to:0 map:0 unmap:0 sync:340515
[ 188.841647] SWIOTLB is 4% full
[ 193.840192] 0 [budget_av 0000:00:00.0] bounce: from:338541(slow:0)to:0 map:0 unmap:0 sync:338541
[ 193.840329] SWIOTLB is 4% full
[ 198.840148] 0 [budget_av 0000:00:01.0] bounce: from:330316(slow:0)to:0 map:0 unmap:0 sync:330316
[ 198.840230] SWIOTLB is 4% full
[ 203.840860] 0 [budget_av 0000:00:00.0] bounce: from:341831(slow:0)to:0 map:0 unmap:0 sync:341831
[ 203.841000] SWIOTLB is 4% full
[ 208.840562] 0 [budget_av 0000:00:01.0] bounce: from:337883(slow:0)to:0 map:0 unmap:0 sync:337883
[ 208.840698] SWIOTLB is 4% full
[ 213.840171] 0 [budget_av 0000:00:00.0] bounce: from:341502(slow:0)to:0 map:0 unmap:0 sync:341502
[ 213.840311] SWIOTLB is 4% full
[ 218.840214] 0 [budget_av 0000:00:01.0] bounce: from:320117(slow:0)to:0 map:0 unmap:0 sync:320117
[ 218.840354] SWIOTLB is 4% full
[ 223.840238] 0 [budget_av 0000:00:01.0] bounce: from:299390(slow:0)to:0 map:0 unmap:0 sync:299390
[ 223.840373] SWIOTLB is 4% full
[ 228.841415] 0 [budget_av 0000:00:01.0] bounce: from:298732(slow:0)to:0 map:0 unmap:0 sync:298732
[ 228.841560] SWIOTLB is 4% full
[ 233.840705] 0 [budget_av 0000:00:00.0] bounce: from:299061(slow:0)to:0 map:0 unmap:0 sync:299061
[ 233.840844] SWIOTLB is 4% full
[ 238.840145] 0 [budget_av 0000:00:01.0] bounce: from:293468(slow:0)to:0 map:0 unmap:0 sync:293468
[ 238.840280] SWIOTLB is 4% full

-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad [at] darnok]
Sent: Friday, 2 December 2011 16:24
To: Konrad Rzeszutek Wilk
Cc: Ian Campbell; xen-devel; Carsten Schiers; zhenzhong.duan [at] oracle; lersek [at] redhat
Subject: Re: [Xen-devel] Load increase after memory upgrade (part2)

> > > > > That is a puzzle. It should not. The code is very much the same - both
> > > > > use the generic SWIOTLB which has not changed for years.
> > > >
> > > > The swiotlb-xen used by classic-xen kernels (which I assume is what
> > > > Carsten means by "Xenified") isn't exactly the same as the stuff in
> > > > mainline Linux, it's been heavily refactored for one thing. It's not
> > > > impossible that mainline is bouncing something it doesn't really need
> > > > to.
> > >
> > > The usage, at least with 'pci_alloc_coherent' is that there is no bouncing
> > > being done. The alloc_coherent will allocate a nice page, underneath the 4GB
> > > mark and give it to the driver. The driver can use it as it wishes and there
> > > is no need to bounce buffer.
> >
> > Oh, I didn't realise dma_alloc_coherent was part of swiotlb now. Only a
> > subset of swiotlb is in use then, all the bouncing stuff _should_ be
> > idle/unused -- but has that been confirmed?
>
> Nope. I hope that the diagnostic patch I have in mind will prove/disprove that.
> Now I just need to find a moment to write it :-)

Done!

Carsten, can you please patch your kernel with this hacky patch and
when you have booted the new kernel, just do

modprobe dump_swiotlb

it should give an idea of how many bounces, coherent allocations, syncs,
and so on are happening, along with the last driver that did those
operations.



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


carsten at schiers

Dec 4, 2011, 4:18 AM

Post #21 of 66 (716 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

I should perhaps mention that I create the DomU with only the parameter iommu=soft; I hope
nothing more is required. For the Xenified kernel, it's swiotlb=32,force.
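
For reference, roughly where those two parameters end up (the config file
name and layout here are only illustrative):

    # pvops DomU: append iommu=soft to the guest kernel command line,
    # e.g. in the domain config file (illustrative path /etc/xen/domu-dvb.cfg)
    extra = "iommu=soft"

    # Xenified/classic DomU kernel gets the classic swiotlb options instead
    extra = "swiotlb=32,force"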

Carsten.



konrad.wilk at oracle

Dec 5, 2011, 7:26 PM

Post #22 of 66 (721 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Sun, Dec 04, 2011 at 01:09:28PM +0100, Carsten Schiers wrote:
> Here with two cards enabled and creating a bit of "work" by watching TV with one of them:
>
> [ 23.842720] Starting SWIOTLB debug thread.
> [ 23.842750] swiotlb_start_thread: Go!
> [ 23.842838] xen_swiotlb_start_thread: Go!
> [ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
> [ 28.841592] SWIOTLB is 4% full
> [ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
> [ 33.840283] SWIOTLB is 4% full
> [ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
> [ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310

Whoa. Yes. You are definitely using the bounce buffer :-)

Now it is time to look at why the driver is not using those coherent ones - it
looks to allocate just eight of them but does not use them... Unless it is
using them _and_ bouncing them (which would be odd).
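
To illustrate the difference (a minimal, generic sketch - not the actual
budget_av/saa7146 code; the function, device and buffer names are made up,
and error checking is mostly omitted):

    #include <linux/dma-mapping.h>
    #include <linux/gfp.h>

    /* Minimal sketch of the two DMA styles the counters distinguish. */
    static void dma_paths_example(struct device *dev, void *drv_buf, size_t len)
    {
            dma_addr_t coh_handle, map_handle;
            void *coh;

            /* Coherent path: the buffer is handed out below the device's DMA
             * limit up front, so it never needs to be bounced. */
            coh = dma_alloc_coherent(dev, len, &coh_handle, GFP_KERNEL);
            if (!coh)
                    return;

            /* Streaming path: the driver maps a buffer it already owns.  If
             * that buffer lies above what the device can address, swiotlb
             * copies it through its low bounce pool on every map/sync -
             * which is what the bounce/sync counters in the log are counting. */
            map_handle = dma_map_single(dev, drv_buf, len, DMA_FROM_DEVICE);
            dma_sync_single_for_cpu(dev, map_handle, len, DMA_FROM_DEVICE);
            dma_unmap_single(dev, map_handle, len, DMA_FROM_DEVICE);

            dma_free_coherent(dev, len, coh, coh_handle);
    }

If the TS buffers went through the coherent path, the bounce counters should
stay near zero.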

And BTW, you can lower your 'swiotlb=XX' value. The 4% is how much you
are using of the default size.

I should find out _why_ the old Xen kernels do not use the bounce buffer
so much...
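
As a rough sense of scale - assuming the mainline default swiotlb pool of
64 MB (32768 slabs of 2 KB):

    4% of 64 MB   ~= 2.6 MB of bounce space in use at once
    2.6 MB / 2 KB ~= 1300 slabs

so even a pool a quarter of the default size should still leave plenty of
headroom for this workload.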


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


konrad at darnok

Dec 14, 2011, 12:23 PM

Post #23 of 66 (710 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Mon, Dec 05, 2011 at 10:26:21PM -0500, Konrad Rzeszutek Wilk wrote:
> On Sun, Dec 04, 2011 at 01:09:28PM +0100, Carsten Schiers wrote:
> > Here with two cards enabled and creating a bit of "work" by watching TV with one of them:
> >
> > [ 23.842720] Starting SWIOTLB debug thread.
> > [ 23.842750] swiotlb_start_thread: Go!
> > [ 23.842838] xen_swiotlb_start_thread: Go!
> > [ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
> > [ 28.841592] SWIOTLB is 4% full
> > [ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
> > [ 33.840283] SWIOTLB is 4% full
> > [ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
> > [ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
>
> Whoa. Yes. You are definitely using the bounce buffer :-)
>
> Now it is time to look at why the driver is not using those coherent ones - it
> looks to allocate just eight of them but does not use them... Unless it is
> using them _and_ bouncing them (which would be odd).
>
> And BTW, you can lower your 'swiotlb=XX' value. The 4% is how much you
> are using of the default size.

So I am able to see this with an atl1c ethernet driver on my SandyBridge i3
box. It looks as if the card is truly 32-bit, so on a box with 8GB it
bounces the data. If I boot the Xen hypervisor with 'mem=4GB' I get no
bounces (no surprise there).

In other words - I see the same behavior you are seeing. Now off to:
>
> I should find out _why_ the old Xen kernels do not use the bounce buffer
> so much...

which will require some fiddling around.
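
The decision itself boils down to a mask check roughly like this (a
simplification, not the literal swiotlb/xen-swiotlb source, which also works
on bus addresses and checks page contiguity):

    #include <linux/device.h>
    #include <linux/dma-mapping.h>

    /* Simplified sketch of the "do we need to bounce?" test. */
    static bool needs_bounce(struct device *dev, phys_addr_t phys, size_t size)
    {
            u64 mask = dev->dma_mask ? *dev->dma_mask : DMA_BIT_MASK(32);

            /* A 32-bit-only card has mask == DMA_BIT_MASK(32).  With 8GB in
             * the box, buffers above 4GB fail this test and get copied
             * through the bounce pool; with mem=4G everything passes. */
            return (phys + size - 1) > mask;
    }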

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


konrad.wilk at oracle

Dec 14, 2011, 2:07 PM

Post #24 of 66 (714 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

On Wed, Dec 14, 2011 at 04:23:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 05, 2011 at 10:26:21PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Sun, Dec 04, 2011 at 01:09:28PM +0100, Carsten Schiers wrote:
> > > Here with two cards enabled and creating a bit of "work" by watching TV with one of them:
> > >
> > > [ 23.842720] Starting SWIOTLB debug thread.
> > > [ 23.842750] swiotlb_start_thread: Go!
> > > [ 23.842838] xen_swiotlb_start_thread: Go!
> > > [ 28.841451] 0 [budget_av 0000:00:01.0] bounce: from:435596(slow:0)to:0 map:658 unmap:0 sync:435596
> > > [ 28.841592] SWIOTLB is 4% full
> > > [ 33.840147] 0 [budget_av 0000:00:01.0] bounce: from:127652(slow:0)to:0 map:0 unmap:0 sync:127652
> > > [ 33.840283] SWIOTLB is 4% full
> > > [ 33.844222] 0 budget_av 0000:00:01.0 alloc coherent: 8, free: 0
> > > [ 38.840227] 0 [budget_av 0000:00:01.0] bounce: from:128310(slow:0)to:0 map:0 unmap:0 sync:128310
> >
> > Whoa. Yes. You are definitely using the bounce buffer :-)
> >
> > Now it is time to look at why the driver is not using those coherent ones - it
> > looks to allocate just eight of them but does not use them... Unless it is
> > using them _and_ bouncing them (which would be odd).
> >
> > And BTW, you can lower your 'swiotlb=XX' value. The 4% is how much you
> > are using of the default size.
>
> So I am able to see this with an atl1c ethernet driver on my SandyBridge i3
> box. It looks as if the card is truly 32-bit, so on a box with 8GB it
> bounces the data. If I boot the Xen hypervisor with 'mem=4GB' I get no
> bounces (no surprise there).
>
> In other words - I see the same behavior you are seeing. Now off to:
> >
> > I should find out _why_ the old Xen kernels do not use the bounce buffer
> > so much...
>
> which will require some fiddling around.

And I am not seeing any difference - the swiotlb shows the same usage whether
I boot a classic (old-style XenoLinux) 2.6.32 kernel or a brand new pvops (3.2) one.
Obviously, if I limit the physical amount of memory ('mem=4GB' on the Xen hypervisor
line), the bounce usage disappears. Hmm, I wonder if there is a nice way to
tell the hypervisor - hey, please stuff dom0 under 4GB.
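
For completeness, the 'mem=' workaround as it would look on the hypervisor
boot line (an illustrative GRUB legacy entry; paths and the dom0_mem value
are just examples):

    # /boot/grub/menu.lst (illustrative)
    kernel /boot/xen.gz dom0_mem=512M mem=4G
    module /boot/vmlinuz-2.6.32.46 console=hvc0
    module /boot/initrd-2.6.32.46.img

Of course this only hides the problem by keeping all memory below 4GB.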

Here is the patch I used against classic XenLinux. Any chance you could run
it with your classic guests and see what numbers you get?
Attachments: swiotlb-against-old-type.patch (7.61 KB)


carsten at schiers

Dec 15, 2011, 6:52 AM

Post #25 of 66 (709 views)
Permalink
Re: Load increase after memory upgrade (part2) [In reply to]

...

> which will require some fiddling around.

> Here is the patch I used against classic XenLinux. Any chance you could run
> it with your classic guests and see what numbers you get?

Sure, it might take a bit, but I'll try it with my 2.6.34 classic kernel.

 
Carsten.
