
Mailing List Archive: Xen: Devel

lock in vhpet

 

 



yang.z.zhang at intel

Apr 16, 2012, 8:26 PM

Post #1 of 45
lock in vhpet

Hi Keir,

I noticed that changeset 15289 introduced locking for the platform timers, and you mentioned that it is only there for correctness. Are there potential issues that this patch fixes? If not, I wonder why we need those locks. I think it should be the OS's responsibility to serialise these accesses, not the hypervisor's. Am I right?
I don't know whether all of those locks are necessary, but at least the lock for vhpet, and especially the read-side lock, is not required.

best regards
yang




keir at xen

Apr 17, 2012, 12:27 AM

Post #2 of 45
Re: lock in vhpet [In reply to]

On 17/04/2012 04:26, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:

> Hi keir
>
> I noticed that the changeset 15289 introuduced locking to platform timers. And
> you mentioned that it only handy for correctness. Are there some potential
> issues which is fixed by this patch? If not, I wonder why we need those locks?

Yes, issues were fixed by the patch. That's why I bothered to implement it.
However, I think the observed issues were with protecting the mechanisms in
vpt.c, and at least some of the other locking may be overly cautious.

> I think it should be OS's responsibly to guarantee the access sequentially,
> not hypervisor. Am I right?

It depends. Where a memory-mapped access is apparently atomic to the guest,
but is implemented as a sequence of operations in the hypervisor, the
hypervisor might need to maintain that atomicity through locking.
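
To make that concrete, here is a minimal user-space sketch of the problem (plain C with pthreads, not actual Xen code; the structure fields and helper names are invented for illustration). The guest issues what it believes is a single atomic 64-bit read of the HPET main counter, but the emulator derives the value from several fields, so without a lock a concurrent emulated write could be observed half-applied:

#include <pthread.h>
#include <stdint.h>
#include <time.h>

struct vhpet {
    pthread_mutex_t lock;   /* stands in for the vhpet spinlock       */
    uint64_t mc_offset;     /* main counter = mc_offset + now_ticks() */
    uint64_t config;        /* enable bit etc.                        */
};

static uint64_t now_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* Emulated read of the main counter: a sequence of operations that must
 * appear atomic to the guest. */
uint64_t vhpet_read_maincounter(struct vhpet *h)
{
    uint64_t val;
    pthread_mutex_lock(&h->lock);
    val = h->mc_offset + now_ticks();
    pthread_mutex_unlock(&h->lock);
    return val;
}

/* Emulated write that reprograms the counter: it touches more than one
 * field, so an unlocked reader running between the two assignments could
 * return a value belonging to neither the old nor the new counter epoch. */
void vhpet_write_maincounter(struct vhpet *h, uint64_t newval)
{
    pthread_mutex_lock(&h->lock);
    h->mc_offset = newval - now_ticks();
    h->config   |= 1;   /* e.g. bookkeeping tied to the reprogramming */
    pthread_mutex_unlock(&h->lock);
}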

> I don't know whether all those locks are necessary, but at least the lock for
> vhpet, especially the reading lock, is not required.

This is definitely not true; for example, locking is required around calls to
create_periodic_time(), to serialise them. So in general the locking I
added, even in vhpet.c, is required. If you have a specific hot path you are
looking to optimise, and especially if you have numbers to back that up,
then we can consider specific localised optimisations to avoid locking where
we can reason it is not needed.

-- Keir

> best regards
> yang
>





yang.z.zhang at intel

Apr 17, 2012, 5:52 PM

Post #3 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Keir Fraser [mailto:keir.xen [at] gmail] On Behalf Of Keir Fraser
>
> > Hi keir
> >
> > I noticed that the changeset 15289 introuduced locking to platform
> > timers. And you mentioned that it only handy for correctness. Are
> > there some potential issues which is fixed by this patch? If not, I wonder why
> we need those locks?
>
> Yes, issues were fixed by the patch. That's why I bothered to implement it.
> However I think the observed issues were with protecting the mechanisms in
> vpt.c, and the other locking at least partially may be overly cautious.
>
> > I think it should be OS's responsibly to guarantee the access
> > sequentially, not hypervisor. Am I right?
>
> It depends. Where an access is an apparently-atomic memory-mapped access,
> but implemented as a sequence of operations in the hypervisor, the hypervisor
> might need to maintain atomicity through locking.

But if the guest already takes a lock around those accesses, and there is no other code path (such as a timer callback) in the hypervisor that touches the shared data, then we don't need a lock in the hypervisor.

> > I don't know whether all those locks are necessary, but at least the
> > lock for vhpet, especially the reading lock, is not required.
>
> This is definitely not true, for example locking is required around calls to
> create_periodic_time(), to serialise them. So in general the locking I added,
> even in vhpet.c is required. If you have a specific hot path you are looking to
> optimise, and especially if you have numbers to back that up, then we can
> consider specific localised optimisations to avoid locking where we can reason
> it is not needed.
As I mentioned, if the guest ensures that only one CPU accesses the HPET at a time, then the accesses are already serialised.
Yes. Win8 boots very slowly with more than 16 VCPUs, and this is due to the big lock taken when reading the HPET. The xentrace data also shows lots of VM exits, mainly from PAUSE instructions on the other CPUs. So if the guest already uses a lock to protect HPET accesses, why should the hypervisor do the same thing?

yang



keir.xen at gmail

Apr 18, 2012, 12:13 AM

Post #4 of 45
Re: lock in vhpet [In reply to]

On 18/04/2012 01:52, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:

>> It depends. Where an access is an apparently-atomic memory-mapped access,
>> but implemented as a sequence of operations in the hypervisor, the hypervisor
>> might need to maintain atomicity through locking.
>
> But if there already has lock inside guest for those access, and there have no
> other code patch(like timer callback function) in hypervisor to access the
> shared data, then we don't need to use lock in hypersivor.

If there is a memory-mapped register access which is atomic on bare metal,
the guest may not bother with locking. We have to maintain that apparent
atomicity ourselves by implementing serialisation in the hypervisor.

>>> I don't know whether all those locks are necessary, but at least the
>>> lock for vhpet, especially the reading lock, is not required.
>>
>> This is definitely not true, for example locking is required around calls to
>> create_periodic_time(), to serialise them. So in general the locking I added,
>> even in vhpet.c is required. If you have a specific hot path you are looking
>> to
>> optimise, and especially if you have numbers to back that up, then we can
>> consider specific localised optimisations to avoid locking where we can
>> reason
>> it is not needed.
> As I mentioned, if the guest can ensure there only one CPU to access the hpet
> at same time, this means the access itself already is serialized.
> Yes.Win8 is booting very slowly w/ more than 16 VCPUs. And this is due to the
> big lock in reading hpet. Also, the xentrace data shows lots of vmexit which
> mainly from PAUSE instruction from other cpus. So I think if guest already
> uses lock to protect the hpet access, why hypervisor do the same thing too?

If the HPET accesses are atomic on bare metal, we have to maintain that,
even if some guests have extra locking themselves. Also, in some cases Xen
needs locking to correctly maintain its own internal state regardless of
what an (untrusted) guest might do. So we cannot just get rid of the vhpet
lock everywhere. It's definitely not correct to do so. Optimising the hpet
read path, however, sounds okay. I agree the lock may not be needed on that
specific path.

-- Keir





yang.z.zhang at intel

Apr 18, 2012, 12:55 AM

Post #5 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Keir Fraser [mailto:keir.xen [at] gmail]
> Sent: Wednesday, April 18, 2012 3:13 PM
> To: Zhang, Yang Z; xen-devel [at] lists
> Subject: Re: lock in vhpet
>
> On 18/04/2012 01:52, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
>
> >> It depends. Where an access is an apparently-atomic memory-mapped
> >> access, but implemented as a sequence of operations in the
> >> hypervisor, the hypervisor might need to maintain atomicity through locking.
> >
> > But if there already has lock inside guest for those access, and there
> > have no other code patch(like timer callback function) in hypervisor
> > to access the shared data, then we don't need to use lock in hypersivor.
>
> If there is a memory-mapped register access which is atomic on bare metal,
> the guest may not bother with locking. We have to maintain that apparent
> atomicity ourselves by implementing serialisation in the hypervisor.
>
> >>> I don't know whether all those locks are necessary, but at least the
> >>> lock for vhpet, especially the reading lock, is not required.
> >>
> >> This is definitely not true, for example locking is required around
> >> calls to create_periodic_time(), to serialise them. So in general the
> >> locking I added, even in vhpet.c is required. If you have a specific
> >> hot path you are looking to optimise, and especially if you have
> >> numbers to back that up, then we can consider specific localised
> >> optimisations to avoid locking where we can reason it is not needed.
> > As I mentioned, if the guest can ensure there only one CPU to access
> > the hpet at same time, this means the access itself already is serialized.
> > Yes.Win8 is booting very slowly w/ more than 16 VCPUs. And this is due
> > to the big lock in reading hpet. Also, the xentrace data shows lots of
> > vmexit which mainly from PAUSE instruction from other cpus. So I think
> > if guest already uses lock to protect the hpet access, why hypervisor do the
> same thing too?
>
> If the HPET accesses are atomic on bare metal, we have to maintain that, even
> if some guests have extra locking themselves. Also, in some cases Xen needs
> locking to correctly maintain its own internal state regardless of what an
> (untrusted) guest might do. So we cannot just get rid of the vhpet lock
> everywhere. It's definitely not correct to do so. Optimising the hpet read path
> however, sounds okay. I agree the lock may not be needed on that specific
> path.

You are right.
For this case, since the main HPET access is reading the main counter, I think an rwlock is a better choice.
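
Something along these lines is what I have in mind (a rough sketch in plain C with pthreads rather than Xen's locking primitives; the names are only illustrative): main-counter reads take the lock shared, while register writes that change timer state still serialise against readers and against each other:

#include <pthread.h>
#include <stdint.h>

struct vhpet {
    pthread_rwlock_t lock;
    uint64_t mc_offset;
    uint64_t timer_config[3];
};

uint64_t vhpet_read_maincounter(struct vhpet *h, uint64_t now)
{
    uint64_t val;
    pthread_rwlock_rdlock(&h->lock);    /* many readers may proceed at once */
    val = h->mc_offset + now;
    pthread_rwlock_unlock(&h->lock);
    return val;
}

void vhpet_write_timer_config(struct vhpet *h, int timer, uint64_t cfg)
{
    pthread_rwlock_wrlock(&h->lock);    /* writers are still fully serialised */
    h->timer_config[timer] = cfg;
    pthread_rwlock_unlock(&h->lock);
}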

yang




keir at xen

Apr 18, 2012, 1:29 AM

Post #6 of 45
Re: lock in vhpet [In reply to]

On 18/04/2012 08:55, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:

>> If the HPET accesses are atomic on bare metal, we have to maintain that, even
>> if some guests have extra locking themselves. Also, in some cases Xen needs
>> locking to correctly maintain its own internal state regardless of what an
>> (untrusted) guest might do. So we cannot just get rid of the vhpet lock
>> everywhere. It's definitely not correct to do so. Optimising the hpet read
>> path
>> however, sounds okay. I agree the lock may not be needed on that specific
>> path.
>
> You are right.
> For this case, since the main access of hpet is to read the main counter, so I
> think the rwlock is a better choice.

I'll see if I can make a patch...

-- Keir





keir at xen

Apr 18, 2012, 2:14 AM

Post #7 of 45
Re: lock in vhpet [In reply to]

On 18/04/2012 09:29, "Keir Fraser" <keir [at] xen> wrote:

> On 18/04/2012 08:55, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
>
>>> If the HPET accesses are atomic on bare metal, we have to maintain that,
>>> even
>>> if some guests have extra locking themselves. Also, in some cases Xen needs
>>> locking to correctly maintain its own internal state regardless of what an
>>> (untrusted) guest might do. So we cannot just get rid of the vhpet lock
>>> everywhere. It's definitely not correct to do so. Optimising the hpet read
>>> path
>>> however, sounds okay. I agree the lock may not be needed on that specific
>>> path.
>>
>> You are right.
>> For this case, since the main access of hpet is to read the main counter, so
>> I
>> think the rwlock is a better choice.
>
> I'll see if I can make a patch...

Please try the attached patch (build tested only).

-- Keir

> -- Keir
>
>
Attachments: 00-hpet-lockfree (5.03 KB)


keir at xen

Apr 18, 2012, 2:30 AM

Post #8 of 45
Re: lock in vhpet [In reply to]

On 18/04/2012 10:14, "Keir Fraser" <keir [at] xen> wrote:

> On 18/04/2012 09:29, "Keir Fraser" <keir [at] xen> wrote:
>
>> On 18/04/2012 08:55, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
>>
>>>> If the HPET accesses are atomic on bare metal, we have to maintain that,
>>>> even
>>>> if some guests have extra locking themselves. Also, in some cases Xen needs
>>>> locking to correctly maintain its own internal state regardless of what an
>>>> (untrusted) guest might do. So we cannot just get rid of the vhpet lock
>>>> everywhere. It's definitely not correct to do so. Optimising the hpet read
>>>> path
>>>> however, sounds okay. I agree the lock may not be needed on that specific
>>>> path.
>>>
>>> You are right.
>>> For this case, since the main access of hpet is to read the main counter, so
>>> I
>>> think the rwlock is a better choice.
>>
>> I'll see if I can make a patch...
>
> Please try the attached patch (build tested only).

Actually try this updated one. :-)

> -- Keir
>
>> -- Keir
>>
>>
>
Attachments: 00-hpet-lockfree-v2 (5.46 KB)


yang.z.zhang at intel

Apr 18, 2012, 10:19 PM

Post #9 of 45
Re: lock in vhpet [In reply to]

There is no problem with this patch; it works well. But it cannot fix the Win8 issue. It seems there are some other issues with the HPET; I will look into it.
Thanks for your quick patch.

best regards
yang


> -----Original Message-----
> From: Keir Fraser [mailto:keir.xen [at] gmail] On Behalf Of Keir Fraser
> Sent: Wednesday, April 18, 2012 5:31 PM
> To: Zhang, Yang Z; xen-devel [at] lists
> Subject: Re: lock in vhpet
>
> On 18/04/2012 10:14, "Keir Fraser" <keir [at] xen> wrote:
>
> > On 18/04/2012 09:29, "Keir Fraser" <keir [at] xen> wrote:
> >
> >> On 18/04/2012 08:55, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
> >>
> >>>> If the HPET accesses are atomic on bare metal, we have to maintain
> >>>> that, even if some guests have extra locking themselves. Also, in
> >>>> some cases Xen needs locking to correctly maintain its own internal
> >>>> state regardless of what an
> >>>> (untrusted) guest might do. So we cannot just get rid of the vhpet
> >>>> lock everywhere. It's definitely not correct to do so. Optimising
> >>>> the hpet read path however, sounds okay. I agree the lock may not
> >>>> be needed on that specific path.
> >>>
> >>> You are right.
> >>> For this case, since the main access of hpet is to read the main
> >>> counter, so I think the rwlock is a better choice.
> >>
> >> I'll see if I can make a patch...
> >
> > Please try the attached patch (build tested only).
>
> Actually try this updated one. :-)
>
> > -- Keir
> >
> >> -- Keir
> >>
> >>
> >




tim at xen

Apr 19, 2012, 1:27 AM

Post #10 of 45
Re: lock in vhpet [In reply to]

At 05:19 +0000 on 19 Apr (1334812779), Zhang, Yang Z wrote:
> There have no problem with this patch, it works well. But it cannot
> fix the win8 issue. It seems there has some other issues with hpet. I
> will look into it. Thanks for your quick patch.

The lock in hvm_get_guest_time() will still be serializing the hpet
reads. But since it needs to update a shared variable, that will need to
haul cachelines around anyway.
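
Roughly this pattern, in other words (an illustrative sketch in plain C, not the actual Xen code; names are made up): to keep guest time from ever going backwards across vCPUs, every reader compares against, and may update, a shared last-returned value, so every read takes the lock and dirties the same cache line:

#include <pthread.h>
#include <stdint.h>

struct domain_time {
    pthread_mutex_t lock;
    uint64_t last_guest_time;    /* shared high-water mark */
};

uint64_t get_guest_time(struct domain_time *dt, uint64_t raw_now)
{
    uint64_t t = raw_now;

    pthread_mutex_lock(&dt->lock);
    if ( t < dt->last_guest_time )
        t = dt->last_guest_time;    /* never step backwards               */
    else
        dt->last_guest_time = t;    /* the shared write that bounces the
                                     * cache line between readers         */
    pthread_mutex_unlock(&dt->lock);
    return t;
}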

Tim.




keir at xen

Apr 19, 2012, 1:47 AM

Post #11 of 45
Re: lock in vhpet [In reply to]

On 19/04/2012 09:27, "Tim Deegan" <tim [at] xen> wrote:

> At 05:19 +0000 on 19 Apr (1334812779), Zhang, Yang Z wrote:
>> There have no problem with this patch, it works well. But it cannot
>> fix the win8 issue. It seems there has some other issues with hpet. I
>> will look into it. Thanks for your quick patch.
>
> The lock in hvm_get_guest_time() will still be serializing the hpet
> reads. But since it needs to update a shared variable, that will need to
> haul cachelines around anyway.

Yes, that's true. You could try the attached hacky patch out of interest, to
see what that lock is costing you in your scenario. But if we want
consistent monotonically-increasing guest time, I suspect we can't get rid
of the lock, so that's going to limit our scalability unavoidably. Shame.

-- Keir

> Tim.
>
Attachments: 00-lockfree-hvm-time (0.62 KB)


yang.z.zhang at intel

Apr 23, 2012, 12:36 AM

Post #12 of 45
Re: lock in vhpet [In reply to]

The p2m lock in __get_gfn_type_access() is the culprit. Here is the profiling data over a 10-second window:

(XEN) p2m_lock 1 lock:
(XEN) lock: 190733(00000000:14CE5726), block: 67159(00000007:6AAA53F3)

The data was collected while a Win8 guest (16 VCPUs) was idle. Over the 10-second profiling window the 16 VCPUs spent a combined 30 seconds blocked, i.e. roughly 30 s / (16 VCPUs x 10 s), about 18% of the available CPU cycles, was spent waiting for the p2m lock. And that is for an idle guest; the impact is worse when running a workload inside the guest.
I noticed that this change was added by cs 24770, and before it we did not take the p2m lock in __get_gfn_type_access(). So is this lock really necessary?

best regards
yang


> -----Original Message-----
> From: Keir Fraser [mailto:keir.xen [at] gmail] On Behalf Of Keir Fraser
> Sent: Thursday, April 19, 2012 4:47 PM
> To: Tim Deegan; Zhang, Yang Z
> Cc: xen-devel [at] lists
> Subject: Re: [Xen-devel] lock in vhpet
>
> On 19/04/2012 09:27, "Tim Deegan" <tim [at] xen> wrote:
>
> > At 05:19 +0000 on 19 Apr (1334812779), Zhang, Yang Z wrote:
> >> There have no problem with this patch, it works well. But it cannot
> >> fix the win8 issue. It seems there has some other issues with hpet. I
> >> will look into it. Thanks for your quick patch.
> >
> > The lock in hvm_get_guest_time() will still be serializing the hpet
> > reads. But since it needs to update a shared variable, that will need
> > to haul cachelines around anyway.
>
> Yes, that's true. You could try the attached hacky patch out of interest, to see
> what that lock is costing you in your scenario. But if we want consistent
> monotonically-increasing guest time, I suspect we can't get rid of the lock, so
> that's going to limit our scalability unavoidably. Shame.
>
> -- Keir
>
> > Tim.
> >




JBeulich at suse

Apr 23, 2012, 12:43 AM

Post #13 of 45
Re: lock in vhpet [In reply to]

>>> On 23.04.12 at 09:36, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
> The p2m lock in __get_gfn_type_access() is the culprit. Here is the profiling
> data with 10 seconds:
>
> (XEN) p2m_lock 1 lock:
> (XEN) lock: 190733(00000000:14CE5726), block:
> 67159(00000007:6AAA53F3)
>
> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs blocked
> 30 seconds with 10 sec's profiling. It means 18% of cpu cycle is waiting for
> the p2m lock. And those data only for idle guest. The impaction is more
> seriously when run some workload inside guest.
> I noticed that this change was adding by cs 24770. And before it, we don't
> require the p2m lock in _get_gfn_type_access. So is this lock really
> necessary?

Or shouldn't such a lock, frequently taken on a read path, be an rwlock
instead?

Jan




yang.z.zhang at intel

Apr 23, 2012, 1:15 AM

Post #14 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich [at] suse]
> Sent: Monday, April 23, 2012 3:43 PM
> To: Zhang, Yang Z; Keir Fraser; Tim Deegan
> Cc: andres [at] lagarcavilla; xen-devel [at] lists
> Subject: Re: [Xen-devel] lock in vhpet
>
> >>> On 23.04.12 at 09:36, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:
> > The p2m lock in __get_gfn_type_access() is the culprit. Here is the
> > profiling data with 10 seconds:
> >
> > (XEN) p2m_lock 1 lock:
> > (XEN) lock: 190733(00000000:14CE5726), block:
> > 67159(00000007:6AAA53F3)
> >
> > Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
> > blocked
> > 30 seconds with 10 sec's profiling. It means 18% of cpu cycle is
> > waiting for the p2m lock. And those data only for idle guest. The
> > impaction is more seriously when run some workload inside guest.
> > I noticed that this change was adding by cs 24770. And before it, we
> > don't require the p2m lock in _get_gfn_type_access. So is this lock
> > really necessary?
>
> Or shouldn't such a lock frequently taken on a read path be an rwlock instead?
>
Right. Using an rwlock would make more sense.

best regards
yang



keir.xen at gmail

Apr 23, 2012, 1:22 AM

Post #15 of 45
Re: lock in vhpet [In reply to]

On 23/04/2012 09:15, "Zhang, Yang Z" <yang.z.zhang [at] intel> wrote:

>>> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
>>> blocked
>>> 30 seconds with 10 sec's profiling. It means 18% of cpu cycle is
>>> waiting for the p2m lock. And those data only for idle guest. The
>>> impaction is more seriously when run some workload inside guest.
>>> I noticed that this change was adding by cs 24770. And before it, we
>>> don't require the p2m lock in _get_gfn_type_access. So is this lock
>>> really necessary?
>>
>> Or shouldn't such a lock frequently taken on a read path be an rwlock
>> instead?
>>
> Right. Using rwlock would make more sense.

Interested to see if it would improve performance. My guess would be no.

-- Keir





tim at xen

Apr 23, 2012, 2:14 AM

Post #16 of 45
Re: lock in vhpet [In reply to]

At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
> The p2m lock in __get_gfn_type_access() is the culprit. Here is the profiling data with 10 seconds:
>
> (XEN) p2m_lock 1 lock:
> (XEN) lock: 190733(00000000:14CE5726), block: 67159(00000007:6AAA53F3)
>
> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
> blocked 30 seconds with 10 sec's profiling. It means 18% of cpu cycle
> is waiting for the p2m lock. And those data only for idle guest. The
> impaction is more seriously when run some workload inside guest. I
> noticed that this change was adding by cs 24770. And before it, we
> don't require the p2m lock in _get_gfn_type_access. So is this lock
> really necessary?

Ugh; that certainly is a regression. We used to be lock-free on p2m
lookups and losing that will be bad for perf in lots of ways. IIRC the
original aim was to use fine-grained per-page locks for this -- there
should be no need to hold a per-domain lock during a normal read.
Andres, what happened to that code?

Making it an rwlock would be tricky as this interface doesn't
differentiate readers from writers. For the common case (no
sharing/paging/mem-access) it ought to be a win since there is hardly
any writing. But it would be better to make this particular lock/unlock
go away.

Tim.



andres at lagarcavilla

Apr 23, 2012, 8:26 AM

Post #17 of 45
Re: lock in vhpet [In reply to]

> At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> profiling data with 10 seconds:
>>
>> (XEN) p2m_lock 1 lock:
>> (XEN) lock: 190733(00000000:14CE5726), block:
>> 67159(00000007:6AAA53F3)
>>
>> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
>> blocked 30 seconds with 10 sec's profiling. It means 18% of cpu cycle
>> is waiting for the p2m lock. And those data only for idle guest. The
>> impaction is more seriously when run some workload inside guest. I
>> noticed that this change was adding by cs 24770. And before it, we
>> don't require the p2m lock in _get_gfn_type_access. So is this lock
>> really necessary?
>
> Ugh; that certainly is a regression. We used to be lock-free on p2m
> lookups and losing that will be bad for perf in lots of ways. IIRC the
> original aim was to use fine-grained per-page locks for this -- there
> should be no need to hold a per-domain lock during a normal read.
> Andres, what happened to that code?

The fine-grained p2m locking code is stashed somewhere and untested.
Obviously not meant for 4.2. I don't think it'll be useful here: all vcpus
are hitting the same gfn for the hpet mmio address.

Thanks for the report, Yang; it would be great to understand at which call
sites the p2m lock is contended, so we can try to alleviate contention at
the right place.

Looking closely at the code, it would seem the only get_gfn in
hvmemul_do_io is called on ram_gpa == 0 (?!). Shouldn't ram_gpa underlie
the target mmio address for emulation?

Notwithstanding the above, we've optimized p2m access on hvmemul_do_io to
have as brief a critical section as possible: get_gfn, get_page, put_gfn,
work, put_page later.
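
In toy form, the shape is something like this (plain C, not the Xen code; the table and refcount stand in for the real p2m and page-reference machinery): the per-domain lock is held only across the lookup and the reference grab, and the actual work runs with just the page reference held:

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

struct page { atomic_int refcount; };

struct p2m {
    pthread_mutex_t lock;         /* stands in for the per-domain p2m lock */
    struct page *table[1024];     /* toy gfn -> page mapping               */
};

int emulate_io_to_gfn(struct p2m *p2m, uint64_t gfn)
{
    struct page *pg;

    /* Brief critical section: translate the gfn and pin the page. */
    pthread_mutex_lock(&p2m->lock);              /* get_gfn  */
    pg = p2m->table[gfn % 1024];
    if ( pg )
        atomic_fetch_add(&pg->refcount, 1);      /* get_page */
    pthread_mutex_unlock(&p2m->lock);            /* put_gfn  */

    if ( !pg )
        return -1;                               /* retry / not present */

    /* ... the long-running emulation work happens here, without the
     * p2m lock; the reference keeps the page from being freed ... */

    atomic_fetch_sub(&pg->refcount, 1);          /* put_page, much later */
    return 0;
}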

So contention might arise from (bogusly) doing get_gfn(0) by every vcpu
(when it should instead be get_gfn(hpet_gfn)). The only way to alleviate
that contention is to use get_gfn_query_unlocked, and that will be unsafe
against paging or sharing. I am very skeptical this is causing the
slowdown you see, since you're not using paging or sharing. The critical
section protected by the p2m lock is literally one cmpxchg.

The other source of contention might come from hvmemul_rep_movs, which
holds the p2m lock for the duration of the mmio operation. I can optimize
that one using the get_gfn/get_page/put_gfn pattern mentioned above.

The third source of contention might be the __hvm_copy to/from the target
address of the hpet value read/written. That one can be slightly optimized
by doing the memcpy outside of the scope of the p2m lock.

>
> Making it an rwlock would be tricky as this interface doesn't
> differenctiate readers from writers. For the common case (no
> sharing/paging/mem-access) it ought to be a win since there is hardly
> any writing. But it would be better to make this particular lock/unlock
> go away.
>

We had a discussion about rwlocks way back when. It's impossible (or
nearly so) to tell when an access might upgrade to write privileges.
Deadlock has a high likelihood.
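
For illustration (plain C with pthreads, not a real code path): if two vCPUs both take the lock for reading and both then discover they need to modify the entry, there is no safe in-place upgrade; holding the read lock while waiting for the write lock deadlocks, and dropping it first loses atomicity:

#include <pthread.h>

static pthread_rwlock_t p2m_rwlock = PTHREAD_RWLOCK_INITIALIZER;

void lookup_that_may_need_to_write(int need_modify)
{
    pthread_rwlock_rdlock(&p2m_rwlock);
    /* ... inspect the p2m entry ... */
    if ( need_modify )
    {
        /* Upgrading in place (taking the write lock while still holding
         * the read lock) deadlocks as soon as two callers do it at the
         * same time, since each waits for the other's read lock to go
         * away.  The only alternative is drop-and-retake, after which
         * the entry must be re-validated because it may have changed. */
        pthread_rwlock_unlock(&p2m_rwlock);
        pthread_rwlock_wrlock(&p2m_rwlock);
        /* ... re-check the entry, then modify it ... */
    }
    pthread_rwlock_unlock(&p2m_rwlock);
}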

Andres

> Tim.
>





andres at lagarcavilla

Apr 23, 2012, 10:18 AM

Post #18 of 45
Re: lock in vhpet [In reply to]

> At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> profiling data with 10 seconds:
>>
>> (XEN) p2m_lock 1 lock:
>> (XEN) lock: 190733(00000000:14CE5726), block:
>> 67159(00000007:6AAA53F3)
>>
>> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
>> blocked 30 seconds with 10 sec's profiling. It means 18% of cpu cycle
>> is waiting for the p2m lock. And those data only for idle guest. The
>> impaction is more seriously when run some workload inside guest. I
>> noticed that this change was adding by cs 24770. And before it, we
>> don't require the p2m lock in _get_gfn_type_access. So is this lock
>> really necessary?
>
> Ugh; that certainly is a regression. We used to be lock-free on p2m
> lookups and losing that will be bad for perf in lots of ways. IIRC the
> original aim was to use fine-grained per-page locks for this -- there
> should be no need to hold a per-domain lock during a normal read.
> Andres, what happened to that code?

Sigh, scratch a lot of nonsense I just spewed on the "hpet gfn". No actual
page backing that one, no concerns.

It is still the case that ram_gpa is zero in many cases going into
hvmemul_do_io. There really is no point in doing a get_page(get_gfn(0)).
How about the following (there's more email after this patch):
# HG changeset patch
# Parent ccc64793187f7d9c926343a1cd4ae822a4364bd1
x86/emulation: No need to get_gfn on zero ram_gpa.

Signed-off-by: Andres Lagar-Cavilla <andres [at] lagarcavilla>

diff -r ccc64793187f xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -60,33 +60,37 @@ static int hvmemul_do_io(
ioreq_t *p = get_ioreq(curr);
unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
p2m_type_t p2mt;
- mfn_t ram_mfn;
+ mfn_t ram_mfn = _mfn(INVALID_MFN);
int rc;

- /* Check for paged out page */
- ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
- if ( p2m_is_paging(p2mt) )
- {
- put_gfn(curr->domain, ram_gfn);
- p2m_mem_paging_populate(curr->domain, ram_gfn);
- return X86EMUL_RETRY;
- }
- if ( p2m_is_shared(p2mt) )
- {
- put_gfn(curr->domain, ram_gfn);
- return X86EMUL_RETRY;
- }
-
- /* Maintain a ref on the mfn to ensure liveness. Put the gfn
- * to avoid potential deadlock wrt event channel lock, later. */
- if ( mfn_valid(mfn_x(ram_mfn)) )
- if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
- curr->domain) )
+ /* Many callers pass a stub zero ram_gpa addresses. */
+ if ( ram_gfn != 0 )
+ {
+ /* Check for paged out page */
+ ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
+ if ( p2m_is_paging(p2mt) )
{
- put_gfn(curr->domain, ram_gfn);
+ put_gfn(curr->domain, ram_gfn);
+ p2m_mem_paging_populate(curr->domain, ram_gfn);
return X86EMUL_RETRY;
}
- put_gfn(curr->domain, ram_gfn);
+ if ( p2m_is_shared(p2mt) )
+ {
+ put_gfn(curr->domain, ram_gfn);
+ return X86EMUL_RETRY;
+ }
+
+ /* Maintain a ref on the mfn to ensure liveness. Put the gfn
+ * to avoid potential deadlock wrt event channel lock, later. */
+ if ( mfn_valid(mfn_x(ram_mfn)) )
+ if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
+ curr->domain) )
+ {
+ put_gfn(curr->domain, ram_gfn);
+ return X86EMUL_RETRY;
+ }
+ put_gfn(curr->domain, ram_gfn);
+ }

/*
* Weird-sized accesses have undefined behaviour: we discard writes


If contention is coming in from the emul_rep_movs path, then that might be
taken care of with the following:
# HG changeset patch
# Parent 18b694840961cb6e3934628b23902a7979f00bac
x86/emulate: Relieve contention of p2m lock in emulation of rep movs.

get_two_gfns is used to query the source and target physical addresses of the
emulated rep movs. It is not necessary to hold onto the two gfn's for the
duration of the emulation: further calls down the line will do the appropriate
unsharing or paging in, and unwind correctly on failure.

Signed-off-by: Andres Lagar-Cavilla <andres [at] lagarcavilla>

diff -r 18b694840961 xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -714,25 +714,23 @@ static int hvmemul_rep_movs(
if ( rc != X86EMUL_OKAY )
return rc;

+ /* The query on the gfns is to establish the need for mmio. In the two mmio
+ * cases, a proper get_gfn for the "other" gfn will be enacted, with paging in
+ * or unsharing if necessary. In the memmove case, the gfn might change given
+ * the bytes mov'ed, and, again, proper get_gfn's will be enacted in
+ * __hvm_copy. */
get_two_gfns(current->domain, sgpa >> PAGE_SHIFT, &sp2mt, NULL, NULL,
current->domain, dgpa >> PAGE_SHIFT, &dp2mt, NULL, NULL,
P2M_ALLOC, &tg);
-
+ put_two_gfns(&tg);
+
if ( !p2m_is_ram(sp2mt) && !p2m_is_grant(sp2mt) )
- {
- rc = hvmemul_do_mmio(
+ return hvmemul_do_mmio(
sgpa, reps, bytes_per_rep, dgpa, IOREQ_READ, df, NULL);
- put_two_gfns(&tg);
- return rc;
- }

if ( !p2m_is_ram(dp2mt) && !p2m_is_grant(dp2mt) )
- {
- rc = hvmemul_do_mmio(
+ return hvmemul_do_mmio(
dgpa, reps, bytes_per_rep, sgpa, IOREQ_WRITE, df, NULL);
- put_two_gfns(&tg);
- return rc;
- }

/* RAM-to-RAM copy: emulate as equivalent of memmove(dgpa, sgpa, bytes). */
bytes = *reps * bytes_per_rep;
@@ -747,10 +745,7 @@ static int hvmemul_rep_movs(
* can be emulated by a source-to-buffer-to-destination block copy.
*/
if ( ((dgpa + bytes_per_rep) > sgpa) && (dgpa < (sgpa + bytes)) )
- {
- put_two_gfns(&tg);
return X86EMUL_UNHANDLEABLE;
- }

/* Adjust destination address for reverse copy. */
if ( df )
@@ -759,10 +754,7 @@ static int hvmemul_rep_movs(
/* Allocate temporary buffer. Fall back to slow emulation if this fails. */
buf = xmalloc_bytes(bytes);
if ( buf == NULL )
- {
- put_two_gfns(&tg);
return X86EMUL_UNHANDLEABLE;
- }

/*
* We do a modicum of checking here, just for paranoia's sake and to
@@ -773,7 +765,6 @@ static int hvmemul_rep_movs(
rc = hvm_copy_to_guest_phys(dgpa, buf, bytes);

xfree(buf);
- put_two_gfns(&tg);

if ( rc == HVMCOPY_gfn_paged_out )
return X86EMUL_RETRY;


Let me know if any of this helps
Thanks,
Andres

>
> Making it an rwlock would be tricky as this interface doesn't
> differenctiate readers from writers. For the common case (no
> sharing/paging/mem-access) it ought to be a win since there is hardly
> any writing. But it would be better to make this particular lock/unlock
> go away.
>
> Tim.
>





yang.z.zhang at intel

Apr 24, 2012, 1:58 AM

Post #19 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Tuesday, April 24, 2012 1:19 AM
>
> Let me know if any of this helps
No, it does not work.

best regards
yang



tim at xen

Apr 24, 2012, 2:15 AM

Post #20 of 45
Re: lock in vhpet [In reply to]

At 08:26 -0700 on 23 Apr (1335169568), Andres Lagar-Cavilla wrote:
> > At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
> >> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
> >> profiling data with 10 seconds:
> >>
> >> (XEN) p2m_lock 1 lock:
> >> (XEN) lock: 190733(00000000:14CE5726), block:
> >> 67159(00000007:6AAA53F3)
> >>
> >> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
> >> blocked 30 seconds with 10 sec's profiling. It means 18% of cpu cycle
> >> is waiting for the p2m lock. And those data only for idle guest. The
> >> impaction is more seriously when run some workload inside guest. I
> >> noticed that this change was adding by cs 24770. And before it, we
> >> don't require the p2m lock in _get_gfn_type_access. So is this lock
> >> really necessary?
> >
> > Ugh; that certainly is a regression. We used to be lock-free on p2m
> > lookups and losing that will be bad for perf in lots of ways. IIRC the
> > original aim was to use fine-grained per-page locks for this -- there
> > should be no need to hold a per-domain lock during a normal read.
> > Andres, what happened to that code?
>
> The fine-grained p2m locking code is stashed somewhere and untested.
> Obviously not meant for 4.2. I don't think it'll be useful here: all vcpus
> are hitting the same gfn for the hpet mmio address.

We'll have to do _something_ for 4.2 if it's introducing an 18% CPU
overhead in an otherwise idle VM.

In any case I think this means I probably shouldn't take the patch that
turns on this locking for shadow VMs. They do a lot more p2m lookups.

> The other source of contention might come from hvmemul_rep_movs, which
> holds the p2m lock for the duration of the mmio operation. I can optimize
> that one using the get_gfn/get_page/put_gfn pattern mentioned above.

But wouldn't that be unsafe? What if the p2m changes during the
operation? Or, conversely, could you replace all uses of the lock in
p2m lookups with get_page() on the result and still get what you need?

Tim.



tim at xen

Apr 24, 2012, 2:16 AM

Post #21 of 45
Re: lock in vhpet [In reply to]

At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
> > -----Original Message-----
> > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> > Sent: Tuesday, April 24, 2012 1:19 AM
> >
> > Let me know if any of this helps
> No, it not works.

Do you mean that it doesn't help with the CPU overhead, or that it's
broken in some other way?

Tim.



andres at lagarcavilla

Apr 24, 2012, 6:28 AM

Post #22 of 45
Re: lock in vhpet [In reply to]

> At 08:26 -0700 on 23 Apr (1335169568), Andres Lagar-Cavilla wrote:
>> > At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> >> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> >> profiling data with 10 seconds:
>> >>
>> >> (XEN) p2m_lock 1 lock:
>> >> (XEN) lock: 190733(00000000:14CE5726), block:
>> >> 67159(00000007:6AAA53F3)
>> >>
>> >> Those data is collected when win8 guest(16 vcpus) is idle. 16 VCPUs
>> >> blocked 30 seconds with 10 sec's profiling. It means 18% of cpu cycle
>> >> is waiting for the p2m lock. And those data only for idle guest. The
>> >> impaction is more seriously when run some workload inside guest. I
>> >> noticed that this change was adding by cs 24770. And before it, we
>> >> don't require the p2m lock in _get_gfn_type_access. So is this lock
>> >> really necessary?
>> >
>> > Ugh; that certainly is a regression. We used to be lock-free on p2m
>> > lookups and losing that will be bad for perf in lots of ways. IIRC
>> the
>> > original aim was to use fine-grained per-page locks for this -- there
>> > should be no need to hold a per-domain lock during a normal read.
>> > Andres, what happened to that code?
>>
>> The fine-grained p2m locking code is stashed somewhere and untested.
>> Obviously not meant for 4.2. I don't think it'll be useful here: all
>> vcpus
>> are hitting the same gfn for the hpet mmio address.
>
> We'll have to do _something_ for 4.2 if it's introducing an 18% CPU
> overhead in an otherwise idle VM.

An hpet mmio read or write hits get_gfn twice: once for gfn zero at the
beginning of hvmemul_do_io, and once during the hvm copy. The patch I sent to
Yang yesterday removes the bogus first get_gfn. The second one has to
perform a locked query.

Yang, is there any possibility to get more information here? The ability
to identify the call site that contends for the p2m lock would be crucial.
Even something as crude as sampling vcpu stack traces by hitting 'd'
repeatedly on the serial line, and seeing what sticks out frequently.

>
> In any case I think this means I probably shouldn't take the patch that
> turns on this locking for shadow VMs. They do a lot more p2m lookups.
>
>> The other source of contention might come from hvmemul_rep_movs, which
>> holds the p2m lock for the duration of the mmio operation. I can
>> optimize
>> that one using the get_gfn/get_page/put_gfn pattern mentioned above.
>
> But wouldn't that be unsafe? What if the p2m changes during the
> operation? Or, conversely, could you replace all uses of the lock in
> p2m lookups with get_page() on the result and still get what you need?

I've been thinking of wrapping the pattern into a handy p2m accessor. I
see this pattern repeating itself for hvm_copy, tss/segment entry, page
table walking, nested hvm, etc, in which the consumer wants to map the
result of the translation. By the time you finish with a get_gfn_unshare,
most possible p2m transformations will have been taken care of (PoD-alloc,
page in, unshare). By taking a reference to the underlying page, paging
out is prevented, and then the vcpu can safely let go of the p2m lock.

Conceivably guest_physmap_remove_page and guest_physmap_add_entry can
still come in and change the p2m entry.
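
Hypothetically, the accessor could look something like this (a toy C sketch again, packaging the same get_gfn/get_page/put_gfn steps from the earlier mail; none of these names are real Xen interfaces):

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

struct page { atomic_int refcount; };
struct p2m  { pthread_mutex_t lock; struct page *table[1024]; };

/* Translate a gfn and return a referenced page, dropping the p2m lock
 * before returning, so callers never hold it across long work. */
struct page *get_gfn_and_page(struct p2m *p2m, uint64_t gfn)
{
    struct page *pg;

    pthread_mutex_lock(&p2m->lock);              /* get_gfn  */
    pg = p2m->table[gfn % 1024];
    if ( pg )
        atomic_fetch_add(&pg->refcount, 1);      /* get_page */
    pthread_mutex_unlock(&p2m->lock);            /* put_gfn  */

    return pg;
}

void put_gfn_page(struct page *pg)
{
    if ( pg )
        atomic_fetch_sub(&pg->refcount, 1);      /* put_page, once done */
}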

Andres

>
> Tim.
>





yang.z.zhang at intel

Apr 24, 2012, 5:27 PM

Post #23 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Tim Deegan [mailto:tim [at] xen]
> Sent: Tuesday, April 24, 2012 5:17 PM
> To: Zhang, Yang Z
> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir Fraser
> Subject: Re: [Xen-devel] lock in vhpet
>
> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
> > > -----Original Message-----
> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> > > Sent: Tuesday, April 24, 2012 1:19 AM
> > >
> > > Let me know if any of this helps
> > No, it not works.
>
> Do you mean that it doesn't help with the CPU overhead, or that it's broken in
> some other way?
>
It does not help with the CPU overhead.

best regards
yang



andres at lagarcavilla

Apr 24, 2012, 6:40 PM

Post #24 of 45
Re: lock in vhpet [In reply to]

>> -----Original Message-----
>> From: Tim Deegan [mailto:tim [at] xen]
>> Sent: Tuesday, April 24, 2012 5:17 PM
>> To: Zhang, Yang Z
>> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir Fraser
>> Subject: Re: [Xen-devel] lock in vhpet
>>
>> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
>> > > -----Original Message-----
>> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> > > Sent: Tuesday, April 24, 2012 1:19 AM
>> > >
>> > > Let me know if any of this helps
>> > No, it not works.
>>
>> Do you mean that it doesn't help with the CPU overhead, or that it's
>> broken in
>> some other way?
>>
> It cannot help with the CPU overhead

Yang, is there any further information you can provide? A rough idea of
where vcpus are spending time spinning for the p2m lock would be
tremendously useful.

Thanks
Andres

>
> best regards
> yang
>





yang.z.zhang at intel

Apr 24, 2012, 6:48 PM

Post #25 of 45
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Wednesday, April 25, 2012 9:40 AM
> To: Zhang, Yang Z
> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> Subject: RE: [Xen-devel] lock in vhpet
>
> >> -----Original Message-----
> >> From: Tim Deegan [mailto:tim [at] xen]
> >> Sent: Tuesday, April 24, 2012 5:17 PM
> >> To: Zhang, Yang Z
> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir
> >> Fraser
> >> Subject: Re: [Xen-devel] lock in vhpet
> >>
> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
> >> > > -----Original Message-----
> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
> >> > >
> >> > > Let me know if any of this helps
> >> > No, it not works.
> >>
> >> Do you mean that it doesn't help with the CPU overhead, or that it's
> >> broken in some other way?
> >>
> > It cannot help with the CPU overhead
>
> Yang, is there any further information you can provide? A rough idea of where
> vcpus are spending time spinning for the p2m lock would be tremendously
> useful.
>
I am investigating further and hope to get more useful information.
But actually, the first cs to introduce this issue is 24770. When Win8 boots with the HPET enabled, it uses the HPET as its time source, which generates lots of HPET accesses and EPT violations. In the EPT violation handler, get_gfn_type_access() is called to get the mfn. cs 24770 introduced the gfn_lock for p2m lookups, and that is when the issue appears. After I removed the gfn_lock, the issue went away. But on latest xen, even with this lock removed, it still shows high CPU utilization.

yang

