Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Xen: Devel

lock in vhpet

 

 

First page Previous page 1 2 Next page Last page  View All Xen devel RSS feed   Index | Next | Previous | View Threaded


andres at lagarcavilla

Apr 24, 2012, 7:31 PM

Post #26 of 45 (208 views)
Permalink
Re: lock in vhpet [In reply to]

>
>> -----Original Message-----
>> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> Sent: Wednesday, April 25, 2012 9:40 AM
>> To: Zhang, Yang Z
>> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> Subject: RE: [Xen-devel] lock in vhpet
>>
>> >> -----Original Message-----
>> >> From: Tim Deegan [mailto:tim [at] xen]
>> >> Sent: Tuesday, April 24, 2012 5:17 PM
>> >> To: Zhang, Yang Z
>> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir
>> >> Fraser
>> >> Subject: Re: [Xen-devel] lock in vhpet
>> >>
>> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
>> >> > > -----Original Message-----
>> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
>> >> > >
>> >> > > Let me know if any of this helps
>> >> > No, it not works.
>> >>
>> >> Do you mean that it doesn't help with the CPU overhead, or that it's
>> >> broken in some other way?
>> >>
>> > It cannot help with the CPU overhead
>>
>> Yang, is there any further information you can provide? A rough idea of
>> where
>> vcpus are spending time spinning for the p2m lock would be tremendously
>> useful.
>>
> I am doing the further investigation. Hope can get more useful
> information.

Thanks, looking forward to that.

> But actually, the first cs introduced this issue is 24770. When win8
> booting and if hpet is enabled, it will use hpet as the time source and
> there have lots of hpet access and EPT violation. In EPT violation
> handler, it call get_gfn_type_access to get the mfn. The cs 24770
> introduces the gfn_lock for p2m lookups, and then the issue happens. After
> I removed the gfn_lock, the issue goes. But in latest xen, even I remove
> this lock, it still shows high cpu utilization.
>

It would seem then that even the briefest lock-protected critical section
would cause this? In the mmio case, the p2m lock taken in the hap fault
handler is held during the actual lookup, and for a couple of branch
instructions afterwards.

In latest Xen, with lock removed for get_gfn, on which lock is time spent?

Thanks,
Andres

> yang
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

Apr 24, 2012, 7:36 PM

Post #27 of 45 (210 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Wednesday, April 25, 2012 10:31 AM
> To: Zhang, Yang Z
> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> Subject: RE: [Xen-devel] lock in vhpet
>
> >
> >> -----Original Message-----
> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> Sent: Wednesday, April 25, 2012 9:40 AM
> >> To: Zhang, Yang Z
> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> >> Subject: RE: [Xen-devel] lock in vhpet
> >>
> >> >> -----Original Message-----
> >> >> From: Tim Deegan [mailto:tim [at] xen]
> >> >> Sent: Tuesday, April 24, 2012 5:17 PM
> >> >> To: Zhang, Yang Z
> >> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir
> >> >> Fraser
> >> >> Subject: Re: [Xen-devel] lock in vhpet
> >> >>
> >> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
> >> >> > > -----Original Message-----
> >> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
> >> >> > >
> >> >> > > Let me know if any of this helps
> >> >> > No, it not works.
> >> >>
> >> >> Do you mean that it doesn't help with the CPU overhead, or that
> >> >> it's broken in some other way?
> >> >>
> >> > It cannot help with the CPU overhead
> >>
> >> Yang, is there any further information you can provide? A rough idea
> >> of where vcpus are spending time spinning for the p2m lock would be
> >> tremendously useful.
> >>
> > I am doing the further investigation. Hope can get more useful
> > information.
>
> Thanks, looking forward to that.
>
> > But actually, the first cs introduced this issue is 24770. When win8
> > booting and if hpet is enabled, it will use hpet as the time source
> > and there have lots of hpet access and EPT violation. In EPT violation
> > handler, it call get_gfn_type_access to get the mfn. The cs 24770
> > introduces the gfn_lock for p2m lookups, and then the issue happens.
> > After I removed the gfn_lock, the issue goes. But in latest xen, even
> > I remove this lock, it still shows high cpu utilization.
> >
>
> It would seem then that even the briefest lock-protected critical section would
> cause this? In the mmio case, the p2m lock taken in the hap fault handler is
> held during the actual lookup, and for a couple of branch instructions
> afterwards.
>
> In latest Xen, with lock removed for get_gfn, on which lock is time spent?
Still the p2m_lock.

yang


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


andres at lagarcavilla

Apr 24, 2012, 7:42 PM

Post #28 of 45 (209 views)
Permalink
Re: lock in vhpet [In reply to]

>> -----Original Message-----
>> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> Sent: Wednesday, April 25, 2012 10:31 AM
>> To: Zhang, Yang Z
>> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> Subject: RE: [Xen-devel] lock in vhpet
>>
>> >
>> >> -----Original Message-----
>> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> Sent: Wednesday, April 25, 2012 9:40 AM
>> >> To: Zhang, Yang Z
>> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> >> Subject: RE: [Xen-devel] lock in vhpet
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Tim Deegan [mailto:tim [at] xen]
>> >> >> Sent: Tuesday, April 24, 2012 5:17 PM
>> >> >> To: Zhang, Yang Z
>> >> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists; Keir
>> >> >> Fraser
>> >> >> Subject: Re: [Xen-devel] lock in vhpet
>> >> >>
>> >> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
>> >> >> > > -----Original Message-----
>> >> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
>> >> >> > >
>> >> >> > > Let me know if any of this helps
>> >> >> > No, it not works.
>> >> >>
>> >> >> Do you mean that it doesn't help with the CPU overhead, or that
>> >> >> it's broken in some other way?
>> >> >>
>> >> > It cannot help with the CPU overhead
>> >>
>> >> Yang, is there any further information you can provide? A rough idea
>> >> of where vcpus are spending time spinning for the p2m lock would be
>> >> tremendously useful.
>> >>
>> > I am doing the further investigation. Hope can get more useful
>> > information.
>>
>> Thanks, looking forward to that.
>>
>> > But actually, the first cs introduced this issue is 24770. When win8
>> > booting and if hpet is enabled, it will use hpet as the time source
>> > and there have lots of hpet access and EPT violation. In EPT violation
>> > handler, it call get_gfn_type_access to get the mfn. The cs 24770
>> > introduces the gfn_lock for p2m lookups, and then the issue happens.
>> > After I removed the gfn_lock, the issue goes. But in latest xen, even
>> > I remove this lock, it still shows high cpu utilization.
>> >
>>
>> It would seem then that even the briefest lock-protected critical
>> section would
>> cause this? In the mmio case, the p2m lock taken in the hap fault
>> handler is
>> held during the actual lookup, and for a couple of branch instructions
>> afterwards.
>>
>> In latest Xen, with lock removed for get_gfn, on which lock is time
>> spent?
> Still the p2m_lock.

How are you removing the lock from get_gfn?

The p2m lock is taken on a few specific code paths outside of get_gfn
(change type of an entry, add a new p2m entry, setup and teardown), and
I'm surprised any of those code paths is being used by the hpet mmio
handler.

Andres

>
> yang
>
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

Apr 24, 2012, 8:12 PM

Post #29 of 45 (207 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Wednesday, April 25, 2012 10:42 AM
> To: Zhang, Yang Z
> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> Subject: RE: [Xen-devel] lock in vhpet
>
> >> -----Original Message-----
> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> Sent: Wednesday, April 25, 2012 10:31 AM
> >> To: Zhang, Yang Z
> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> >> Subject: RE: [Xen-devel] lock in vhpet
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> >> Sent: Wednesday, April 25, 2012 9:40 AM
> >> >> To: Zhang, Yang Z
> >> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
> >> >> Subject: RE: [Xen-devel] lock in vhpet
> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Tim Deegan [mailto:tim [at] xen]
> >> >> >> Sent: Tuesday, April 24, 2012 5:17 PM
> >> >> >> To: Zhang, Yang Z
> >> >> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists;
> >> >> >> Keir Fraser
> >> >> >> Subject: Re: [Xen-devel] lock in vhpet
> >> >> >>
> >> >> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
> >> >> >> > > -----Original Message-----
> >> >> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> >> >> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
> >> >> >> > >
> >> >> >> > > Let me know if any of this helps
> >> >> >> > No, it not works.
> >> >> >>
> >> >> >> Do you mean that it doesn't help with the CPU overhead, or that
> >> >> >> it's broken in some other way?
> >> >> >>
> >> >> > It cannot help with the CPU overhead
> >> >>
> >> >> Yang, is there any further information you can provide? A rough
> >> >> idea of where vcpus are spending time spinning for the p2m lock
> >> >> would be tremendously useful.
> >> >>
> >> > I am doing the further investigation. Hope can get more useful
> >> > information.
> >>
> >> Thanks, looking forward to that.
> >>
> >> > But actually, the first cs introduced this issue is 24770. When
> >> > win8 booting and if hpet is enabled, it will use hpet as the time
> >> > source and there have lots of hpet access and EPT violation. In EPT
> >> > violation handler, it call get_gfn_type_access to get the mfn. The
> >> > cs 24770 introduces the gfn_lock for p2m lookups, and then the issue
> happens.
> >> > After I removed the gfn_lock, the issue goes. But in latest xen,
> >> > even I remove this lock, it still shows high cpu utilization.
> >> >
> >>
> >> It would seem then that even the briefest lock-protected critical
> >> section would cause this? In the mmio case, the p2m lock taken in the
> >> hap fault handler is held during the actual lookup, and for a couple
> >> of branch instructions afterwards.
> >>
> >> In latest Xen, with lock removed for get_gfn, on which lock is time
> >> spent?
> > Still the p2m_lock.
>
> How are you removing the lock from get_gfn?
>
> The p2m lock is taken on a few specific code paths outside of get_gfn (change
> type of an entry, add a new p2m entry, setup and teardown), and I'm surprised
> any of those code paths is being used by the hpet mmio handler.

Sorry, what I said maybe not accurate. In latest xen, I use a workaround way to skip calling get_gfn_type_access in hvm_hap_nested_page_fault(). So the p2m_lock is still existing.
Now, I found the contention of p2m_lock is coming from __hvm_copy. In mmio handler, it has some code paths to call it(hvm_fetch_from_guest_virt_nofault(), hvm_copy_from_guest_virt()). When lots of mmio access happened, the contention is very obviously.

yang

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


andres at lagarcavilla

Apr 24, 2012, 8:34 PM

Post #30 of 45 (210 views)
Permalink
Re: lock in vhpet [In reply to]

>> -----Original Message-----
>> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> Sent: Wednesday, April 25, 2012 10:42 AM
>> To: Zhang, Yang Z
>> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> Subject: RE: [Xen-devel] lock in vhpet
>>
>> >> -----Original Message-----
>> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> Sent: Wednesday, April 25, 2012 10:31 AM
>> >> To: Zhang, Yang Z
>> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> >> Subject: RE: [Xen-devel] lock in vhpet
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> >> Sent: Wednesday, April 25, 2012 9:40 AM
>> >> >> To: Zhang, Yang Z
>> >> >> Cc: Tim Deegan; xen-devel [at] lists; Keir Fraser
>> >> >> Subject: RE: [Xen-devel] lock in vhpet
>> >> >>
>> >> >> >> -----Original Message-----
>> >> >> >> From: Tim Deegan [mailto:tim [at] xen]
>> >> >> >> Sent: Tuesday, April 24, 2012 5:17 PM
>> >> >> >> To: Zhang, Yang Z
>> >> >> >> Cc: andres [at] lagarcavilla; xen-devel [at] lists;
>> >> >> >> Keir Fraser
>> >> >> >> Subject: Re: [Xen-devel] lock in vhpet
>> >> >> >>
>> >> >> >> At 08:58 +0000 on 24 Apr (1335257909), Zhang, Yang Z wrote:
>> >> >> >> > > -----Original Message-----
>> >> >> >> > > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
>> >> >> >> > > Sent: Tuesday, April 24, 2012 1:19 AM
>> >> >> >> > >
>> >> >> >> > > Let me know if any of this helps
>> >> >> >> > No, it not works.
>> >> >> >>
>> >> >> >> Do you mean that it doesn't help with the CPU overhead, or that
>> >> >> >> it's broken in some other way?
>> >> >> >>
>> >> >> > It cannot help with the CPU overhead
>> >> >>
>> >> >> Yang, is there any further information you can provide? A rough
>> >> >> idea of where vcpus are spending time spinning for the p2m lock
>> >> >> would be tremendously useful.
>> >> >>
>> >> > I am doing the further investigation. Hope can get more useful
>> >> > information.
>> >>
>> >> Thanks, looking forward to that.
>> >>
>> >> > But actually, the first cs introduced this issue is 24770. When
>> >> > win8 booting and if hpet is enabled, it will use hpet as the time
>> >> > source and there have lots of hpet access and EPT violation. In EPT
>> >> > violation handler, it call get_gfn_type_access to get the mfn. The
>> >> > cs 24770 introduces the gfn_lock for p2m lookups, and then the
>> issue
>> happens.
>> >> > After I removed the gfn_lock, the issue goes. But in latest xen,
>> >> > even I remove this lock, it still shows high cpu utilization.
>> >> >
>> >>
>> >> It would seem then that even the briefest lock-protected critical
>> >> section would cause this? In the mmio case, the p2m lock taken in the
>> >> hap fault handler is held during the actual lookup, and for a couple
>> >> of branch instructions afterwards.
>> >>
>> >> In latest Xen, with lock removed for get_gfn, on which lock is time
>> >> spent?
>> > Still the p2m_lock.
>>
>> How are you removing the lock from get_gfn?
>>
>> The p2m lock is taken on a few specific code paths outside of get_gfn
>> (change
>> type of an entry, add a new p2m entry, setup and teardown), and I'm
>> surprised
>> any of those code paths is being used by the hpet mmio handler.
>
> Sorry, what I said maybe not accurate. In latest xen, I use a workaround
> way to skip calling get_gfn_type_access in hvm_hap_nested_page_fault(). So
> the p2m_lock is still existing.
> Now, I found the contention of p2m_lock is coming from __hvm_copy. In mmio
> handler, it has some code paths to call
> it(hvm_fetch_from_guest_virt_nofault(), hvm_copy_from_guest_virt()). When
> lots of mmio access happened, the contention is very obviously.

Thanks. Can you please try this:
http://lists.xen.org/archives/html/xen-devel/2012-04/msg01861.html

in conjunction with the patch below?
Andres

diff -r 7a7443e80b99 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2383,6 +2383,8 @@ static enum hvm_copy_result __hvm_copy(

while ( todo > 0 )
{
+ struct page_info *pg;
+
count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);

if ( flags & HVMCOPY_virt )
@@ -2427,7 +2429,11 @@ static enum hvm_copy_result __hvm_copy(
put_gfn(curr->domain, gfn);
return HVMCOPY_bad_gfn_to_mfn;
}
+
ASSERT(mfn_valid(mfn));
+ pg = mfn_to_page(mfn);
+ ASSERT(get_page(pg, curr->domain));
+ put_gfn(curr->domain, gfn);

p = (char *)map_domain_page(mfn) + (addr & ~PAGE_MASK);

@@ -2457,7 +2463,7 @@ static enum hvm_copy_result __hvm_copy(
addr += count;
buf += count;
todo -= count;
- put_gfn(curr->domain, gfn);
+ put_page(pg);
}

return HVMCOPY_okay;

>
> yang
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

Apr 24, 2012, 10:18 PM

Post #31 of 45 (211 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Wednesday, April 25, 2012 11:34 AM
>
> Thanks. Can you please try this:
> http://lists.xen.org/archives/html/xen-devel/2012-04/msg01861.html
>
> in conjunction with the patch below?
> Andres
>
> diff -r 7a7443e80b99 xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2383,6 +2383,8 @@ static enum hvm_copy_result __hvm_copy(
>
> while ( todo > 0 )
> {
> + struct page_info *pg;
> +
> count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
>
> if ( flags & HVMCOPY_virt )
> @@ -2427,7 +2429,11 @@ static enum hvm_copy_result __hvm_copy(
> put_gfn(curr->domain, gfn);
> return HVMCOPY_bad_gfn_to_mfn;
> }
> +
> ASSERT(mfn_valid(mfn));
> + pg = mfn_to_page(mfn);
> + ASSERT(get_page(pg, curr->domain));
> + put_gfn(curr->domain, gfn);
>
> p = (char *)map_domain_page(mfn) + (addr & ~PAGE_MASK);
>
> @@ -2457,7 +2463,7 @@ static enum hvm_copy_result __hvm_copy(
> addr += count;
> buf += count;
> todo -= count;
> - put_gfn(curr->domain, gfn);
> + put_page(pg);
> }
>
> return HVMCOPY_okay;
No, it doesn't work. On the contrary, the competition is more fiercer than before:
Here is the p2m_lock competition with 10 seconds with 16 vcpus:
lock: 560583(00000000:83362735), block: 321453(00000009:364CA49B)

yang


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


JBeulich at suse

Apr 25, 2012, 1:07 AM

Post #32 of 45 (209 views)
Permalink
Re: lock in vhpet [In reply to]

>>> On 25.04.12 at 05:34, "Andres Lagar-Cavilla" <andres [at] lagarcavilla> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2383,6 +2383,8 @@ static enum hvm_copy_result __hvm_copy(
>
> while ( todo > 0 )
> {
> + struct page_info *pg;
> +
> count = min_t(int, PAGE_SIZE - (addr & ~PAGE_MASK), todo);
>
> if ( flags & HVMCOPY_virt )
> @@ -2427,7 +2429,11 @@ static enum hvm_copy_result __hvm_copy(
> put_gfn(curr->domain, gfn);
> return HVMCOPY_bad_gfn_to_mfn;
> }
> +
> ASSERT(mfn_valid(mfn));
> + pg = mfn_to_page(mfn);
> + ASSERT(get_page(pg, curr->domain));

You really shouldn't ever put expressions with (side) effects inside
an ASSERT(), not even for debugging patches that you want others
to apply (you're of course free to shoot yourself in the foot ;-) ), as
it just won't work in non-debug builds.

Jan

> + put_gfn(curr->domain, gfn);
>
> p = (char *)map_domain_page(mfn) + (addr & ~PAGE_MASK);
>
> @@ -2457,7 +2463,7 @@ static enum hvm_copy_result __hvm_copy(
> addr += count;
> buf += count;
> todo -= count;
> - put_gfn(curr->domain, gfn);
> + put_page(pg);
> }
>
> return HVMCOPY_okay;



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


tim at xen

Apr 26, 2012, 2:25 PM

Post #33 of 45 (210 views)
Permalink
Re: lock in vhpet [In reply to]

At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > > But actually, the first cs introduced this issue is 24770. When win8
> > > booting and if hpet is enabled, it will use hpet as the time source
> > > and there have lots of hpet access and EPT violation. In EPT violation
> > > handler, it call get_gfn_type_access to get the mfn. The cs 24770
> > > introduces the gfn_lock for p2m lookups, and then the issue happens.
> > > After I removed the gfn_lock, the issue goes. But in latest xen, even
> > > I remove this lock, it still shows high cpu utilization.
> >
> > It would seem then that even the briefest lock-protected critical section would
> > cause this? In the mmio case, the p2m lock taken in the hap fault handler is
> > held during the actual lookup, and for a couple of branch instructions
> > afterwards.
> >
> > In latest Xen, with lock removed for get_gfn, on which lock is time spent?
> Still the p2m_lock.

Can you please try the attached patch? I think you'll need this one
plus the ones that take the locks out of the hpet code.

This patch makes the p2m lock into an rwlock and adjusts a number of the
paths that don't update the p2m so they only take the read lock. It's a
bit rough but I can boot 16-way win7 guest with it.

N.B. Since rwlocks don't show up the the existing lock profiling, please
don't try to use the lock-profiling numbers to see if it's helping!

Andres, this is basically the big-hammer version of your "take a
pagecount" changes, plus the change you made to hvmemul_rep_movs().
If this works I intend to follow it up with a patch to make some of the
read-modify-write paths avoid taking the lock (by using a
compare-exchange operation so they only take the lock on a write). If
that succeeds I might drop put_gfn() altogether.

But first it will need a lot of tidying up. Noticeably missing:
- SVM code equivalents to the vmx.c changes
- grant-table operations still use the lock, because frankly I
could not follow the current code, and it's quite late in the evening.
I also have a long list of uglinesses in the mm code that I found while
writing this lot.

Keir, I have no objection to later replacing this with something better
than an rwlock. :) Or with making a NUMA-friendly rwlock
implementation, since I really expect this to be heavily read-mostly
when paging/sharing/pod are not enabled.

Cheers,

Tim.
Attachments: get-page-from-gfn (74.2 KB)


yang.z.zhang at intel

Apr 26, 2012, 5:46 PM

Post #34 of 45 (209 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Tim Deegan [mailto:tim [at] xen]
> Sent: Friday, April 27, 2012 5:26 AM
> To: Zhang, Yang Z
> Cc: andres [at] lagarcavilla; Keir Fraser; xen-devel [at] lists
> Subject: Re: [Xen-devel] lock in vhpet
>
> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > > > But actually, the first cs introduced this issue is 24770. When
> > > > win8 booting and if hpet is enabled, it will use hpet as the time
> > > > source and there have lots of hpet access and EPT violation. In
> > > > EPT violation handler, it call get_gfn_type_access to get the mfn.
> > > > The cs 24770 introduces the gfn_lock for p2m lookups, and then the issue
> happens.
> > > > After I removed the gfn_lock, the issue goes. But in latest xen,
> > > > even I remove this lock, it still shows high cpu utilization.
> > >
> > > It would seem then that even the briefest lock-protected critical
> > > section would cause this? In the mmio case, the p2m lock taken in
> > > the hap fault handler is held during the actual lookup, and for a
> > > couple of branch instructions afterwards.
> > >
> > > In latest Xen, with lock removed for get_gfn, on which lock is time spent?
> > Still the p2m_lock.
>
> Can you please try the attached patch? I think you'll need this one plus the
> ones that take the locks out of the hpet code.
>
> This patch makes the p2m lock into an rwlock and adjusts a number of the
> paths that don't update the p2m so they only take the read lock. It's a bit
> rough but I can boot 16-way win7 guest with it.

This really a great work! Now, the win7 guest is booting very fast and never saw the BSOD again.
But the changes are so large in your patch. I think we need to do more sanity testing to avoid any regressions. After you finish all the work, I'd like to do a whole testing.:)

> N.B. Since rwlocks don't show up the the existing lock profiling, please don't try
> to use the lock-profiling numbers to see if it's helping!
>
> Andres, this is basically the big-hammer version of your "take a pagecount"
> changes, plus the change you made to hvmemul_rep_movs().
> If this works I intend to follow it up with a patch to make some of the
> read-modify-write paths avoid taking the lock (by using a compare-exchange
> operation so they only take the lock on a write). If that succeeds I might drop
> put_gfn() altogether.
>
> But first it will need a lot of tidying up. Noticeably missing:
> - SVM code equivalents to the vmx.c changes
> - grant-table operations still use the lock, because frankly I
> could not follow the current code, and it's quite late in the evening.
> I also have a long list of uglinesses in the mm code that I found while writing
> this lot.
>
> Keir, I have no objection to later replacing this with something better than an
> rwlock. :) Or with making a NUMA-friendly rwlock implementation, since I
> really expect this to be heavily read-mostly when paging/sharing/pod are not
> enabled.
>
> Cheers,
>
> Tim.

best regards
yang

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


andres at lagarcavilla

Apr 26, 2012, 5:51 PM

Post #35 of 45 (208 views)
Permalink
Re: lock in vhpet [In reply to]

>> -----Original Message-----
>> From: Tim Deegan [mailto:tim [at] xen]
>> Sent: Friday, April 27, 2012 5:26 AM
>> To: Zhang, Yang Z
>> Cc: andres [at] lagarcavilla; Keir Fraser; xen-devel [at] lists
>> Subject: Re: [Xen-devel] lock in vhpet
>>
>> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
>> > > > But actually, the first cs introduced this issue is 24770. When
>> > > > win8 booting and if hpet is enabled, it will use hpet as the time
>> > > > source and there have lots of hpet access and EPT violation. In
>> > > > EPT violation handler, it call get_gfn_type_access to get the mfn.
>> > > > The cs 24770 introduces the gfn_lock for p2m lookups, and then the
>> issue
>> happens.
>> > > > After I removed the gfn_lock, the issue goes. But in latest xen,
>> > > > even I remove this lock, it still shows high cpu utilization.
>> > >
>> > > It would seem then that even the briefest lock-protected critical
>> > > section would cause this? In the mmio case, the p2m lock taken in
>> > > the hap fault handler is held during the actual lookup, and for a
>> > > couple of branch instructions afterwards.
>> > >
>> > > In latest Xen, with lock removed for get_gfn, on which lock is time
>> spent?
>> > Still the p2m_lock.
>>
>> Can you please try the attached patch? I think you'll need this one
>> plus the
>> ones that take the locks out of the hpet code.
>>
>> This patch makes the p2m lock into an rwlock and adjusts a number of the
>> paths that don't update the p2m so they only take the read lock. It's a
>> bit
>> rough but I can boot 16-way win7 guest with it.

That is great news.

Tim, thanks for the amazing work. I'm poring over the patch presently, and
will shoot at it with everything I've got testing-wise.

I'm taking the liberty of pulling in Olaf (paging) and George (PoD) as the
changeset affects those subsystems.

Andres

>
> This really a great work! Now, the win7 guest is booting very fast and
> never saw the BSOD again.
> But the changes are so large in your patch. I think we need to do more
> sanity testing to avoid any regressions. After you finish all the work,
> I'd like to do a whole testing.:)
>
>> N.B. Since rwlocks don't show up the the existing lock profiling, please
>> don't try
>> to use the lock-profiling numbers to see if it's helping!
>>
>> Andres, this is basically the big-hammer version of your "take a
>> pagecount"
>> changes, plus the change you made to hvmemul_rep_movs().
>> If this works I intend to follow it up with a patch to make some of the
>> read-modify-write paths avoid taking the lock (by using a
>> compare-exchange
>> operation so they only take the lock on a write). If that succeeds I
>> might drop
>> put_gfn() altogether.
>>
>> But first it will need a lot of tidying up. Noticeably missing:
>> - SVM code equivalents to the vmx.c changes
>> - grant-table operations still use the lock, because frankly I
>> could not follow the current code, and it's quite late in the
>> evening.
>> I also have a long list of uglinesses in the mm code that I found while
>> writing
>> this lot.
>>
>> Keir, I have no objection to later replacing this with something better
>> than an
>> rwlock. :) Or with making a NUMA-friendly rwlock implementation, since
>> I
>> really expect this to be heavily read-mostly when paging/sharing/pod are
>> not
>> enabled.
>>
>> Cheers,
>>
>> Tim.
>
> best regards
> yang
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

Apr 26, 2012, 6:24 PM

Post #36 of 45 (213 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> Sent: Friday, April 27, 2012 8:52 AM
> To: Zhang, Yang Z
> Cc: Tim Deegan; Keir Fraser; xen-devel [at] lists; olaf [at] aepfle;
> George.Dunlap [at] eu
> Subject: RE: [Xen-devel] lock in vhpet
>
> >> -----Original Message-----
> >> From: Tim Deegan [mailto:tim [at] xen]
> >> Sent: Friday, April 27, 2012 5:26 AM
> >> To: Zhang, Yang Z
> >> Cc: andres [at] lagarcavilla; Keir Fraser;
> >> xen-devel [at] lists
> >> Subject: Re: [Xen-devel] lock in vhpet
> >>
> >> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> >> > > > But actually, the first cs introduced this issue is 24770. When
> >> > > > win8 booting and if hpet is enabled, it will use hpet as the
> >> > > > time source and there have lots of hpet access and EPT
> >> > > > violation. In EPT violation handler, it call get_gfn_type_access to get
> the mfn.
> >> > > > The cs 24770 introduces the gfn_lock for p2m lookups, and then
> >> > > > the
> >> issue
> >> happens.
> >> > > > After I removed the gfn_lock, the issue goes. But in latest
> >> > > > xen, even I remove this lock, it still shows high cpu utilization.
> >> > >
> >> > > It would seem then that even the briefest lock-protected critical
> >> > > section would cause this? In the mmio case, the p2m lock taken in
> >> > > the hap fault handler is held during the actual lookup, and for a
> >> > > couple of branch instructions afterwards.
> >> > >
> >> > > In latest Xen, with lock removed for get_gfn, on which lock is
> >> > > time
> >> spent?
> >> > Still the p2m_lock.
> >>
> >> Can you please try the attached patch? I think you'll need this one
> >> plus the ones that take the locks out of the hpet code.
> >>
> >> This patch makes the p2m lock into an rwlock and adjusts a number of
> >> the paths that don't update the p2m so they only take the read lock.
> >> It's a bit rough but I can boot 16-way win7 guest with it.
>
> That is great news.
>
> Tim, thanks for the amazing work. I'm poring over the patch presently, and will
> shoot at it with everything I've got testing-wise.
>
> I'm taking the liberty of pulling in Olaf (paging) and George (PoD) as the
> changeset affects those subsystems.

But win8 guest shows BSOD with 32 VCPUs. :(
The reason of BSOD is : SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (ACPI.sys)


best regards
yang


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


andres at lagarcavilla

Apr 26, 2012, 8:02 PM

Post #37 of 45 (205 views)
Permalink
Re: lock in vhpet [In reply to]

> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
>> > > But actually, the first cs introduced this issue is 24770. When win8
>> > > booting and if hpet is enabled, it will use hpet as the time source
>> > > and there have lots of hpet access and EPT violation. In EPT
>> violation
>> > > handler, it call get_gfn_type_access to get the mfn. The cs 24770
>> > > introduces the gfn_lock for p2m lookups, and then the issue happens.
>> > > After I removed the gfn_lock, the issue goes. But in latest xen,
>> even
>> > > I remove this lock, it still shows high cpu utilization.
>> >
>> > It would seem then that even the briefest lock-protected critical
>> section would
>> > cause this? In the mmio case, the p2m lock taken in the hap fault
>> handler is
>> > held during the actual lookup, and for a couple of branch instructions
>> > afterwards.
>> >
>> > In latest Xen, with lock removed for get_gfn, on which lock is time
>> spent?
>> Still the p2m_lock.
>
> Can you please try the attached patch? I think you'll need this one
> plus the ones that take the locks out of the hpet code.

Right off the bat I'm getting a multitude of
(XEN) mm.c:3294:d0 Error while clearing mfn 100cbb7
And a hung dom0 during initramfs. I'm a little baffled as to why, but it's
there (32 bit dom0, XenServer6).

>
> This patch makes the p2m lock into an rwlock and adjusts a number of the
> paths that don't update the p2m so they only take the read lock. It's a
> bit rough but I can boot 16-way win7 guest with it.
>
> N.B. Since rwlocks don't show up the the existing lock profiling, please
> don't try to use the lock-profiling numbers to see if it's helping!
>
> Andres, this is basically the big-hammer version of your "take a
> pagecount" changes, plus the change you made to hvmemul_rep_movs().
> If this works I intend to follow it up with a patch to make some of the
> read-modify-write paths avoid taking the lock (by using a
> compare-exchange operation so they only take the lock on a write). If
> that succeeds I might drop put_gfn() altogether.

You mean cmpxchg the whole p2m entry? I don't think I parse the plan.
There are code paths that do get_gfn_query -> p2m_change_type -> put_gfn.
But I guess those could lock the p2m up-front if they become the only
consumers of put_gfn left.

>
> But first it will need a lot of tidying up. Noticeably missing:
> - SVM code equivalents to the vmx.c changes

load_pdptrs doesn't exist on svm. There are a couple of debugging
get_gfn_query that can be made unlocked. And svm_vmcb_restore needs
similar treatment to what you did in vmx.c. The question is who will be
able to test it...

> - grant-table operations still use the lock, because frankly I
> could not follow the current code, and it's quite late in the evening.

It's pretty complex with serious nesting, and ifdef's for arm and 32 bit.
gfn_to_mfn_private callers will suffer from altering the current meaning,
as put_gfn resolves to the right thing for the ifdef'ed arch. The other
user is grant_transfer which also relies on the page *not* having an extra
ref in steal_page. So it's a prime candidate to be left alone.

> I also have a long list of uglinesses in the mm code that I found

Uh, ugly stuff, how could that have happened?

I have a few preliminary observations on the patch. Pasting relevant bits
here, since the body of the patch seems to have been lost by the email
thread:

@@ -977,23 +976,25 @@ int arch_set_info_guest(
...
+
+ if (!paging_mode_refcounts(d)
+ && !get_page_and_type(cr3_page, d, PGT_l3_page_table) )
replace with && !get_page_type() )

@@ -2404,32 +2373,33 @@ static enum hvm_copy_result __hvm_copy(
gfn = addr >> PAGE_SHIFT;
}

- mfn = mfn_x(get_gfn_unshare(curr->domain, gfn, &p2mt));
+ page = get_page_from_gfn(curr->domain, gfn, &p2mt, P2M_UNSHARE);
replace with (flags & HVMCOPY_to_guest) ? P2M_UNSHARE : P2M_ALLOC (and
same logic when checking p2m_is_shared). Not truly related to your patch
bit since we're at it.

Same, further down
- if ( !p2m_is_ram(p2mt) )
+ if ( !page )
{
- put_gfn(curr->domain, gfn);
+ if ( page )
+ put_page(page);
Last two lines are redundant

@@ -4019,35 +3993,16 @@ long do_hvm_op(unsigned long op, XEN_GUE
case HVMOP_modified_memory: a lot of error checking has been removed.
At the very least:
if ( page )
{ ...
} else {
rc = -EINVAL;
goto param_fail3;
}

arch/x86/mm.c:do_mmu_update -> you blew up all the paging/sharing checking
for target gfns of mmu updates of l2/3/4 entries. It seems that this
wouldn't work anyways, that's why you killed it?

+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -54,34 +54,37 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
...
+ if ( !top_page )
{
pfec[0] &= ~PFEC_page_present;
- __put_gfn(p2m, top_gfn);
+ put_page(top_page);
top_page is NULL here, remove put_page

get_page_from_gfn_p2m, slow path: no need for p2m_lock/unlock since
locking is already done by get_gfn_type_access/__put_gfn

(hope those observations made sense without inlining them in the actual
patch)

Andres




_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

Apr 27, 2012, 1:36 AM

Post #38 of 45 (211 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Zhang, Yang Z
> Sent: Friday, April 27, 2012 9:25 AM
> To: andres [at] lagarcavilla
> Cc: Tim Deegan; Keir Fraser; xen-devel [at] lists; olaf [at] aepfle;
> George.Dunlap [at] eu
> Subject: RE: [Xen-devel] lock in vhpet
>
>
> > -----Original Message-----
> > From: Andres Lagar-Cavilla [mailto:andres [at] lagarcavilla]
> > Sent: Friday, April 27, 2012 8:52 AM
> > To: Zhang, Yang Z
> > Cc: Tim Deegan; Keir Fraser; xen-devel [at] lists;
> > olaf [at] aepfle; George.Dunlap [at] eu
> > Subject: RE: [Xen-devel] lock in vhpet
> >
> > >> -----Original Message-----
> > >> From: Tim Deegan [mailto:tim [at] xen]
> > >> Sent: Friday, April 27, 2012 5:26 AM
> > >> To: Zhang, Yang Z
> > >> Cc: andres [at] lagarcavilla; Keir Fraser;
> > >> xen-devel [at] lists
> > >> Subject: Re: [Xen-devel] lock in vhpet
> > >>
> > >> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > >> > > > But actually, the first cs introduced this issue is 24770.
> > >> > > > When
> > >> > > > win8 booting and if hpet is enabled, it will use hpet as the
> > >> > > > time source and there have lots of hpet access and EPT
> > >> > > > violation. In EPT violation handler, it call
> > >> > > > get_gfn_type_access to get
> > the mfn.
> > >> > > > The cs 24770 introduces the gfn_lock for p2m lookups, and
> > >> > > > then the
> > >> issue
> > >> happens.
> > >> > > > After I removed the gfn_lock, the issue goes. But in latest
> > >> > > > xen, even I remove this lock, it still shows high cpu utilization.
> > >> > >
> > >> > > It would seem then that even the briefest lock-protected
> > >> > > critical section would cause this? In the mmio case, the p2m
> > >> > > lock taken in the hap fault handler is held during the actual
> > >> > > lookup, and for a couple of branch instructions afterwards.
> > >> > >
> > >> > > In latest Xen, with lock removed for get_gfn, on which lock is
> > >> > > time
> > >> spent?
> > >> > Still the p2m_lock.
> > >>
> > >> Can you please try the attached patch? I think you'll need this
> > >> one plus the ones that take the locks out of the hpet code.
> > >>
> > >> This patch makes the p2m lock into an rwlock and adjusts a number
> > >> of the paths that don't update the p2m so they only take the read lock.
> > >> It's a bit rough but I can boot 16-way win7 guest with it.
> >
> > That is great news.
> >
> > Tim, thanks for the amazing work. I'm poring over the patch presently,
> > and will shoot at it with everything I've got testing-wise.
> >
> > I'm taking the liberty of pulling in Olaf (paging) and George (PoD) as
> > the changeset affects those subsystems.
>
> But win8 guest shows BSOD with 32 VCPUs. :( The reason of BSOD is :
> SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (ACPI.sys)
>
Um....., I find this issue is related to xl not hypervisor.
Will send a patch to fix it later.

yang



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


tim at xen

Apr 27, 2012, 2:26 AM

Post #39 of 45 (207 views)
Permalink
Re: lock in vhpet [In reply to]

At 20:02 -0700 on 26 Apr (1335470547), Andres Lagar-Cavilla wrote:
> > Can you please try the attached patch? I think you'll need this one
> > plus the ones that take the locks out of the hpet code.
>
> Right off the bat I'm getting a multitude of
> (XEN) mm.c:3294:d0 Error while clearing mfn 100cbb7
> And a hung dom0 during initramfs. I'm a little baffled as to why, but it's
> there (32 bit dom0, XenServer6).

Curses, I knew there'd be one somewhere. I've been replacing
get_page_and_type_from_pagenr()s (which return 0 for success) with
old-school get_page_type()s (which return 1 for success) and not always
getting the right number of inversions. That's a horrible horrible
beartrap of an API, BTW, which had me cursing at the screen, but I had
enough on my plate yesterday without touching _that_ code too!

> > Andres, this is basically the big-hammer version of your "take a
> > pagecount" changes, plus the change you made to hvmemul_rep_movs().
> > If this works I intend to follow it up with a patch to make some of the
> > read-modify-write paths avoid taking the lock (by using a
> > compare-exchange operation so they only take the lock on a write). If
> > that succeeds I might drop put_gfn() altogether.
>
> You mean cmpxchg the whole p2m entry? I don't think I parse the plan.
> There are code paths that do get_gfn_query -> p2m_change_type -> put_gfn.
> But I guess those could lock the p2m up-front if they become the only
> consumers of put_gfn left.

Well, that's more or less what happens now. I was thinking of replacing
any remaining

(implicit) lock ; read ; think a bit ; maybe write ; unlock

code with the fast-path-friendlier:

read ; think ; maybe-cmpxchg (and on failure undo or retry

which avoids taking the write lock altogether if there's no work to do.
But maybe there aren't many of those left now. Obviously any path
which will always write should just take the write-lock first.

> > - grant-table operations still use the lock, because frankly I
> > could not follow the current code, and it's quite late in the evening.
>
> It's pretty complex with serious nesting, and ifdef's for arm and 32 bit.
> gfn_to_mfn_private callers will suffer from altering the current meaning,
> as put_gfn resolves to the right thing for the ifdef'ed arch. The other
> user is grant_transfer which also relies on the page *not* having an extra
> ref in steal_page. So it's a prime candidate to be left alone.

Sadly, I think it's not. The PV backends will be doing lots of grant
ops, which shouldn't get serialized against all other P2M lookups.

> > I also have a long list of uglinesses in the mm code that I found
>
> Uh, ugly stuff, how could that have happened?

I can't imagine. :) Certainly nothing to do with me thinking "I'll
clean that up when I get some time."

> I have a few preliminary observations on the patch. Pasting relevant bits
> here, since the body of the patch seems to have been lost by the email
> thread:
>
> @@ -977,23 +976,25 @@ int arch_set_info_guest(
> ...
> +
> + if (!paging_mode_refcounts(d)
> + && !get_page_and_type(cr3_page, d, PGT_l3_page_table) )
> replace with && !get_page_type() )

Yep.

> @@ -2404,32 +2373,33 @@ static enum hvm_copy_result __hvm_copy(
> gfn = addr >> PAGE_SHIFT;
> }
>
> - mfn = mfn_x(get_gfn_unshare(curr->domain, gfn, &p2mt));
> + page = get_page_from_gfn(curr->domain, gfn, &p2mt, P2M_UNSHARE);
> replace with (flags & HVMCOPY_to_guest) ? P2M_UNSHARE : P2M_ALLOC (and
> same logic when checking p2m_is_shared). Not truly related to your patch
> bit since we're at it.

OK, but not in this patch.

> Same, further down
> - if ( !p2m_is_ram(p2mt) )
> + if ( !page )
> {
> - put_gfn(curr->domain, gfn);
> + if ( page )
> + put_page(page);
> Last two lines are redundant

Yep.

> @@ -4019,35 +3993,16 @@ long do_hvm_op(unsigned long op, XEN_GUE
> case HVMOP_modified_memory: a lot of error checking has been removed.

Yes, but it was bogus - there's a race between the actual modification
and the call, during which anything might have happened. The best we
can do is throw log-dirty bits at everything, and the caller can't do
anything with the error anyway.

When I come to tidy up I'll just add a new mark_gfn_dirty function
and skip the pointless gfn->mfn->gfn translation on this path.

> arch/x86/mm.c:do_mmu_update -> you blew up all the paging/sharing checking
> for target gfns of mmu updates of l2/3/4 entries. It seems that this
> wouldn't work anyways, that's why you killed it?

Yeah - since only L1es can point at foreign mappings it was all just
noise, and even if there had been real p2m lookups on those paths there
was no equivalent to the translate-in-place that happens in
mod_l1_entry so it would have been broken in a much worse way.

> +++ b/xen/arch/x86/mm/hap/guest_walk.c
> @@ -54,34 +54,37 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
> ...
> + if ( !top_page )
> {
> pfec[0] &= ~PFEC_page_present;
> - __put_gfn(p2m, top_gfn);
> + put_page(top_page);
> top_page is NULL here, remove put_page

Yep.

> get_page_from_gfn_p2m, slow path: no need for p2m_lock/unlock since
> locking is already done by get_gfn_type_access/__put_gfn

Yeah, but I was writing that with half an eye on killing that lock. :)
I'll drop them for now.

> (hope those observations made sense without inlining them in the actual
> patch)

Yes, absolutely - thanks for the review!

If we can get this to work well enough I'll tidy it up into a sensible
series next week. In the meantime, an updated verison of the
monster patch is attached.

Cheers,

Tim.
Attachments: get-page-from-gfn (74.1 KB)


andres at lagarcavilla

Apr 27, 2012, 7:17 AM

Post #40 of 45 (206 views)
Permalink
Re: lock in vhpet [In reply to]

> At 20:02 -0700 on 26 Apr (1335470547), Andres Lagar-Cavilla wrote:
>> > Can you please try the attached patch? I think you'll need this one
>> > plus the ones that take the locks out of the hpet code.
>>
>> Right off the bat I'm getting a multitude of
>> (XEN) mm.c:3294:d0 Error while clearing mfn 100cbb7
>> And a hung dom0 during initramfs. I'm a little baffled as to why, but
>> it's
>> there (32 bit dom0, XenServer6).
>
> Curses, I knew there'd be one somewhere. I've been replacing
> get_page_and_type_from_pagenr()s (which return 0 for success) with
> old-school get_page_type()s (which return 1 for success) and not always
> getting the right number of inversions. That's a horrible horrible
> beartrap of an API, BTW, which had me cursing at the screen, but I had
> enough on my plate yesterday without touching _that_ code too!

Found more bugs. Some were predating this (xsm not calling put_gfn after
get_gfn_untyped, etc). Some were new (get_gfn_untyped not arch
independent).

I will be sending you shortly a small patch series on top of your mosnter
patch purely for RFC. Feel free to merge all, pick out bug fixes, etc.

Andres
>
>> > Andres, this is basically the big-hammer version of your "take a
>> > pagecount" changes, plus the change you made to hvmemul_rep_movs().
>> > If this works I intend to follow it up with a patch to make some of
>> the
>> > read-modify-write paths avoid taking the lock (by using a
>> > compare-exchange operation so they only take the lock on a write). If
>> > that succeeds I might drop put_gfn() altogether.
>>
>> You mean cmpxchg the whole p2m entry? I don't think I parse the plan.
>> There are code paths that do get_gfn_query -> p2m_change_type ->
>> put_gfn.
>> But I guess those could lock the p2m up-front if they become the only
>> consumers of put_gfn left.
>
> Well, that's more or less what happens now. I was thinking of replacing
> any remaining
>
> (implicit) lock ; read ; think a bit ; maybe write ; unlock
>
> code with the fast-path-friendlier:
>
> read ; think ; maybe-cmpxchg (and on failure undo or retry
>
> which avoids taking the write lock altogether if there's no work to do.
> But maybe there aren't many of those left now. Obviously any path
> which will always write should just take the write-lock first.
>
>> > - grant-table operations still use the lock, because frankly I
>> > could not follow the current code, and it's quite late in the
>> evening.
>>
>> It's pretty complex with serious nesting, and ifdef's for arm and 32
>> bit.
>> gfn_to_mfn_private callers will suffer from altering the current
>> meaning,
>> as put_gfn resolves to the right thing for the ifdef'ed arch. The other
>> user is grant_transfer which also relies on the page *not* having an
>> extra
>> ref in steal_page. So it's a prime candidate to be left alone.
>
> Sadly, I think it's not. The PV backends will be doing lots of grant
> ops, which shouldn't get serialized against all other P2M lookups.
>
>> > I also have a long list of uglinesses in the mm code that I found
>>
>> Uh, ugly stuff, how could that have happened?
>
> I can't imagine. :) Certainly nothing to do with me thinking "I'll
> clean that up when I get some time."
>
>> I have a few preliminary observations on the patch. Pasting relevant
>> bits
>> here, since the body of the patch seems to have been lost by the email
>> thread:
>>
>> @@ -977,23 +976,25 @@ int arch_set_info_guest(
>> ...
>> +
>> + if (!paging_mode_refcounts(d)
>> + && !get_page_and_type(cr3_page, d, PGT_l3_page_table) )
>> replace with && !get_page_type() )
>
> Yep.
>
>> @@ -2404,32 +2373,33 @@ static enum hvm_copy_result __hvm_copy(
>> gfn = addr >> PAGE_SHIFT;
>> }
>>
>> - mfn = mfn_x(get_gfn_unshare(curr->domain, gfn, &p2mt));
>> + page = get_page_from_gfn(curr->domain, gfn, &p2mt,
>> P2M_UNSHARE);
>> replace with (flags & HVMCOPY_to_guest) ? P2M_UNSHARE : P2M_ALLOC (and
>> same logic when checking p2m_is_shared). Not truly related to your patch
>> bit since we're at it.
>
> OK, but not in this patch.
>
>> Same, further down
>> - if ( !p2m_is_ram(p2mt) )
>> + if ( !page )
>> {
>> - put_gfn(curr->domain, gfn);
>> + if ( page )
>> + put_page(page);
>> Last two lines are redundant
>
> Yep.
>
>> @@ -4019,35 +3993,16 @@ long do_hvm_op(unsigned long op, XEN_GUE
>> case HVMOP_modified_memory: a lot of error checking has been
>> removed.
>
> Yes, but it was bogus - there's a race between the actual modification
> and the call, during which anything might have happened. The best we
> can do is throw log-dirty bits at everything, and the caller can't do
> anything with the error anyway.
>
> When I come to tidy up I'll just add a new mark_gfn_dirty function
> and skip the pointless gfn->mfn->gfn translation on this path.
>
>> arch/x86/mm.c:do_mmu_update -> you blew up all the paging/sharing
>> checking
>> for target gfns of mmu updates of l2/3/4 entries. It seems that this
>> wouldn't work anyways, that's why you killed it?
>
> Yeah - since only L1es can point at foreign mappings it was all just
> noise, and even if there had been real p2m lookups on those paths there
> was no equivalent to the translate-in-place that happens in
> mod_l1_entry so it would have been broken in a much worse way.
>
>> +++ b/xen/arch/x86/mm/hap/guest_walk.c
>> @@ -54,34 +54,37 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
>> ...
>> + if ( !top_page )
>> {
>> pfec[0] &= ~PFEC_page_present;
>> - __put_gfn(p2m, top_gfn);
>> + put_page(top_page);
>> top_page is NULL here, remove put_page
>
> Yep.
>
>> get_page_from_gfn_p2m, slow path: no need for p2m_lock/unlock since
>> locking is already done by get_gfn_type_access/__put_gfn
>
> Yeah, but I was writing that with half an eye on killing that lock. :)
> I'll drop them for now.
>
>> (hope those observations made sense without inlining them in the actual
>> patch)
>
> Yes, absolutely - thanks for the review!
>
> If we can get this to work well enough I'll tidy it up into a sensible
> series next week. In the meantime, an updated verison of the
> monster patch is attached.
>
> Cheers,
>
> Tim.
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


andres at lagarcavilla

Apr 27, 2012, 2:08 PM

Post #41 of 45 (207 views)
Permalink
Re: lock in vhpet [In reply to]

> At 20:02 -0700 on 26 Apr (1335470547), Andres Lagar-Cavilla wrote:
>> > Can you please try the attached patch? I think you'll need this one
>> > plus the ones that take the locks out of the hpet code.
>>
>> Right off the bat I'm getting a multitude of
>> (XEN) mm.c:3294:d0 Error while clearing mfn 100cbb7
>> And a hung dom0 during initramfs. I'm a little baffled as to why, but
>> it's
>> there (32 bit dom0, XenServer6).
>
> Curses, I knew there'd be one somewhere. I've been replacing
> get_page_and_type_from_pagenr()s (which return 0 for success) with
> old-school get_page_type()s (which return 1 for success) and not always
> getting the right number of inversions. That's a horrible horrible
> beartrap of an API, BTW, which had me cursing at the screen, but I had
> enough on my plate yesterday without touching _that_ code too!
>

I now am quite pleased with the testing results on our end. I have a
four-patch series to top up your monster patch, which I'll submit shortly.
I encourage everyone interested to test this (obviously a lot of code is
touched). Including AMD, as I've expanded the code to touch svm too.

>> > Andres, this is basically the big-hammer version of your "take a
>> > pagecount" changes, plus the change you made to hvmemul_rep_movs().
>> > If this works I intend to follow it up with a patch to make some of
>> the
>> > read-modify-write paths avoid taking the lock (by using a
>> > compare-exchange operation so they only take the lock on a write). If
>> > that succeeds I might drop put_gfn() altogether.
>>
>> You mean cmpxchg the whole p2m entry? I don't think I parse the plan.
>> There are code paths that do get_gfn_query -> p2m_change_type ->
>> put_gfn.
>> But I guess those could lock the p2m up-front if they become the only
>> consumers of put_gfn left.
>
> Well, that's more or less what happens now. I was thinking of replacing
> any remaining
>
> (implicit) lock ; read ; think a bit ; maybe write ; unlock
>
> code with the fast-path-friendlier:
>
> read ; think ; maybe-cmpxchg (and on failure undo or retry
>
> which avoids taking the write lock altogether if there's no work to do.
> But maybe there aren't many of those left now. Obviously any path
> which will always write should just take the write-lock first.

After my four patches there aren't really any paths like the above left
(exhaustion disclaimer). I believe one or two iterative paths (like
HVMOP_set_mem_type) could be optimized to take the p2m lock up front,
instead of many get_gfn's.

The nice thing about get_gfn/put_gfn is that it will allow for seamless
(har har) transition to a fine-grained p2m. But then maybe we won't ever
need that with a p2m rwlock.

>
>> > - grant-table operations still use the lock, because frankly I
>> > could not follow the current code, and it's quite late in the
>> evening.
>>
>> It's pretty complex with serious nesting, and ifdef's for arm and 32
>> bit.
>> gfn_to_mfn_private callers will suffer from altering the current
>> meaning,
>> as put_gfn resolves to the right thing for the ifdef'ed arch. The other
>> user is grant_transfer which also relies on the page *not* having an
>> extra
>> ref in steal_page. So it's a prime candidate to be left alone.
>
> Sadly, I think it's not. The PV backends will be doing lots of grant
> ops, which shouldn't get serialized against all other P2M lookups.
>

Those are addressed in my patch series now, which should case waves of panic.

Andres

>> > I also have a long list of uglinesses in the mm code that I found
>>
>> Uh, ugly stuff, how could that have happened?
>
> I can't imagine. :) Certainly nothing to do with me thinking "I'll
> clean that up when I get some time."
>
>> I have a few preliminary observations on the patch. Pasting relevant
>> bits
>> here, since the body of the patch seems to have been lost by the email
>> thread:
>>
>> @@ -977,23 +976,25 @@ int arch_set_info_guest(
>> ...
>> +
>> + if (!paging_mode_refcounts(d)
>> + && !get_page_and_type(cr3_page, d, PGT_l3_page_table) )
>> replace with && !get_page_type() )
>
> Yep.
>
>> @@ -2404,32 +2373,33 @@ static enum hvm_copy_result __hvm_copy(
>> gfn = addr >> PAGE_SHIFT;
>> }
>>
>> - mfn = mfn_x(get_gfn_unshare(curr->domain, gfn, &p2mt));
>> + page = get_page_from_gfn(curr->domain, gfn, &p2mt,
>> P2M_UNSHARE);
>> replace with (flags & HVMCOPY_to_guest) ? P2M_UNSHARE : P2M_ALLOC (and
>> same logic when checking p2m_is_shared). Not truly related to your patch
>> bit since we're at it.
>
> OK, but not in this patch.
>
>> Same, further down
>> - if ( !p2m_is_ram(p2mt) )
>> + if ( !page )
>> {
>> - put_gfn(curr->domain, gfn);
>> + if ( page )
>> + put_page(page);
>> Last two lines are redundant
>
> Yep.
>
>> @@ -4019,35 +3993,16 @@ long do_hvm_op(unsigned long op, XEN_GUE
>> case HVMOP_modified_memory: a lot of error checking has been
>> removed.
>
> Yes, but it was bogus - there's a race between the actual modification
> and the call, during which anything might have happened. The best we
> can do is throw log-dirty bits at everything, and the caller can't do
> anything with the error anyway.
>
> When I come to tidy up I'll just add a new mark_gfn_dirty function
> and skip the pointless gfn->mfn->gfn translation on this path.
>
>> arch/x86/mm.c:do_mmu_update -> you blew up all the paging/sharing
>> checking
>> for target gfns of mmu updates of l2/3/4 entries. It seems that this
>> wouldn't work anyways, that's why you killed it?
>
> Yeah - since only L1es can point at foreign mappings it was all just
> noise, and even if there had been real p2m lookups on those paths there
> was no equivalent to the translate-in-place that happens in
> mod_l1_entry so it would have been broken in a much worse way.
>
>> +++ b/xen/arch/x86/mm/hap/guest_walk.c
>> @@ -54,34 +54,37 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
>> ...
>> + if ( !top_page )
>> {
>> pfec[0] &= ~PFEC_page_present;
>> - __put_gfn(p2m, top_gfn);
>> + put_page(top_page);
>> top_page is NULL here, remove put_page
>
> Yep.
>
>> get_page_from_gfn_p2m, slow path: no need for p2m_lock/unlock since
>> locking is already done by get_gfn_type_access/__put_gfn
>
> Yeah, but I was writing that with half an eye on killing that lock. :)
> I'll drop them for now.
>
>> (hope those observations made sense without inlining them in the actual
>> patch)
>
> Yes, absolutely - thanks for the review!
>
> If we can get this to work well enough I'll tidy it up into a sensible
> series next week. In the meantime, an updated verison of the
> monster patch is attached.
>
> Cheers,
>
> Tim.
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

May 16, 2012, 4:36 AM

Post #42 of 45 (181 views)
Permalink
Re: lock in vhpet [In reply to]

Hi tim,

Did the attached patch apply to upstream xen? I tried the latest xen and still saw the high cpu utilization.

best regards
yang

> -----Original Message-----
> From: Tim Deegan [mailto:tim [at] xen]
> Sent: Friday, April 27, 2012 5:26 AM
> To: Zhang, Yang Z
> Cc: andres [at] lagarcavilla; Keir Fraser; xen-devel [at] lists
> Subject: Re: [Xen-devel] lock in vhpet
>
> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > > > But actually, the first cs introduced this issue is 24770. When win8
> > > > booting and if hpet is enabled, it will use hpet as the time source
> > > > and there have lots of hpet access and EPT violation. In EPT violation
> > > > handler, it call get_gfn_type_access to get the mfn. The cs 24770
> > > > introduces the gfn_lock for p2m lookups, and then the issue happens.
> > > > After I removed the gfn_lock, the issue goes. But in latest xen, even
> > > > I remove this lock, it still shows high cpu utilization.
> > >
> > > It would seem then that even the briefest lock-protected critical section
> would
> > > cause this? In the mmio case, the p2m lock taken in the hap fault handler is
> > > held during the actual lookup, and for a couple of branch instructions
> > > afterwards.
> > >
> > > In latest Xen, with lock removed for get_gfn, on which lock is time spent?
> > Still the p2m_lock.
>
> Can you please try the attached patch? I think you'll need this one
> plus the ones that take the locks out of the hpet code.
>
> This patch makes the p2m lock into an rwlock and adjusts a number of the
> paths that don't update the p2m so they only take the read lock. It's a
> bit rough but I can boot 16-way win7 guest with it.
>
> N.B. Since rwlocks don't show up the the existing lock profiling, please
> don't try to use the lock-profiling numbers to see if it's helping!
>
> Andres, this is basically the big-hammer version of your "take a
> pagecount" changes, plus the change you made to hvmemul_rep_movs().
> If this works I intend to follow it up with a patch to make some of the
> read-modify-write paths avoid taking the lock (by using a
> compare-exchange operation so they only take the lock on a write). If
> that succeeds I might drop put_gfn() altogether.
>
> But first it will need a lot of tidying up. Noticeably missing:
> - SVM code equivalents to the vmx.c changes
> - grant-table operations still use the lock, because frankly I
> could not follow the current code, and it's quite late in the evening.
> I also have a long list of uglinesses in the mm code that I found while
> writing this lot.
>
> Keir, I have no objection to later replacing this with something better
> than an rwlock. :) Or with making a NUMA-friendly rwlock
> implementation, since I really expect this to be heavily read-mostly
> when paging/sharing/pod are not enabled.
>
> Cheers,
>
> Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


tim at xen

May 16, 2012, 5:36 AM

Post #43 of 45 (184 views)
Permalink
Re: lock in vhpet [In reply to]

Hi,

At 11:36 +0000 on 16 May (1337168174), Zhang, Yang Z wrote:
> Hi tim,
>
> Did the attached patch apply to upstream xen? I tried the latest xen
> and still saw the high cpu utilization.

The patch is not yet applied. It's been cleaned up into a patch series
that I posted last Thursday, and will probably be applied later this
week.

Cheers,

Tim.

> best regards
> yang
>
> > -----Original Message-----
> > From: Tim Deegan [mailto:tim [at] xen]
> > Sent: Friday, April 27, 2012 5:26 AM
> > To: Zhang, Yang Z
> > Cc: andres [at] lagarcavilla; Keir Fraser; xen-devel [at] lists
> > Subject: Re: [Xen-devel] lock in vhpet
> >
> > At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > > > > But actually, the first cs introduced this issue is 24770. When win8
> > > > > booting and if hpet is enabled, it will use hpet as the time source
> > > > > and there have lots of hpet access and EPT violation. In EPT violation
> > > > > handler, it call get_gfn_type_access to get the mfn. The cs 24770
> > > > > introduces the gfn_lock for p2m lookups, and then the issue happens.
> > > > > After I removed the gfn_lock, the issue goes. But in latest xen, even
> > > > > I remove this lock, it still shows high cpu utilization.
> > > >
> > > > It would seem then that even the briefest lock-protected critical section
> > would
> > > > cause this? In the mmio case, the p2m lock taken in the hap fault handler is
> > > > held during the actual lookup, and for a couple of branch instructions
> > > > afterwards.
> > > >
> > > > In latest Xen, with lock removed for get_gfn, on which lock is time spent?
> > > Still the p2m_lock.
> >
> > Can you please try the attached patch? I think you'll need this one
> > plus the ones that take the locks out of the hpet code.
> >
> > This patch makes the p2m lock into an rwlock and adjusts a number of the
> > paths that don't update the p2m so they only take the read lock. It's a
> > bit rough but I can boot 16-way win7 guest with it.
> >
> > N.B. Since rwlocks don't show up the the existing lock profiling, please
> > don't try to use the lock-profiling numbers to see if it's helping!
> >
> > Andres, this is basically the big-hammer version of your "take a
> > pagecount" changes, plus the change you made to hvmemul_rep_movs().
> > If this works I intend to follow it up with a patch to make some of the
> > read-modify-write paths avoid taking the lock (by using a
> > compare-exchange operation so they only take the lock on a write). If
> > that succeeds I might drop put_gfn() altogether.
> >
> > But first it will need a lot of tidying up. Noticeably missing:
> > - SVM code equivalents to the vmx.c changes
> > - grant-table operations still use the lock, because frankly I
> > could not follow the current code, and it's quite late in the evening.
> > I also have a long list of uglinesses in the mm code that I found while
> > writing this lot.
> >
> > Keir, I have no objection to later replacing this with something better
> > than an rwlock. :) Or with making a NUMA-friendly rwlock
> > implementation, since I really expect this to be heavily read-mostly
> > when paging/sharing/pod are not enabled.
> >
> > Cheers,
> >
> > Tim.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


tim at xen

May 17, 2012, 3:57 AM

Post #44 of 45 (181 views)
Permalink
Re: lock in vhpet [In reply to]

Hi,

At 13:36 +0100 on 16 May (1337175361), Tim Deegan wrote:
> At 11:36 +0000 on 16 May (1337168174), Zhang, Yang Z wrote:
> > Did the attached patch apply to upstream xen? I tried the latest xen
> > and still saw the high cpu utilization.
>
> The patch is not yet applied. It's been cleaned up into a patch series
> that I posted last Thursday, and will probably be applied later this
> week.

It's now been applied to the staging tree, as csets 25350--25360.

Cheers,

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


yang.z.zhang at intel

May 27, 2012, 11:54 PM

Post #45 of 45 (183 views)
Permalink
Re: lock in vhpet [In reply to]

> -----Original Message-----
> From: Tim Deegan [mailto:tim [at] xen]
> Sent: Thursday, May 17, 2012 6:58 PM
> To: Zhang, Yang Z
> Cc: xen-devel [at] lists; Keir Fraser; andres [at] lagarcavilla
> Subject: Re: [Xen-devel] lock in vhpet
>
> Hi,
>
> At 13:36 +0100 on 16 May (1337175361), Tim Deegan wrote:
> > At 11:36 +0000 on 16 May (1337168174), Zhang, Yang Z wrote:
> > > Did the attached patch apply to upstream xen? I tried the latest xen
> > > and still saw the high cpu utilization.
> >
> > The patch is not yet applied. It's been cleaned up into a patch series
> > that I posted last Thursday, and will probably be applied later this
> > week.
>
> It's now been applied to the staging tree, as csets 25350--25360.

It's a great work! With one week's testing, we didn't find any regression with those patches.

best regards
yang

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel

First page Previous page 1 2 Next page Last page  View All Xen devel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.