konrad.wilk at oracle
May 7, 2012, 9:07 AM
Re: xsave=0 workaround needed on 3.2 kernels with Xen 4.1 or Xen-unstable.
On Mon, May 07, 2012 at 08:15:44AM +0100, Jan Beulich wrote:
> >>> On 04.05.12 at 21:30, AP <apxeng [at] gmail> wrote:
> > From the above I realized that X86_CR4_OSXSAVE was never getting set
> > in v->arch.pv_vcpu.ctrlreg.
> Yes, that was the observation in the previous thread too, but the
> reporter didn't seem interested in continuing on from there.
> > So I tried the following patch:
> > diff -r 5a0d60bb536b xen/arch/x86/domain.c
> > --- a/xen/arch/x86/domain.c Fri Apr 27 21:10:59 2012 -0700
> > +++ b/xen/arch/x86/domain.c Fri May 04 12:23:57 2012 -0700
> > @@ -691,8 +691,6 @@ unsigned long pv_guest_cr4_fixup(const s
> > hv_cr4_mask &= ~X86_CR4_DE;
> > if ( cpu_has_fsgsbase && !is_pv_32bit_domain(v->domain) )
> > hv_cr4_mask &= ~X86_CR4_FSGSBASE;
> > - if ( xsave_enabled(v) )
> > - hv_cr4_mask &= ~X86_CR4_OSXSAVE;
> > if ( (guest_cr4 & hv_cr4_mask) != (hv_cr4 & hv_cr4_mask) )
> > gdprintk(XENLOG_WARNING,
> > diff -r 5a0d60bb536b xen/include/asm-x86/domain.h
> > --- a/xen/include/asm-x86/domain.h Fri Apr 27 21:10:59 2012 -0700
> > +++ b/xen/include/asm-x86/domain.h Fri May 04 12:23:57 2012 -0700
> > @@ -530,7 +530,7 @@ unsigned long pv_guest_cr4_fixup(const s
> > & ~X86_CR4_DE)
> > #define real_cr4_to_pv_guest_cr4(c) \
> > ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD \
> > - | X86_CR4_OSXSAVE | X86_CR4_SMEP))
> > + | X86_CR4_SMEP))
> > void domain_cpuid(struct domain *d,
> > unsigned int input,
> No, this is specifically the wrong thing. From what we know so far
> (i.e. the outcome of the above printing you added) the problem is
> in the Dom0 kernel (in it never setting CR4.OSXSAVE prior to
> attempting XSETBV). What your patch effectively does is take away
> control from the guest kernels to control the (virtual) CR4 flag...
> > That allowed the system to boot successfully though I did see the
> > following message:
> > (XEN) domain.c:698:d0 Attempt to change CR4 flags 00042660 -> 00002660
> ... which is what this message is telling you.
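For reference, the two values in that message differ in exactly one bit. A minimal sketch in C, assuming only the architectural CR4 bit positions (the helper name cr4_changed_bits is made up for illustration and is not a Xen or Linux function):

```c
/* Architectural CR4 bit positions (x86). */
#define X86_CR4_PSE      0x00000010UL  /* bit 4  */
#define X86_CR4_PGE      0x00000080UL  /* bit 7  */
#define X86_CR4_OSXSAVE  0x00040000UL  /* bit 18 */

/*
 * Hypothetical helper: return the set of bits that differ between
 * the two CR4 values printed in an "Attempt to change CR4 flags"
 * message.
 */
static unsigned long cr4_changed_bits(unsigned long from, unsigned long to)
{
    return from ^ to;
}
```

Feeding it 0x00042660 and 0x00002660 yields 0x00040000, so the only flag in dispute is CR4.OSXSAVE (bit 18).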
> > Not sure if the above patch is right fix but I hope it was at least
> > helpful in pointing at where the problem might be.
> > BTW, I see the same invalid op issue with Xen 4.1.2 if I boot with xsave=1.
> Sure, as it's a kernel problem. It's the kernel that needs logging added,
> to find out why the CR4 write supposedly happening immediately
> prior to the XSETBV (set_in_cr4(X86_CR4_OSXSAVE)) doesn't actually
> happen, or doesn't set the flag. Perhaps something fishy going on
> with the paravirt ops patching, since the disassembly of the opcode
> bytes shown with the oops message indicates that the right
> thing is being attempted:
> ff 14 25 10 33 c1 81 callq *0xffffffff81c13310 [so xen_read_cr4, which is native_read_cr4]
> 48 89 c7 mov %rax,%rdi
> 48 81 cf 00 00 04 00 or $0x40000,%rdi
> ff 14 25 18 33 c1 81 callq *0xffffffff81c13318 [so xen_write_cr4, which is filtering X86_CR4_PGE and X86_CR4_PSE]
> 48 8b 05 0d 15 db 00 mov 0xdb150d(%rip),%rax
> 31 c9 xor %ecx,%ecx
> 48 89 c2 mov %rax,%rdx
> 48 c1 ea 20 shr $0x20,%rdx
> 0f 01 d1 xsetbv
> 5d pop %rbp
> c3 retq
> The primary thing that strikes me as odd is that both calls are still
> indirect ones, even though I thought that they should get replaced
> by direct ones (or even the actual instruction, namely in the
> read_cr4() case) upon first use. Konrad, Jeremy - am I wrong here?
They do get replaced (during runtime - and this is done by the
alternative_instructions). Is this output from objdump or straight
from the memory (so using the Xen debugger)?
> And the dumped %rdi value indicates that bit 18 did _not_ get set.
That would imply that xen_write_cr4, which is just a mov to cr4,
is getting trapped but somehow the hypervisor isn't setting the
rdi value? Or maybe the native_write_cr4 ends up
filtering in the wrong order?
0xffffffff8102e650 <xen_write_cr4>: push %rbp
0xffffffff8102e651 <xen_write_cr4+1>: and $0x6f,%dil
0xffffffff8102e655 <xen_write_cr4+5>: mov %rsp,%rbp
0xffffffff8102e658 <xen_write_cr4+8>: mov %rdi,%cr4
0xffffffff8102e65b <xen_write_cr4+11>: leaveq
0xffffffff8102e65c <xen_write_cr4+12>: retq
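
As a quick sanity check of the mask in that disassembly, here is a minimal sketch in C that models only the `and $0x6f,%dil` step. It is not kernel code: the function name model_and_6f_dil is made up, and the model assumes nothing else touches %rdi between the `or $0x40000,%rdi` in the caller and the `mov %rdi,%cr4`.

```c
#include <stdint.h>

/* Architectural CR4 bit positions (x86). */
#define X86_CR4_PSE      0x00000010UL  /* bit 4  */
#define X86_CR4_PGE      0x00000080UL  /* bit 7  */
#define X86_CR4_OSXSAVE  0x00040000UL  /* bit 18 */

/*
 * Hypothetical model of `and $0x6f,%dil`: %dil is the low byte of
 * %rdi, so only bits 0-7 are masked and bits 8-63 pass through
 * unchanged. 0x6f clears bit 4 (PSE) and bit 7 (PGE) and nothing else.
 */
static uint64_t model_and_6f_dil(uint64_t rdi)
{
    uint64_t low = (rdi & 0xffu) & 0x6fu;   /* masked low byte       */
    return (rdi & ~(uint64_t)0xff) | low;   /* upper bits untouched  */
}
```

Under that model, an input with OSXSAVE, PGE and PSE all set (for example 0x426f0) comes out as 0x42660: PGE and PSE are dropped, bit 18 survives. So this filtering by itself would not explain a missing OSXSAVE bit.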