
Mailing List Archive: Xen: Devel

Stale mfns in update_queue in XenoLinux, and suspend/resume

 

 



jacob at melon

Apr 4, 2004, 9:02 AM

Post #1 of 2 (59 views)
Stale mfns in update_queue in XenoLinux, and suspend/resume

Hi,

It seems the suspend code in arch/xen/kernel/setup.c does not flush the
mmu_update queue prior to suspend, so the guest may crash after
resumption because of stale machine page frame references left in
the queue. Is this correct, and should this behaviour be fixed? I am
currently investigating a crash in my own migration code, and though I
do flush the queue prior to obtaining a checkpoint, I still seem to be
hit occasionally by stale references somewhere.

If suspension is going to be safe, I guess all uses of machine addresses
should be treated as critical regions, to make sure a suspend/resume
does not happen while they are still in scope? I know this will be
problematic because of the batching of mmu-updates. Perhaps it would be
wise to revert to the old behavior of specifying them as virtual
addresses, or maybe they should be converted on the fly, in a cli()
context right before the hypercall?

Jacob
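
The "convert on the fly" idea above can be sketched roughly as follows.
This is only an illustration, not code from the XenoLinux tree: every
name here (queue_update, flush_updates, fake_cli, fake_hypercall,
pfn_to_mfn_table) is a hypothetical stand-in, and it assumes the queue
stores guest-stable frame numbers that are translated to machine frame
numbers only at flush time, with interrupts disabled.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not XenoLinux symbols. */
#define QUEUE_LEN 8
#define NR_FRAMES 4

/* Queued form: guest-stable frame number, valid across suspend/resume. */
typedef struct { unsigned long pfn; unsigned long val; } queued_update_t;
/* Form handed to the (simulated) hypervisor: machine frame number. */
typedef struct { unsigned long mfn; unsigned long val; } machine_update_t;

/* Simulated translation table; it may change across a resume. */
static unsigned long pfn_to_mfn_table[NR_FRAMES] = {100, 101, 102, 103};

static queued_update_t queue[QUEUE_LEN];
static size_t queue_idx;

/* Record of what the last simulated hypercall received. */
static machine_update_t last_batch[QUEUE_LEN];
static size_t last_batch_len;

static int irqs_disabled;
static void fake_cli(void) { irqs_disabled = 1; }
/* A real kernel would save and restore the flags instead. */
static void fake_sti(void) { irqs_disabled = 0; }

static void fake_hypercall(const machine_update_t *req, size_t n)
{
    size_t i;
    assert(irqs_disabled); /* machine addresses only exist in here */
    for (i = 0; i < n; i++)
        last_batch[i] = req[i];
    last_batch_len = n;
}

/* Translate to machine frames as late as possible, then submit. */
static void flush_updates(void)
{
    machine_update_t batch[QUEUE_LEN];
    size_t i;

    fake_cli();
    for (i = 0; i < queue_idx; i++) {
        batch[i].mfn = pfn_to_mfn_table[queue[i].pfn];
        batch[i].val = queue[i].val;
    }
    fake_hypercall(batch, queue_idx);
    queue_idx = 0;
    fake_sti();
}

/* Batch an update without touching any machine address. */
static void queue_update(unsigned long pfn, unsigned long val)
{
    assert(pfn < NR_FRAMES);
    if (queue_idx == QUEUE_LEN)
        flush_updates();
    queue[queue_idx].pfn = pfn;
    queue[queue_idx].val = val;
    queue_idx++;
}
```

With this layout an entry queued before a suspend picks up whatever
translation is current when flush_updates() finally runs, so nothing in
the queue itself can go stale. The cost is an extra translation pass per
flush.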



-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
https://lists.sourceforge.net/lists/listinfo/xen-devel


Keir.Fraser at cl

Apr 4, 2004, 11:48 PM

Post #2 of 2 (55 views)
Re: Stale mfns in update_queue in XenoLinux, and suspend/resume

> It seems the suspend code in arch/xen/kernel/setup.c does not flush the
> mmu_update queue prior to suspend, so the guest may crash after
> resumption because of stale machine page frame references left in
> the queue. Is this correct, and should this behaviour be fixed? I am
> currently investigating a crash in my own migration code, and though I
> do flush the queue prior to obtaining a checkpoint, I still seem to be
> hit occasionally by stale references somewhere.
>
> If suspension is going to be safe, I guess all uses of machine addresses
> should be treated as critical regions, to make sure a suspend/resume
> does not happen while they are still in scope? I know this will be
> problematic because of the batching of mmu-updates. Perhaps it would be
> wise to revert to the old behavior of specifying them as virtual
> addresses, or maybe they should be converted on the fly, in a cli()
> context right before the hypercall?

Suspend/resume occurs in a process context. Since XenoLinux is
uniprocessor, I think that this should mean that there are no
outstanding page-update requests. Thinking about it, though, it's
possible that interrupt handlers and softirqs may add stuff to the
update queue. For safety you might want to flush it immediately after
__cli().

-- Keir
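
A minimal sketch of the ordering suggested above: disable interrupts
first, then flush, so nothing can be queued between the flush and the
suspend. This is a user-space simulation under stated assumptions, not
the code in arch/xen/kernel/setup.c; all sketch_* names are
hypothetical, and the interrupt handler is modelled as a plain function
that does nothing while interrupts are off.

```c
#include <assert.h>
#include <stddef.h>

/* All sketch_* names are hypothetical, for illustration only. */
#define QUEUE_LEN 8

typedef struct { unsigned long ptr; unsigned long val; } update_t;

static update_t update_queue[QUEUE_LEN];
static size_t pending;   /* entries queued but not yet submitted */
static size_t applied;   /* entries handed to the simulated hypervisor */
static int irqs_disabled;

static void sketch_cli(void) { irqs_disabled = 1; }
static void sketch_sti(void) { irqs_disabled = 0; }

/* Submit everything in the queue (simulated) and empty it. */
static void sketch_flush_queue(void)
{
    applied += pending;
    pending = 0;
}

static void sketch_queue_update(unsigned long ptr, unsigned long val)
{
    if (pending == QUEUE_LEN)
        sketch_flush_queue();
    update_queue[pending].ptr = ptr;
    update_queue[pending].val = val;
    pending++;
}

/* Models an interrupt handler or softirq that queues an update.
 * It can only run while interrupts are enabled. */
static void sketch_irq_handler(void)
{
    if (irqs_disabled)
        return;
    sketch_queue_update(0x1000UL, 0x1UL);
}

/* Returns the number of entries still queued at the point where the
 * suspend hypercall would be issued. Zero means nothing can go stale. */
static size_t sketch_suspend(void)
{
    size_t stale;

    sketch_irq_handler();  /* an interrupt sneaks in before the cli */
    sketch_cli();
    sketch_flush_queue();  /* flush AFTER interrupts are off */
    sketch_irq_handler();  /* no effect: interrupts are disabled */
    stale = pending;       /* suspend would happen here */
    sketch_sti();
    return stale;
}
```

Flushing before sketch_cli() instead would leave a window in which the
handler could queue an entry that survives into the suspended image.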



