stefano.stabellini at eu
May 20, 2013, 6:44 AM
[Hackathon minutes] PV frontends/backends and NUMA machines
These are my notes from the discussion that we had at the Hackathon
regarding PV frontends and backends running on NUMA machines.
The problem: how can we make sure that frontends and backends run in the
same NUMA node?
We would need to run one backend kthread per NUMA node. We already have
one kthread per netback vif (one per guest); we could pin each of them
to a different NUMA node, the same one the frontend is running on.
But that means that dom0 would be running on several NUMA nodes at once.
How much of a performance penalty would that be?
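As a rough illustration of the pinning idea, here is a userspace sketch in Python. It assumes dom0 can see the NUMA topology in sysfs (which is exactly the missing piece discussed in these notes) and that the backend thread accepts an affinity change from userspace. The helper names are made up for the example, and a real implementation would more likely bind the kthread inside netback itself.

```python
import os

def parse_cpulist(text):
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a node without CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def node_cpus(node):
    """Read the CPUs that belong to a NUMA node from sysfs (Linux only)."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        return parse_cpulist(f.read())

def pin_to_node(pid, node):
    """Restrict the thread with id `pid` to the CPUs of NUMA node `node`."""
    os.sched_setaffinity(pid, node_cpus(node))
```

For example, pin_to_node(tid, 1) would confine the thread with id tid to node 1, provided node 1 is visible to dom0 and has at least one CPU.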
We would need to export NUMA information to dom0, so that dom0 can make
smart decisions on memory allocations and we would also need to allocate
memory for dom0 from multiple nodes.
We need a way to automatically allocate the initial dom0 memory in Xen
in a NUMA-aware way and we need Xen to automatically create one dom0 vcpu
per NUMA node.
After dom0 boots, the toolstack is going to decide where to place any
new guests: it allocates the memory from the NUMA node it wants to run
the guest on and it is going to ask dom0 to allocate the kthread from
that node too. (Maybe by writing the NUMA node to xenstore.)
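The xenstore idea could look roughly like the sketch below. A plain dict stands in for xenstore so the example is self-contained. The backend path follows the usual vif backend layout, but the "numa-node" key is purely hypothetical: nothing like it was agreed on in the discussion.

```python
def vif_backend_path(domid, devid, dom0_id=0):
    """Backend directory of a guest's vif, following the usual xenstore layout."""
    return f"/local/domain/{dom0_id}/backend/vif/{domid}/{devid}"

def publish_node(store, domid, devid, node):
    """Toolstack side: record the NUMA node chosen for the guest.

    `store` is a dict standing in for xenstore; 'numa-node' is a
    hypothetical key name used only for this illustration.
    """
    store[vif_backend_path(domid, devid) + "/numa-node"] = str(node)

def node_for_vif(store, domid, devid, default=None):
    """Backend side: look up which node the vif's kthread should run on."""
    value = store.get(vif_backend_path(domid, devid) + "/numa-node")
    return int(value) if value is not None else default
```

The backend would read the key when the vif is created and fall back to its current behaviour when the key is absent.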
We need to make sure that the interrupts/MSIs coming from the NIC arrive
on the same pcpu that is running the vcpu that needs to receive them.
We need to do irqbalancing in dom0, then Xen will automatically make the
physical MSIs follow the vcpu.
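To make the dom0 side of the irqbalancing step concrete, here is a small sketch that builds the hexadecimal CPU mask Linux expects in /proc/irq/<irq>/smp_affinity and writes it. Inside dom0 the CPU numbers are dom0 vcpus, not pcpus; the function names are illustrative, and choosing which IRQ goes to which vcpu is left to the caller.

```python
def affinity_mask(cpus):
    """Format a set of CPU ids as an smp_affinity mask.

    The mask is hexadecimal, split into comma-separated 32-bit groups
    with the most significant group first.
    """
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    groups = []
    while True:
        groups.append(bits & 0xFFFFFFFF)
        bits >>= 32
        if bits == 0:
            break
    groups.reverse()
    # The leading group is not zero-padded; the remaining ones are.
    return ",".join([f"{groups[0]:x}"] + [f"{g:08x}" for g in groups[1:]])

def set_irq_affinity(irq, cpus):
    """Ask the kernel to deliver `irq` only to the given CPUs (needs root)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(affinity_mask(cpus))
```

With this, set_irq_affinity(irq, {2}) would steer an interrupt to dom0 vcpu 2, and per the notes above Xen would then make the physical MSI follow that vcpu.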
If the card is multiqueue we need to make sure that we use the multiple
queues so that we can have different sources of interrupts/MSIs for
each vif. This allows us to independently notify each dom0 vcpu.