
Mailing List Archive: OpenStack: Dev

[nova] Disk attachment consistency

 

 



vishvananda at gmail

Aug 13, 2012, 8:35 PM

Post #1 of 13 (491 views)
Permalink
[nova] Disk attachment consistency

Hey Everyone,

Overview
--------

One of the things that we are striving for in nova is interface consistency, that is, we'd like someone to be able to use an openstack cloud without knowing or caring which hypervisor is running underneath. There is a nasty bit of inconsistency in the way that disks are hot attached to vms that shows through to the user. I've been debating ways to minimize this and I have some issues I need feedback on.

Background
----------

There are three issues contributing to the bad user experience of attaching volumes.

1) The api we present for attaching a volume to an instance has a parameter called device. This is presented as where to attach the disk in the guest.

2) Xen picks minor device numbers on the host hypervisor side and the guest driver follows instructions

3) KVM picks minor device numbers on the guest driver side and doesn't expose them to the host hypervisor side

Resulting Issues
----------------

a) The device name only makes sense for linux. FreeBSD will select different device names, and windows doesn't even use device names. In addition xen uses /dev/xvda and kvm uses /dev/vda

b) The device sent in kvm will not match where it actually shows up. We can consistently guess where it will show up if the guest kernel is >= 3.2, otherwise we are likely to be wrong, and it may change on a reboot anyway


Long term solutions
------------------

We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.

The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen and kvm with guest kernel 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter).

(review at https://review.openstack.org/#/c/10908/)
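
As a rough illustration of what "automatically generate a valid device name" could look like, here is a minimal sketch. It is not the code in the review; the function name and the single-letter limit are assumptions made for the example.

```python
import string


def next_device_name(used, prefix="/dev/vd"):
    """Return the first unused single-letter device name under `prefix`.

    `used` is an iterable of device names already attached to the
    instance. Hypothetical helper for illustration only.
    """
    taken = set(used)
    for letter in string.ascii_lowercase:
        candidate = prefix + letter
        if candidate not in taken:
            return candidate
    raise ValueError("no free single-letter device name under %s" % prefix)
```

With /dev/vda and /dev/vdb in use this picks /dev/vdc, which matches what a kvm guest with a 3.2+ kernel is expected to show; on older guests the real name may still differ.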

The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.

(review coming soon)
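
To make the second proposal concrete, here is a sketch that builds a libvirt disk definition carrying a serial number and derives the path the guest should see. libvirt's disk XML does have a <serial> element; everything else (function names, the source path, using the device basename as the serial) is an assumption for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET


def disk_xml_with_serial(source_path, device):
    """Build a libvirt <disk> element whose serial is the device basename."""
    serial = posixpath.basename(device)  # "/dev/vdb" -> "vdb"
    disk = ET.Element("disk", {"type": "block", "device": "disk"})
    ET.SubElement(disk, "source", {"dev": source_path})
    ET.SubElement(disk, "target", {"dev": serial, "bus": "virtio"})
    ET.SubElement(disk, "serial").text = serial
    return ET.tostring(disk, encoding="unicode")


def guest_by_id_path(device):
    """Path udev is expected to create in a linux guest for that serial."""
    return "/dev/disk/by-id/virtio-" + posixpath.basename(device)
```

The by-id link depends on the guest's udev rules, so this only applies to linux guests that ship the usual persistent-storage rules.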

First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?

Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
I see two options:
a) automatically convert it to the right value and return it
b) fail with an error message
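
The two options can be sketched side by side. This is only an illustration under assumed names (the prefix table and the `strict` flag are hypothetical), not existing nova behavior.

```python
import re

# Assumed mapping from hypervisor to the prefix its linux guests use.
HYPERVISOR_PREFIX = {"kvm": "vd", "xen": "xvd"}


def convert_device(device, hypervisor, strict=False):
    """Option a) rewrites the prefix; option b) (strict=True) raises."""
    match = re.match(r"^/dev/(xvd|vd)([a-z]+)$", device)
    if match is None:
        raise ValueError("unrecognized device name: %s" % device)
    prefix, suffix = match.groups()
    wanted = HYPERVISOR_PREFIX[hypervisor]
    if prefix != wanted and strict:
        raise ValueError("%s is not valid for %s; use /dev/%s%s"
                         % (device, hypervisor, wanted, suffix))
    return "/dev/" + wanted + suffix
```

Either way the caller gets the name that was actually used back, so option a) stays visible to the user through the return value.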

Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work. For example the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:

a) let the attach go through as is.
advantages: it will allow scripts to work without having to manually find the next device.
disadvantages: the device name will never be correct in the guest
b) automatically modify the request to attach at /dev/vdc and return it
advantages: the device name will be correct some of the time (kvm guests with newer kernels)
disadvantages: sometimes the name is wrong anyway. The user may not expect the device number to change
c) fail and say, the next disk must be attached at /dev/vdc:
advantages: explicit
disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)

The second proposal earlier will at least give us a consistent name to find the volume in all these cases, although b) means we have to check the return value to find out what that consistent location is like we do when we don't pass in a device.
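
For comparison, the three options for the third question can be written as one small function. This is a sketch with made-up names; it assumes single-letter device names and that the guest will use the first free letter.

```python
import string


def resolve_device(requested, used, policy):
    """Apply option 'a', 'b' or 'c' to a requested device name."""
    prefix = requested[:-1]  # assumes a single trailing letter
    taken = set(used)
    expected = None
    for letter in string.ascii_lowercase:
        if prefix + letter not in taken:
            expected = prefix + letter
            break
    if expected is None:
        raise ValueError("no free device name under %s" % prefix)
    if policy == "a" or requested == expected:
        return requested  # a) let the attach go through as is
    if policy == "b":
        return expected  # b) modify the request and return it
    if policy == "c":
        raise ValueError("the next disk must be attached at %s" % expected)
    raise ValueError("unknown policy: %s" % policy)
```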

I hope everything is clear, but if more explanation is needed please let me know. If anyone has alternative/better proposals please tell me. The last question I think is the most important.

Vish


_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack [at] lists
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp


nathanael.i.burton at gmail

Aug 13, 2012, 9:16 PM

Post #2 of 13 (476 views)
Permalink
Re: [openstack-dev] [nova] Disk attachment consistency [In reply to]

On Aug 13, 2012 11:37 PM, "Vishvananda Ishaya" <vishvananda [at] gmail>
wrote:
> The second proposal I have is to use a feature of kvm attach and set the
device serial number. We can set it to the same value as the device
parameter. This means that a device attached to /dev/vdb may not always be
at /dev/vdb (with old kvm guests), but it will at least show up at
/dev/disk/by-id/virtio-vdb consistently.

What about setting the serial number to the volume_id? At least that way
you could be sure it was the volume you meant, especially in the case where
vdb in the guest ends up not being what you requested. What about other
hypervisors?
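
One thing to keep in mind if the volume_id is used: as far as I know the virtio-blk serial is limited to 20 bytes, so a 36-character UUID would show up truncated in the guest. A sketch of the resulting path, where the constant and the function name are assumptions for illustration:

```python
# Assumed limit on the serial string a virtio-blk guest can read.
VIRTIO_BLK_ID_BYTES = 20


def by_id_path_for_volume(volume_id):
    """Guest path expected when the serial is set to the volume id."""
    return "/dev/disk/by-id/virtio-" + volume_id[:VIRTIO_BLK_ID_BYTES]
```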

> (review coming soon)
>
> First question: should we return this magic path somewhere via the api?
It would be pretty easy to have horizon generate it but it might be nice to
have it show up. If we do return it, do we mangle the device to always show
the consistent one, or do we return it as another parameter? guest_device
perhaps?
>
> Second question: what should happen if someone specifies /dev/xvda
against a kvm cloud or /dev/vda against a xen cloud?
> I see two options:
> a) automatically convert it to the right value and return it
> b) fail with an error message
>
> Third question: what do we do if someone specifies a device value to a
kvm cloud that we know will not work. For example the vm has /dev/vda and
/dev/vdb and they request an attach at /dev/vdf. In this case we know that
it will likely show up at /dev/vdc. I see a few options here and none of
them are amazing:
>
> a) let the attach go through as is.
> advantages: it will allow scripts to work without having to manually
find the next device.
> disadvantages: the device name will never be correct in the guest
> b) automatically modify the request to attach at /dev/vdc and return it
> advantages: the device name will be correct some of the time (kvm
guests with newer kernels)
> disadvantages: sometimes the name is wrong anyway. The user may not
expect the device number to change
> c) fail and say, the next disk must be attached at /dev/vdc:
> advantages: explicit
> disadvantages: painful, incompatible, and the place we say to attach
may be incorrect anyway (kvm guests with old kernels)
>
> The second proposal earlier will at least give us a consistent name to
find the volume in all these cases, although b) means we have to check the
return value to find out what that consistent location is like we do when
we don't pass in a device.
>
> I hope everything is clear, but if more explanation is needed please let
me know. If anyone has alternative/better proposals please tell me. The
last question I think is the most important.
>
> Vish
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev [at] lists
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


john.griffith at solidfire

Aug 13, 2012, 9:48 PM

Post #3 of 13 (481 views)
Permalink
Re: [openstack-dev] [nova] Disk attachment consistency [In reply to]

On Mon, Aug 13, 2012 at 10:16 PM, Nathanael Burton
<nathanael.i.burton [at] gmail> wrote:
> On Aug 13, 2012 11:37 PM, "Vishvananda Ishaya" <vishvananda [at] gmail>
> wrote:
>> The second proposal I have is to use a feature of kvm attach and set the
>> device serial number. We can set it to the same value as the device
>> parameter. This means that a device attached to /dev/vdb may not always be
>> at /dev/vdb (with old kvm guests), but it will at least show up at
>> /dev/disk/by-id/virtio-vdb consistently.
>
> What about setting the serial number to the volume_id? At least that way you
> could be sure it was the volume you meant, especially in the case where vdb
> in the guest ends up not being what you requested. What about other
> hypervisors?
>
>> (review coming soon)
>>
>> First question: should we return this magic path somewhere via the api? It
>> would be pretty easy to have horizon generate it but it might be nice to
>> have it show up. If we do return it, do we mangle the device to always show
>> the consistent one, or do we return it as another parameter? guest_device
>> perhaps?
>>
>> Second question: what should happen if someone specifies /dev/xvda against
>> a kvm cloud or /dev/vda against a xen cloud?
>> I see two options:
>> a) automatically convert it to the right value and return it
>> b) fail with an error message
>>
>> Third question: what do we do if someone specifies a device value to a kvm
>> cloud that we know will not work. For example the vm has /dev/vda and
>> /dev/vdb and they request an attach at /dev/vdf. In this case we know that
>> it will likely show up at /dev/vdc. I see a few options here and none of
>> them are amazing:
>>
>> a) let the attach go through as is.
>> advantages: it will allow scripts to work without having to manually
>> find the next device.
>> disadvantages: the device name will never be correct in the guest
>> b) automatically modify the request to attach at /dev/vdc and return it
>> advantages: the device name will be correct some of the time (kvm guests
>> with newer kernels)
>> disadvantages: sometimes the name is wrong anyway. The user may not
>> expect the device number to change
>> c) fail and say, the next disk must be attached at /dev/vdc:
>> advantages: explicit
>> disadvantages: painful, incompatible, and the place we say to attach may
>> be incorrect anyway (kvm guests with old kernels)
>>
>> The second proposal earlier will at least give us a consistent name to
>> find the volume in all these cases, although b) means we have to check the
>> return value to find out what that consistent location is like we do when we
>> don't pass in a device.
>>
>> I hope everything is clear, but if more explanation is needed please let
>> me know. If anyone has alternative/better proposals please tell me. The last
>> question I think is the most important.
>>
>> Vish
>>
>>

I've wondered about mounting by device UUID as a long-term
solution, i.e. just mount using libvirt by /dev/disk/by-uuid
(don't even take a device parameter). Although I guess there are some
issues here.

As far as my input to your questions:
> What about setting the serial number to the volume_id? At least that way you
> could be sure it was the volume you meant, especially in the case where vdb
> in the guest ends up not being what you requested. What about other
> hypervisors?

+1

> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?

I think returning a distinct parameter would be best in this case.

> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
> I see two options:
> a) automatically convert it to the right value and return it
> b) fail with an error message

I would vote for option a (auto convert and return)

With respect to the third question:
> b) automatically modify the request to attach at /dev/vdc and return it

This seems like the best choice we have, as it balances compatibility
and reliability.
In my opinion, the only way to get better reliability in these
scenarios creates a mess compatibility-wise. I also think that adding a
field to volume show that includes the *real* path would alleviate some
of the problem here.

John



smoser at ubuntu

Aug 14, 2012, 9:00 AM

Post #4 of 13 (487 views)
Permalink
Re: [openstack-dev] [nova] Disk attachment consistency [In reply to]

On Mon, 13 Aug 2012, Vishvananda Ishaya wrote:

> Hey Everyone,
>
> Resulting Issues
> ----------------
>
> a) The device name only makes sense for linux. FreeBSD will select
> different device names, and windows doesn't even use device names. In
> addition xen uses /dev/xvda and kvm uses /dev/vda
>
> b) The device sent in kvm will not match where it actually shows up. We
> can consistently guess where it will show up if the guest kernel is >=
> 3.2, otherwise we are likely to be wrong, and it may change on a reboot
> anyway
>
> Long term solutions
> ------------------
>
> We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.
>
> The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen and kvm with guest kernel 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter).
>
> (review at https://review.openstack.org/#/c/10908/)
>
> The second proposal I have is to use a feature of kvm attach and set the
> device serial number. We can set it to the same value as the device
> parameter. This means that a device attached to /dev/vdb may not always
> be at /dev/vdb (with old kvm guests), but it will at least show up at
> /dev/disk/by-id/virtio-vdb consistently.

This is the right way to do this.
Expose 'serial-number' (or some other name for it) in the API, attach the
device with that serial number and get out of the way.

If the user doesn't provide you one, then create a unique one (at least
for that guest) and return it. For many use cases, a user attaches a
disk, ssh's in, finds the new disk, and uses it. Don't burden them with
coming up with a naming/uuid scheme for this parameter if they don't want
to.

Does xen have anything like this? Can you set the serial number of the
xen block device?

> (review coming soon)
>
> First question: should we return this magic path somewhere via the api?
> It would be pretty easy to have horizon generate it but it might be nice
> to have it show up. If we do return it, do we mangle the device to
> always show the consistent one, or do we return it as another parameter?
> guest_device perhaps?

From the api perspective, I think it makes most sense to call it what it
is. Don't make any promises or allusions to what the guest OS will do
with it.

> Second question: what should happen if someone specifies /dev/xvda
> against a kvm cloud or /dev/vda against a xen cloud?
> I see two options:
> a) automatically convert it to the right value and return it
> b) fail with an error message

In EC2, this fails with an error message. I think this is more correct.
The one issue here is that you really cannot, and should not attempt to,
guess or know what the guest has named devices. That's why we have
this problem in the first place.

So, I don't have strong feelings either way on this. It's broken to pass
'device=' and assume that means something.

> Third question: what do we do if someone specifies a device value to a
> kvm cloud that we know will not work. For example the vm has /dev/vda
> and /dev/vdb and they request an attach at /dev/vdf. In this case we
> know that it will likely show up at /dev/vdc. I see a few options here
> and none of them are amazing:
>
> a) let the attach go through as is.
> advantages: it will allow scripts to work without having to manually find the next device.
> disadvantages: the device name will never be correct in the guest
> b) automatically modify the request to attach at /dev/vdc and return it
> advantages: the device name will be correct some of the time (kvm guests with newer kernels)
> disadvantages: sometimes the name is wrong anyway. The user may not expect the device number to change
> c) fail and say, the next disk must be attached at /dev/vdc:
> advantages: explicit
> disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)

I vote 'a'.
Just be stupid. Play it simple, don't believe that you can understand what
device naming convention the guest kernel and udev have decided upon.

Here's an example. Do you know what happens if I attach 26 devices?
/dev/vd[a-z], then what? I'm pretty sure it goes to /dev/vd[a-z][a-z],
but it's not worth you trying to know that. That convention may not be
followed for xen block devices. At one point (maybe only with scsi
attached disks), letters were never re-used, so an attach, detach, attach
would end up going /dev/vdb, /dev/vdc, /dev/vdd.
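
For reference, the convention I believe linux follows today is the spreadsheet-column style (a..z, then aa..zz, and so on). Here is a sketch of it, with the caveat that it is an observed convention rather than a promised interface, and the function name is made up:

```python
def device_suffix(index):
    """Map a zero-based disk index to a letter suffix: 0 -> 'a', 26 -> 'aa'."""
    if index < 0:
        raise ValueError("index must be non-negative")
    suffix = ""
    index += 1  # switch to one-based for the bijective base-26 conversion
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        suffix = chr(ord("a") + remainder) + suffix
    return suffix
```

Even this much logic has to be duplicated and kept in sync with the guest, which is the point: nova guessing at it is fragile.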

There is no binary api that linux and udev promise on this, so anything
that is there now is subject to change at any future point. And as you
admitted, the past was different also.

Just don't bother. Let them specify whatever they want, and expose in
documentation the "right way" to do this.

> The second proposal earlier will at least give us a consistent name to
> find the volume in all these cases, although b) means we have to check
> the return value to find out what that consistent location is like we do
> when we don't pass in a device.
>
> I hope everything is clear, but if more explanation is needed please let
> me know. If anyone has alternative/better proposals please tell me. The
> last question I think is the most important.

I'm really looking forward to the long term solution.

Thanks for raising this.



hzwangpan at corp

Aug 14, 2012, 7:55 PM

Post #5 of 13 (487 views)
Permalink
Re: [nova] Disk attachment consistency [In reply to]

How about using the PCI address as the unique identifier of a target device within a VM?
The PCI address is generated by libvirt, and we can see it inside the VM with the command "ls -la /sys/block/".
It does not depend on the kernel version; I can see it on 2.6.32*.
When a user attaches a disk to a VM, we find a free target dev such as vdd (the user doesn't need to assign one), attach the disk there, and return the PCI address to the user.
The disk is then identified consistently, whether the user looks at it in horizon or inside the VM (with "ls -la /sys/block/").

libvirt:<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
ls -la /sys/block/:vda -> ../devices/pci-0000:00/0000:00:04.0/virtio1/block/vda
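
The mapping between the two lines above can be sketched as follows. The function names are made up for the example; it only reformats the attributes of the libvirt <address> element into the form that appears in the sysfs path.

```python
import xml.etree.ElementTree as ET


def sysfs_pci_address(address_xml):
    """Turn a libvirt <address type='pci' .../> into 'dddd:bb:ss.f'."""
    addr = ET.fromstring(address_xml)
    domain = int(addr.get("domain"), 16)
    bus = int(addr.get("bus"), 16)
    slot = int(addr.get("slot"), 16)
    function = int(addr.get("function"), 16)
    return "%04x:%02x:%02x.%x" % (domain, bus, slot, function)


def link_matches(address_xml, sysfs_link_target):
    """Check whether a /sys/block/<dev> link target goes through that address."""
    return ("/" + sysfs_pci_address(address_xml) + "/") in sysfs_link_target
```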

>Hey Everyone,
>
>Overview
>--------
>
>One of the things that we are striving for in nova is interface consistency, that is, we'd like someone to be able to use an openstack cloud without knowing or caring which hypervisor is running underneath. There is a nasty bit of inconsistency in the way that disks are hot attached to vms that shows through to the user. I've been debating ways to minimize this and I have some issues I need feedback on.
>
>Background
>----------
>
>There are three issues contributing to the bad user experience of attaching volumes.
>
>1) The api we present for attaching a volume to an instance has a parameter called device. This is presented as where to attach the disk in the guest.
>
>2) Xen picks minor device numbers on the host hypervisor side and the guest driver follows instructions
>
>3) KVM picks minor device numbers on the guest driver side and doesn't expose them to the host hypervisor side
>
>Resulting Issues
>----------------
>
>a) The device name only makes sense for linux. FreeBSD will select different device names, and windows doesn't even use device names. In addition xen uses /dev/xvda and kvm uses /dev/vda
>
>b) The device sent in kvm will not match where it actually shows up. We can consistently guess where it will show up if the guest kernel is >= 3.2, otherwise we are likely to be wrong, and it may change on a reboot anyway
>
>
>Long term solutions
>------------------
>
>We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.
>
>The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen and kvm with guest kernel 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter).
>
>(review at https://review.openstack.org/#/c/10908/)
>
>The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.
>
>(review coming soon)
>
>First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>
>Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
>I see two options:
>a) automatically convert it to the right value and return it
>b) fail with an error message
>
>Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work. For example the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
>
>a) let the attach go through as is.
> advantages: it will allow scripts to work without having to manually find the next device.
> disadvantages: the device name will never be correct in the guest
>b) automatically modify the request to attach at /dev/vdc and return it
> advantages: the device name will be correct some of the time (kvm guests with newer kernels)
> disadvantages: sometimes the name is wrong anyway. The user may not expect the device number to change
>c) fail and say, the next disk must be attached at /dev/vdc:
> advantages: explicit
> disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)
>
>The second proposal earlier will at least give us a consistent name to find the volume in all these cases, although b) means we have to check the return value to find out what that consistent location is like we do when we don't pass in a device.
>
>I hope everything is clear, but if more explanation is needed please let me know. If anyone has alternative/better proposals please tell me. The last question I think is the most important.
>
>Vish
>


vishvananda at gmail

Aug 14, 2012, 8:05 PM

Post #6 of 13 (486 views)
Permalink
Re: [nova] Disk attachment consistency [In reply to]

On Aug 14, 2012, at 7:55 PM, "Wangpan"<hzwangpan [at] corp> wrote:

> How about using the pci address as the UUID of target devices in one VM?
> the pci address is generated by libvirt and we can see it in VM by cmd "ls -la /sys/block/",
> and it has no dependency with the kernel version, I can see it in 2.6.32*
> when an user attached a disk to VM, we find a free target dev such as vdd(the user doesn't need to assign one) to attach and return the pci address to user,
> the disk is consistent when user see it on horizon and in VM(by cmd "ls -la /sys/block/").
>
> libvirt:<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
> ls -la /sys/block/:vda -> ../devices/pci-0000:00/0000:00:04.0/virtio1/block/vda

This is definitely another solution, although it seems less usable than the device serial number which can be an arbitrary string. If this works for xen though, that would be a plus.

Vish


hzwangpan at corp

Aug 14, 2012, 9:10 PM

Post #7 of 13 (473 views)
Permalink
Re: [nova] Disk attachment consistency [In reply to]

> This is definitely another solution, although it seems less usable than the device serial number which can be an arbitrary string. If this works for xen though, that would be a plus.


> Vish

I don't have a Xen hypervisor at hand, so can anybody else try it on Xen? Thanks.


cthier at gmail

Aug 14, 2012, 10:00 PM

Post #8 of 13 (480 views)
Permalink
Re: [openstack-dev] [nova] Disk attachment consistency [In reply to]

Hey Vish,

First, thanks for bringing this up for discussion. Coincidentally a
similar discussion had come up with our teams, but I had pushed it
aside at the time due to time constraints. It is a tricky problem to
solve generally for all hypervisors. See my comments inline:

On Mon, Aug 13, 2012 at 10:35 PM, Vishvananda Ishaya
<vishvananda [at] gmail> wrote:

>
> Long term solutions
> ------------------
>
> We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.

I totally agree with delaying the long term discussion, and look
forward to discussing these types of issues more at the summit.

>
> The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen and kvm with guest kernel 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter).

I could get behind this, and it was brought up by others in our group
as a more feasible short-term solution. I have a couple of concerns
with this, though. It may cause just as much confusion if the api can't
reliably determine which device a volume is attached to. I'm also
curious as to how well this will work with Xen, and hope some of the
citrix folks will chime in. From an api standpoint, I think it would
be fine to make it optional, as any client that is using the old api
contract will still work as intended.

>
> (review at https://review.openstack.org/#/c/10908/)
>
> The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.
>
> (review coming soon)
>
> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>
> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
> I see two options:
> a) automatically convert it to the right value and return it

I thought that it already did this, but I would have to go back and
double check. But it seemed like for xen at least, if you specify
/dev/vda, Nova would change it to /dev/xvda.

> b) fail with an error message
>

I don't have a strong opinion either way, as long as it is documented
correctly. I would suggest, though, that if it has been converting it
in the past, we continue to do so.

> Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work. For example the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
>
> a) let the attach go through as is.
> advantages: it will allow scripts to work without having to manually find the next device.
> disadvantages: the device name will never be correct in the guest
> b) automatically modify the request to attach at /dev/vdc and return it
> advantages: the device name will be correct some of the time (kvm guests with newer kernels)
> disadvantages: sometimes the name is wrong anyway. The user may not expect the device number to change
> c) fail and say, the next disk must be attached at /dev/vdc:
> advantages: explicit
> disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)

I would choose b, as it tries to get things in the correct state. c
is a bad idea as it would change the overall api behavior, and current
clients wouldn't expect it.

There are also a couple of other interesting tidbits, that may be
related, or at least be worthwhile to know while discussing this.

Xen Server 6.0 has a limit of 16 virtual devices per guest instance.
Experimentally it also expects those to be /dev/xvda - /dev/xvdp. You
can't for example attach a device to /dev/xvdq, even if there are no
other devices attached to the instance. If you attempt to do this,
the volume will go in to the attaching state, fail to attach, and then
fall back to the available state (This can be a bit confusing to new
users who try to do so). Does anyone know if there are similar
limitations for KVM?

Also if you attempt to attach a volume to a device that already
exists, it will silently fail and go back to available as well. In
this new scheme should it fail like that, or should it attempt to
attach it to the next available device, or error out? Perhaps a
better question here is: for this initial consistency, is the goal to
be consistent just when there is no device sent, or also when the
device is sent?

There was another idea, also brought up in our group. Would it be
possible to add a call that would return a list of available devices
to be attached to? In the case of Xen, it would return a list of
devices /dev/xvda-p that were not used. In the case of KVM, it would
just return the next available device name. At least in this case,
user interfaces and command line tools could use this to validate the
input the user provides (or auto generate the device to be used if the
user doesn't select a device).
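
To make the idea concrete, here is a rough sketch of what such a call might compute for Xen. The function names are hypothetical (nothing like this exists in nova as far as I know), and the /dev/xvda - /dev/xvdp range is just the experimental limit described above.

```python
import string

# Assumption: a Xen guest accepts only /dev/xvda through /dev/xvdp (16 slots),
# as observed experimentally above. Hypothetical helpers, not a nova API.
XEN_DEVICE_NAMES = ["/dev/xvd" + letter for letter in string.ascii_lowercase[:16]]


def available_xen_devices(in_use):
    """Return the allowed device names that are not already attached."""
    taken = set(in_use)
    return [name for name in XEN_DEVICE_NAMES if name not in taken]


def next_available_device(in_use):
    """Return the first free device name, or None when all 16 are taken."""
    free = available_xen_devices(in_use)
    return free[0] if free else None
```

A UI could call the first function to validate user input and the second to fill in a default when the user leaves the device blank.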

Thanks,

--
Chuck



John.Garbutt at citrix

Aug 15, 2012, 7:49 AM

Post #9 of 13 (476 views)
Re: [nova] Disk attachment consistency [In reply to]

You can see what XenAPI exposes here:
http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD

I think the only thing you can influence when plugging in the disk is the “userdevice” which is the disk position: 0,1,2…
When you have attached the disk you can find out the “device” name, such as /dev/xvda

I don't know about Xen with libvirt. But from the previous discussion it seems using the disk position would also work with KVM?

It seems disk position is also suitably OS agnostic, but I may have missed something there.

For backwards compatibility, we could make a "best effort" of translating the specified device name to a position. But as mentioned already, it seems fraught with danger in the general case.
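
A minimal sketch of that "best effort" translation might look like the following. The helper name and the set of accepted prefixes are assumptions for illustration, and it deliberately gives up (returns None) on anything it does not recognize.

```python
import re

# Hypothetical helper: maps names such as /dev/vdc, /dev/xvdc or /dev/sdc to a
# zero-based position. Only single-letter suffixes are handled in this sketch.
_DEVICE_RE = re.compile(r"^/dev/(?:xvd|vd|sd|hd)([a-z])$")


def device_name_to_position(device):
    """Return the disk position for a device name, or None if unrecognized."""
    match = _DEVICE_RE.match(device)
    if match is None:
        return None
    return ord(match.group(1)) - ord("a")
```

So /dev/xvda maps to position 0 and /dev/vdc to position 2, while names with partition numbers or two-letter suffixes are rejected rather than guessed at.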

I like the idea of an extra field to help report to the user what the likely device name is, if available. It would allow us to spot when the above "guessing" did not go the way we had hoped.

Related to this, there is a limitation in Xen (a limitation in the blkback/blkfront protocol, I am told) that means you can't cancel the operation of removing a disk from a VM. So if the disk is in use, the call may return with an exception saying the disk is in use, but as soon as the disk can be released, it will be removed anyway. Currently, nova isn't very good at telling the user that this is what is happening:
https://bugs.launchpad.net/nova/+bug/1030108

Cheers,
John

> From: openstack-bounces+john.garbutt=citrix.com [at] lists [mailto:openstack-bounces+john.garbutt=citrix.com [at] lists] On Behalf Of Wangpan
> Sent: 15 August 2012 5:11
> To: Vishvananda Ishaya
> Cc: openstack
> Subject: Re: [Openstack] [nova] Disk attachment consistency
> > This is definitely another solution, although it seems less usable than the device serial number which can be an arbitrary string. If this works for xen though, that would be a plus.
> > Vish
> I don't have a Xen hypervisor at hand, so can anybody else try it on Xen? Thanks


berrange at redhat

Aug 15, 2012, 8:16 AM

Post #10 of 13 (468 views)
Re: [nova] Disk attachment consistency [In reply to]

On Wed, Aug 15, 2012 at 03:49:45PM +0100, John Garbutt wrote:
> You can see what XenAPI exposes here:
> http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD
>
> I think the only thing you can influence when plugging in the disk is the “userdevice”
> which is the disk position: 0,1,2… When you have attached the disk you can find out
> the “device” name, such as /dev/xvda
>
> I don't know about Xen with libvirt. But from the previous discussion it seems using
> the disk position would also work with KVM?

No, this doesn't really work in general. Virtio disks get assigned SCSI device
numbers on a first-come, first-served basis. In the configuration you only have
control over the PCI device slot/function. You might assume that your disks
are probed in PCI device order, and thus get SCSI device numbers in that same
order. This is not really safe though. Furthermore, if the guest has any
other kinds of devices, e.g. perhaps they logged into an iSCSI target, then all
bets are off for what SCSI device you get assigned.

All the host can safely say is

- Virtio-blk disks get PCI address domain:bus:slot:function
- Virtio-SCSI disks get SCSI address A.B.C.D
- Disks have a unique serial string ZZZZZZZZZZZ

As a guest OS admin you can use this info to get reliable disk names
in /dev/disk/by-{path,id}.

If your disk has a filesystem on it, you can also get a unique UUID
and/or filesystem label, which means you can refer to the device
from /dev/disk/by-{uuid,label} too.
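
As an illustration of the by-id approach, here is a small sketch of how a guest-side script might look up a disk by its serial. The function name is hypothetical, and it assumes udev has created a link whose name ends with the serial (for virtio-blk disks the link is typically named virtio-<serial>).

```python
import os


def find_disk_by_serial(serial, by_id_dir="/dev/disk/by-id"):
    """Return the real device node whose by-id link name ends with the serial.

    Assumption: udev has created a symlink whose name ends with the serial,
    e.g. virtio-<serial> for a virtio-blk disk. Returns None if nothing matches.
    """
    try:
        names = sorted(os.listdir(by_id_dir))
    except OSError:
        # The directory is missing, e.g. on a guest without udev.
        return None
    for name in names:
        # Partition links end in -partN, so they never match the bare serial.
        if name.endswith(serial):
            return os.path.realpath(os.path.join(by_id_dir, name))
    return None
```

Note that virtio-blk serials are limited to 20 bytes, so a longer identifier would need to be truncated before the comparison.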

Relying on /dev/sdXXX is doomed to failure in the long term, even on
bare metal, and should be avoided wherever possible.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



John.Garbutt at citrix

Aug 15, 2012, 9:24 AM

Post #11 of 13 (475 views)
Re: [nova] Disk attachment consistency [In reply to]

> From: Daniel P. Berrange [mailto:berrange [at] redhat]
> On Wed, Aug 15, 2012 at 03:49:45PM +0100, John Garbutt wrote:
> > You can see what XenAPI exposes here:
> > http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD
> >
> > I think the only thing you can influence when plugging in the disk is the
> “userdevice”
> > which is the disk position: 0,1,2… When you have attached the disk
> > you can find out the “device” name, such as /dev/xvda
> >
> > I don't know about Xen with libvirt. But from the previous discussion
> > it seems using the disk position would also work with KVM?
>
> No, this doesn't really work in general. Virtio disks get assigned SCSI device
> numbers on a first-come, first-served basis. In the configuration you only have
> control over the PCI device slot/function. You might assume that your disks are
> probed in PCI device order, and thus get SCSI device numbers in that same order.
> This is not really safe though. Furthermore, if the guest has any other kinds of
> devices, e.g. perhaps they logged into an iSCSI target, then all bets are off for
> what SCSI device you get assigned.
>
> All the host can safely say is
>
> - Virtio-blk disks get PCI address domain:bus:slot:function
> - Virtio-SCSI disks get SCSI address A.B.C.D
> - Disks have a unique serial string ZZZZZZZZZZZ
>
> As a guest OS admin you can use this info to get reliable disk names in
> /dev/disk/by-{path,id}.
Doh, I guess my plan doesn't work then. After checking, apparently the same problem is also present with how Xen deals with exposing the "position" to the guest VM.

> Relying on /dev/sdXXX is doomed to failure in the long term, even on bare
> metal, and should be avoided wherever possible.
I agree, long term, this is not the way forward. I was just thinking in terms of backwards compatibility.

> If your disk has a filesystem on it, you can also get a unique UUID and/or
> filesystem label, which means you can refer to the device from /dev/disk/by-
> {uuid,label} too.
That sounds interesting for those attaching volumes that have a file system on them. Would it be reasonable to make this the best-practice way for users to discover where the volume has been attached?

Maybe we should simply leave nova to report where the disk has been attached? XenAPI driver can simply attach in the next available slot, and report back what it can about the device location. Sounds like the libvirt driver could do the same?

We could document the current device (or whatever we call it) as a driver specific "hint". Although this doesn't seem very satisfying.

Cheers,
John


vishvananda at gmail

Aug 15, 2012, 12:21 PM

Post #12 of 13 (474 views)
Re: [openstack-dev] [nova] Disk attachment consistency [In reply to]

On Aug 14, 2012, at 10:00 PM, Chuck Thier <cthier [at] gmail> wrote:

> <snip>
> I could get behind this, and it was brought up by others in our group
> as a more feasible short term solution. I have a couple of concerns
> with this. It may cause just as much confusion if the api can't
> reliably determine which device a volume is attached to. I'm also
> curious as to how well this will work with Xen, and hope some of the
> citrix folks will chime in. From an api standpoint, I think it would
> be fine to make it optional, as any client that is using old api
> contract will still work as intended.

This will continue to work as well as it currently does with Xen. We can reliably determine which devices are known about and pass back the next one; my patch under review below does that.
>>
>> (review at https://review.openstack.org/#/c/10908/)
>>
>> <snip>
>> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>>
>> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
>> I see two options:
>> a) automatically convert it to the right value and return it
>
> I thought that it already did this, but I would have to go back and
> double check. But it seemed like for xen at least, if you specify
> /dev/vda, Nova would change it to /dev/xvda.

That may be true. I believe for libvirt we just accept /dev/xvdc, since it is just interpreted as a label.
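
For reference, option a) could be as simple as the sketch below. This is not the actual nova code; the function name and the hypervisor-to-prefix table are assumptions based on the /dev/xvda (xen) and /dev/vda (kvm) convention discussed in this thread.

```python
import re

# Hypothetical mapping from hypervisor type to the disk prefix it uses,
# following the /dev/xvda (xen) and /dev/vda (kvm) convention in this thread.
PREFIX_BY_HYPERVISOR = {"xen": "xvd", "kvm": "vd"}


def normalize_device_name(device, hypervisor):
    """Rewrite the prefix of a requested device name to match the hypervisor."""
    match = re.match(r"^/dev/(?:xvd|vd|sd)([a-z]+)$", device)
    if match is None:
        raise ValueError("unrecognized device name: %r" % device)
    return "/dev/%s%s" % (PREFIX_BY_HYPERVISOR[hypervisor], match.group(1))
```

With this, /dev/vda requested against a xen cloud comes back as /dev/xvda, and /dev/xvdc against kvm comes back as /dev/vdc.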
>>
>
> <snip>
>
> Xen Server 6.0 has a limit of 16 virtual devices per guest instance.
> Experimentally it also expects those to be /dev/xvda - /dev/xvdp. You
> can't for example attach a device to /dev/xvdq, even if there are no
> other devices attached to the instance. If you attempt to do this,
> the volume will go into the attaching state, fail to attach, and then
> fall back to the available state (This can be a bit confusing to new
> users who try to do so). Does anyone know if there are similar
> limitations for KVM?

There are no limitations like this AFAIK. However, in KVM it is possible
to exhaust virtio minor device numbers by continually detaching and
attaching a device if the guest kernel is older than 3.2.

>
> Also, if you attempt to attach a volume to a device that already
> exists, it will silently fail and go back to available as well. In
> this new scheme, should it fail like that, attempt to attach to the
> next available device, or error out? Perhaps a better question is
> whether the goal of this initial consistency is to be consistent only
> when no device is sent, or also when a device is sent.

My review above addresses this by raising an error if you try to attach
to an existing device. I think this is preferable, i.e. only do the
auto-assign if it is specifically requested.
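
In rough terms the policy is as sketched below. This is an illustration only, not the code in the review; the function and exception names are hypothetical, and it only considers single-letter device names.

```python
import string


class DeviceInUse(Exception):
    """Raised when the requested device name is already attached."""


def pick_device(in_use, requested=None, prefix="/dev/vd"):
    """Validate an explicit device name, or auto-assign the next free one."""
    taken = set(in_use)
    if requested is not None:
        # An explicit request is never silently moved to another device.
        if requested in taken:
            raise DeviceInUse(requested)
        return requested
    # No device was sent, so hand back the first unused name.
    for letter in string.ascii_lowercase:
        candidate = prefix + letter
        if candidate not in taken:
            return candidate
    raise DeviceInUse("no free single-letter device names under " + prefix)
```

The point of the split is that a caller who names a device gets either that device or an error, while a caller who sends nothing gets whatever is next.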

>
> There was another idea, also brought up in our group. Would it be
> possible to add a call that would return a list of available devices
> to be attached to? In the case of Xen, it would return a list of
> devices /dev/xvda-p that were not used. In the case of KVM, it would
> just return the next available device name. At least in this case,
> user interfaces and command line tools could use this to validate the
> input the user provides (or auto generate the device to be used if the
> user doesn't select a device).

This is definitely a possibility, although it seems like a separate feature.

Vish




rich at annexia

Aug 19, 2012, 10:18 AM

Post #13 of 13 (439 views)
Re: [nova] Disk attachment consistency [In reply to]

On Mon, Aug 13, 2012 at 08:35:28PM -0700, Vishvananda Ishaya wrote:
> a) The device name only makes sense for linux. FreeBSD will select different device names, and windows doesn't even use device names. In addition xen uses /dev/xvda and kvm uses /dev/vda
>
> b) The device sent in kvm will not match where it actually shows up. We can consistently guess where it will show up if the guest kernel is >= 3.2, otherwise we are likely to be wrong, and it may change on a reboot anyway

Another option -- possibly not a good one, but I'm including it for
completeness -- is that you preformat the disk with a partition and a
filesystem, then use the UUID or LABEL of the filesystem to mount
it[*], i.e.:

mount UUID=abc-123-456 /data
mount LABEL=os-data-disk /data

Naturally libguestfs can make these preformatted disks (see the
'virt-format' tool, or do it through the API).
http://libguestfs.org/virt-format.1.html

Rich.

[*] I'm assuming here that Windows can find and mount filesystems
using the NTFS ID, but I don't know if that's true ...

--
Richard Jones
Red Hat

