Mailing List Archive: OpenStack: Dev

Cross-zone instance identifiers in EC2 API - Is it worth the effort?

 

 



jaypipes at gmail

Jul 6, 2011, 4:35 PM

Post #1 of 97
Cross-zone instance identifiers in EC2 API - Is it worth the effort?

Hey all,

Recently, Nova added support for multiple zones in the OpenStack API.
Using the nova-manage tool, you can get a list of instances in a
single zone or in multiple zones using the --recurse option. When just
querying a local zone's API server, the listed instance identifiers
will be integer IDs. When using the --recurse option, the listed
instance identifiers are UUIDs since they are globally unique.

Multiple zones is currently only supported in the OpenStack API, and
the question has been raised whether effort should be expended to get
parity in the EC2 API for this. The problem with the EC2 API is that
we do not have control over the instance identifiers -- they are an 8
character text string. We would still need to map the EC2 instance
identifier to some globally unique identifier (like a UUID), but the
solutions for how to do this aren't pretty (see
http://etherpad.openstack.org/EC2UUID).

Personally, I don't think it is worth expending the effort to get this
feature parity between the OpenStack API and the EC2 API, especially
considering we have little (or no) control over the EC2 API.

Thoughts?
-jay


thierry at openstack

Jul 7, 2011, 1:29 AM

Post #2 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

Jay Pipes wrote:
> Recently, Nova added support for multiple zones in the OpenStack API.
> Using the nova-manage tool, you can get a list of instances in a
> single zone or in multiple zones using the --recurse option. When just
> querying a local zone's API server, the listed instance identifiers
> will be integer IDs. When using the --recurse option, the listed
> instance identifiers are UUIDs since they are globally unique.
>
> Multiple zones is currently only supported in the OpenStack API, and
> the question has been raised whether effort should be expended to get
> parity in the EC2 API for this. The problem with the EC2 API is that
> we do not have control over the instance identifiers -- they are an 8
> character text string. We would still need to map the EC2 instance
> identifier to some globally unique identifier (like a UUID), but the
> solutions for how to do this aren't pretty (see
> http://etherpad.openstack.org/EC2UUID).

We have a spec covering some of this:
https://blueprints.launchpad.net/nova/+spec/ec2-id-compatibilty

It's "Essential", assigned to Soren and targeted to diablo-3. I'd love
to hear his thoughts on this :)

Vish set that spec to "Essential" because the current situation was
supposed to be completely broken, but from what you are all saying, it
appears not to be as broken as expected. Should its importance be
downgraded to "nice to have"?

--
Thierry Carrez (ttx)
Release Manager, OpenStack


soren at linux2go

Jul 7, 2011, 7:57 AM

Post #3 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/7 Jay Pipes <jaypipes [at] gmail>:
> Multiple zones is currently only supported in the OpenStack API, and
> the question has been raised whether effort should be expended to get
> parity in the EC2 API for this. The problem with the EC2 API is that
> we do not have control over the instance identifiers -- they are an 8
> character text string. We would still need to map the EC2 instance
> identifier to some globally unique identifier (like a UUID), but the
> solutions for how to do this aren't pretty (see
> http://etherpad.openstack.org/EC2UUID).

I don't particularly like the idea of maintaining a mapping table. If
such a method is to be guaranteed to function, we need something that
can reliably assign EC2-compatible ID's corresponding to the UUID's
without collisions. If we can come up with such a method anyway, why
use UUID's to begin with?

(For the record, I do believe we *can* come up with such a method. I
raised this point in one of the (several) discussions we've had on the
subject of ID's, but the ability to assign an unlimited amount of
non-colliding ID's perpetually autonomously took precedence)

I think the only sensible route is truncating (or by other means
reducing) the UUIDs to 32 bits (or revisit (again) the choice of
UUID's, of course).
With a 32 bit key space, a user with 10000 active objects of the same
type (so in the same key space) will have a 1% chance of having
colliding ID's. With ~78000 objects, we're at 50%. I guess that's a
risk we'll have to live with. The tricky part is figuring out how to
handle the collisions when they actually arise.
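
The 1% and 50% figures above follow from the usual birthday-problem calculation. A minimal Python sketch for checking them, assuming IDs are drawn uniformly at random from a 32-bit space (the function name is made up for illustration and is not anything in Nova):

```python
import math

KEY_SPACE = 2 ** 32  # number of distinct 8-hex-digit IDs


def collision_probability(n, key_space=KEY_SPACE):
    """Probability that at least two of n uniformly random IDs collide."""
    # log P(no collision) = sum over i in 0..n-1 of log(1 - i / key_space)
    log_no_collision = sum(math.log1p(-i / key_space) for i in range(n))
    return 1.0 - math.exp(log_no_collision)


for n in (10_000, 78_000):
    print(f"{n:>6} objects: {collision_probability(n):.1%}")
```

Under that assumption the results land close to the numbers quoted: a little over 1% for 10000 objects and just over 50% for 78000.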

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


tve at rightscale

Jul 7, 2011, 8:05 AM

Post #4 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

FYI, there's nothing in the EC2 API that limits instance identifiers (or
other IDs) to 32 bits. The IDs are strings, so it's trivial for EC2 to
add another digit when running out of 32-bit IDs.
TvE

On 7/7/2011 7:57 AM, Soren Hansen wrote:
> 2011/7/7 Jay Pipes <jaypipes [at] gmail>:
>> Multiple zones is currently only supported in the OpenStack API, and
>> the question has been raised whether effort should be expended to get
>> parity in the EC2 API for this. The problem with the EC2 API is that
>> we do not have control over the instance identifiers -- they are an 8
>> character text string. We would still need to map the EC2 instance
>> identifier to some globally unique identifier (like a UUID), but the
>> solutions for how to do this aren't pretty (see
>> http://etherpad.openstack.org/EC2UUID).
> I don't particularly like the idea of maintaining a mapping table. If
> such a method is to be guaranteed to function, we need something that
> can reliably assign EC2-compatible ID's corresponding to the UUID's
> without collisions. If we can come up with such a method anyway, why
> use UUID's to begin with?
>
> (For the record, I do believe we *can* come up with such a method. I
> raised this point in one of the (several) discussions we've had on the
> subject of ID's, but the ability to assign an unlimited amount of
> non-colliding ID's perpetually autonomously took precedence)
>
> I think the only sensible route is truncating (or by other means
> reducing) the UUIDs to 32 bits (or revisit (again) the choice of
> UUID's, of course).
> With a 32 bit key space, a user with 10000 active objects of the same
> type (so in the same key space) will have a 1% chance of having
> colliding ID's. With ~78000 objects, we're at 50%. I guess that's a
> risk we'll have to live with. The tricky part is figuring out how to
> handle the collisions when they actually arise.
>


vishvananda at gmail

Jul 7, 2011, 8:09 AM

Post #5 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

Yes, the issue isn't the actual API definition; the issue is that most EC2 clients expect the exact format that Amazon currently uses: ami-<8 hex chars>.

I think we should move toward ec2 being a compatibility layer that is translated into the os api. This compatibility layer would sit at the top level zone and could maintain its own database for conversion of ids, management of secret and access keys, etc.
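
As a rough illustration of the "own database for conversion of ids" idea, here is a minimal in-memory Python sketch. Everything in it is an assumption made for illustration: the class name, the method names, and the choice of a sequential counter are hypothetical, and a real layer would need persistent, shared storage rather than two dicts.

```python
import itertools
import uuid


class Ec2IdMap:
    """Toy in-memory mapping between EC2-style IDs and UUIDs (hypothetical)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._ec2_to_uuid = {}
        self._uuid_to_ec2 = {}

    def ec2_id_for(self, instance_uuid, prefix="i"):
        """Return the EC2-style ID for a UUID, allocating one on first use."""
        if instance_uuid not in self._uuid_to_ec2:
            # Sequential allocation means no collisions, at the cost of
            # keeping this table around. Overflow past 2**32 is not handled.
            ec2_id = "%s-%08x" % (prefix, next(self._counter))
            self._uuid_to_ec2[instance_uuid] = ec2_id
            self._ec2_to_uuid[ec2_id] = instance_uuid
        return self._uuid_to_ec2[instance_uuid]

    def uuid_for(self, ec2_id):
        """Look up the UUID behind an EC2-style ID (KeyError if unknown)."""
        return self._ec2_to_uuid[ec2_id]


id_map = Ec2IdMap()
some_uuid = uuid.uuid4()
ec2_id = id_map.ec2_id_for(some_uuid)
print(ec2_id, "->", id_map.uuid_for(ec2_id))
```

The IDs handed out this way fit the ami-<8 hex chars> / i-<8 hex chars> shape that clients expect, but only the layer holding the table can translate them back.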



On Jul 7, 2011, at 8:05 AM, Thorsten von Eicken wrote:

> FYI, there's nothing in the EC2 API that limits instance identifiers (or
> other IDs) to 32 bits. The IDs are strings, so it's trivial for EC2 to
> add another digit when running out of 32-bit IDs.
> TvE
>
> On 7/7/2011 7:57 AM, Soren Hansen wrote:
>> 2011/7/7 Jay Pipes <jaypipes [at] gmail>:
>>> Multiple zones is currently only supported in the OpenStack API, and
>>> the question has been raised whether effort should be expended to get
>>> parity in the EC2 API for this. The problem with the EC2 API is that
>>> we do not have control over the instance identifiers -- they are an 8
>>> character text string. We would still need to map the EC2 instance
>>> identifier to some globally unique identifier (like a UUID), but the
>>> solutions for how to do this aren't pretty (see
>>> http://etherpad.openstack.org/EC2UUID).
>> I don't particularly like the idea of maintaining a mapping table. If
>> such a method is to be guaranteed to function, we need something that
>> can reliably assign EC2-compatible ID's corresponding to the UUID's
>> without collisions. If we can come up with such a method anyway, why
>> use UUID's to begin with?
>>
>> (For the record, I do believe we *can* come up with such a method. I
>> raised this point in one of the (several) discussions we've had on the
>> subject of ID's, but the ability to assign an unlimited amount of
>> non-colliding ID's perpetually autonomously took precedence)
>>
>> I think the only sensible route is truncating (or by other means
>> reducing) the UUIDs to 32 bits (or revisit (again) the choice of
>> UUID's, of course).
>> With a 32 bit key space, a user with 10000 active objects of the same
>> type (so in the same key space) will have a 1% chance of having
>> colliding ID's. With ~78000 objects, we're at 50%. I guess that's a
>> risk we'll have to live with. The tricky part is figuring out how to
>> handle the collisions when they actually arise.
>>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack [at] lists
> Unsubscribe : https://launchpad.net/~openstack
> More help : https://help.launchpad.net/ListHelp


trey.morris at rackspace

Jul 7, 2011, 8:46 AM

Post #6 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

If I had to choose between dropping or truncating UUIDs and failing feature
parity with the EC2 API, I'd go with the latter. Pros and cons for UUIDs
have already been discussed and decisions made. The EC2 API shouldn't get in
the way. A translation layer sitting between the EC2 and OS APIs would
solve this issue without revisiting the UUID argument.



On Thu, Jul 7, 2011 at 9:57 AM, Soren Hansen <soren [at] linux2go> wrote:

> 2011/7/7 Jay Pipes <jaypipes [at] gmail>:
> > Multiple zones is currently only supported in the OpenStack API, and
> > the question has been raised whether effort should be expended to get
> > parity in the EC2 API for this. The problem with the EC2 API is that
> > we do not have control over the instance identifiers -- they are an 8
> > character text string. We would still need to map the EC2 instance
> > identifier to some globally unique identifier (like a UUID), but the
> > solutions for how to do this aren't pretty (see
> > http://etherpad.openstack.org/EC2UUID).
>
> I don't particularly like the idea of maintaining a mapping table. If
> such a method is to be guaranteed to function, we need something that
> can reliably assign EC2-compatible ID's corresponding to the UUID's
> without collisions. If we can come up with such a method anyway, why
> use UUID's to begin with?
>
> (For the record, I do believe we *can* come up with such a method. I
> raised this point in one of the (several) discussions we've had on the
> subject of ID's, but the ability to assign an unlimited amount of
> non-colliding ID's perpetually autonomously took precedence)
>
> I think the only sensible route is truncating (or by other means
> reducing) the UUIDs to 32 bits (or revisit (again) the choice of
> UUID's, of course).
> With a 32 bit key space, a user with 10000 active objects of the same
> type (so in the same key space) will have a 1% chance of having
> colliding ID's. With ~78000 objects, we're at 50%. I guess that's a
> risk we'll have to live with. The tricky part is figuring out how to
> handle the collisions when they actually arise.
>
> --
> Soren Hansen | http://linux2go.dk/
> Ubuntu Developer | http://www.ubuntu.com/
> OpenStack Developer | http://www.openstack.org/
>


soren at linux2go

Jul 7, 2011, 8:53 AM

Post #7 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/7 Vishvananda Ishaya <vishvananda [at] gmail>:
> I think we should move toward ec2 being a compatibility layer that is translated into the os api.  This compatibility layer would sit at the top level zone and could maintain its own database for conversion of ids, management of secret and access keys, etc.

With all due respect, I think this is a terrible idea. From a
technical perspective, a backend that is flexible enough to support
both the EC2 and the OpenStack (and OCCI and vCloud and whatever else)
APIs without translation layers is a good thing and helps keep the
separation clean.

From an adoption perspective, like it or not, EC2 is popular. Lots of
people use it and are comfortable and familiar with its API. I don't
see what we'll win by so thoroughly reducing the EC2 API to a second
class citizen in Nova.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


tve at rightscale

Jul 7, 2011, 9:08 AM

Post #8 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

On 7/7/2011 8:53 AM, Soren Hansen wrote:
> 2011/7/7 Vishvananda Ishaya <vishvananda [at] gmail>:
>> I think we should move toward ec2 being a compatibility layer that is translated into the os api. This compatibility layer would sit at the top level zone and could maintain its own database for conversion of ids, management of secret and access keys, etc.
> With all due respect, I think this is a terrible idea. From a
> technical perspective, a backend that is flexible enough to support
> both the EC2 and the OpenStack (and OCCI and vCloud and whatever else)
> APIs without translation layers is a good thing and helps keep the
> separation clean.
>
> From an adoption perspective, like it or not, EC2 is popular. Lots of
> people use it and are comfortable and familiar with its API. I don't
> see what we'll win by so thoroughly reducing the EC2 API to a second
> class citizen in Nova.
I totally agree. I understand the issues around "an API we don't
control" but the EC2 API has stood the test of time and a LOT of
evolution. I have yet to work with a cloud API that comes anywhere close
to it. There's a lot of benefit in treating it as a first class citizen
in the OpenStack project (or at least as a 1.5th class citizen). Yes, it
will cost some extra discussions, but that's worth it. The native API
may one day make the EC2 one uninteresting, but that day is not yet on
the horizon. I hope I'm not offending anyone, just trying to look at the
reality. And yes, we are targeting the native API in our OpenStack support.
Thorsten


trey.morris at rackspace

Jul 7, 2011, 9:37 AM

Post #9 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

The goal isn't for ec2 api to be a "second class citizen", but to keep it
from being a limiting factor since we don't have control over it. How does
the compatibility layer make it second class?

On Thu, Jul 7, 2011 at 10:53 AM, Soren Hansen <soren [at] linux2go> wrote:

> 2011/7/7 Vishvananda Ishaya <vishvananda [at] gmail>:
> > I think we should move toward ec2 being a compatibility layer that is
> translated into the os api. This compatibility layer would sit at the top
> level zone and could maintain its own database for conversion of ids,
> management of secret and access keys, etc.
>
> With all due respect, I think this is a terrible idea. From a
> technical perspective, a backend that is flexible enough to support
> both the EC2 and the OpenStack (and OCCI and vCloud and whatever else)
> APIs without translation layers is a good thing and helps keep the
> separation clean.
>
> From an adoption perspective, like it or not, EC2 is popular. Lots of
> people use it and are comfortable and familiar with its API. I don't
> see what we'll win by so thoroughly reducing the EC2 API to a second
> class citizen in Nova.
>
> --
> Soren Hansen | http://linux2go.dk/
> Ubuntu Developer | http://www.ubuntu.com/
> OpenStack Developer | http://www.openstack.org/
>


soren at linux2go

Jul 7, 2011, 11:35 AM

Post #10 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/7 Trey Morris <trey.morris [at] rackspace>:
> The goal isn't for ec2 api to be a "second class citizen", but to keep it
> from being a limiting factor since we don't have control over it. How does
> the compatibility layer make it second class?

Well, for one thing because you'll be limiting the EC2 API to what the
OpenStack API is capable of doing and/or expressing.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


vishvananda at gmail

Jul 7, 2011, 12:15 PM

Post #11 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

On Jul 7, 2011, at 11:35 AM, Soren Hansen wrote:

> 2011/7/7 Trey Morris <trey.morris [at] rackspace>:
>> The goal isn't for ec2 api to be a "second class citizen", but to keep it
>> from being a limiting factor since we don't have control over it. How does
>> the compatibility layer make it second class?
>
> Well, for one thing because you'll be limiting the EC2 API to what the
> OpenStack API is capable of doing and/or expressing.
>

Actually, we can add any code that is necessary to make the ec2 api work properly as extensions to the os api. My main reason for suggesting the switch over to a compatibility layer is so that we don't have multiple code paths into the heart of nova. This has been a pain point already, and it will only get more painful as we move forward. If all the stuff that makes ec2 compatibility work is in one place at the top layer, it is easier to maintain.

If we need to maintain entirely disparate code paths from the api all the way down to the hypervisor, things will continue to be very fragile. It means we have to make the ec2 api work across zones. It actually doesn't matter to me particularly if we are speaking to the rest api or some sort of internal api, but we are currently exposing the os api for multiple zones so it seems like we are moving in that direction.

Vish


Ewan.Mellor at eu

Jul 7, 2011, 5:49 PM

Post #12 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

> From Thorsten von Eicken
>
> FYI, there's nothing in the EC2 API that limits instance identifiers
> (or other IDs) to 32 bits. The IDs are strings, so it's trivial for EC2 to
> add another digit when running out of 32-bit IDs.

If that's the case (and I believe you, that's always how I assumed it would be) why don't we just make the EC2 ID be ami-<uuid>?

Ewan.


george.reese at enstratus

Jul 8, 2011, 5:11 AM

Post #13 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

I would just like to re-iterate that I think the entire UUID approach is flawed and issues like this are one of the key reasons why.

-George

On Jul 8, 2011, at 7:02 AM, Soren Hansen wrote:

> 2011/7/8 Ed Leafe <ed.leafe [at] rackspace>:
>> On Jul 7, 2011, at 11:46 AM, Trey Morris wrote:
>>> If I had to choose between dropping or truncating UUIDs and failing feature parity with the ec2 api, i'd go with the latter. Pros and cons for UUIDs have already been discussed and decisions made. The EC2 api shouldn't get in the way. A translation layer to sit in between the EC2 and OS APIs would solve this issue without revisiting the UUID argument.
>> The code to use the first 8 chars of the UUID in the ec2 id was created and working well, but discarded in favor of a more limited approach.
>
> Why?
>
>> The only issue was the increased likelihood of a duplicate ec2 id, as we'd be limited to only 4 billion of them or so. I thought that it would be fairly straightforward to add code to detect such dupes, and re-generate a new UUID for the instance in that event.
>
> How would that work? The frontend gets a request for a new instance,
> it sends it on to the backend that starts handling the request and
> sends back the ID. What then? Either the backend would need to be
> changed to wait for a "yes, I accept this UUID" from the frontend,
> which is unfortunate, or the frontend would have to get the response
> back, go "hm, this collides with an existing ID. Cancel the request,
> and start over." In the latter case, a cost has already been incurred
> by starting up and immediately shutting down the instance.
>
> Also, if instances had already been created through the OpenStack API
> and were being queried through the EC2 API, there's no guarantee that
> colliding ID's don't already exist. In that case, you need to figure
> how to try to make sure that a request asking for that particular ID
> always gives a response corresponding to the same actual instance.
> This sounds crappy for a distributed system. If you want to perform
> the collision check even for instances created through the OpenStack
> API, then the reasons for choosing UUID's to begin with are moot.
>
> --
> Soren Hansen | http://linux2go.dk/
> Ubuntu Developer | http://www.ubuntu.com/
> OpenStack Developer | http://www.openstack.org/
>

--
George Reese - Chief Technology Officer, enStratus
e: george.reese [at] enstratus t: @GeorgeReese p: +1.207.956.0217 f: +1.612.338.5041
enStratus: Governance for Public, Private, and Hybrid Clouds - @enStratus - http://www.enstratus.com
To schedule a meeting with me: http://tungle.me/GeorgeReese


soren at linux2go

Jul 8, 2011, 5:13 AM

Post #14 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/8 Ewan Mellor <Ewan.Mellor [at] eu>:
>> From Thorsten von Eicken
>> FYI, there's nothing in the EC2 API that limits instance identifiers
>> (or other IDs) to 32 bits. The IDs are strings, so it's trivial for EC2 to
>> add another digit when running out of 32-bit IDs.
> If that's the case (and I believe you, that's always how I assumed it would be) why don't we just make the EC2 ID be ami-<uuid>?

Vish already explained it in this thread, but the problem is that
while the spec just says it's a string, most clients have much
stricter expectations.

ElasticFox (which Amazon develops themselves) requires e.g. an AMI ID
to match ^ami-[0-9a-f]{8}$. I also expect most tools that use a
database to store these IDs use a VARCHAR(XXX) (or perhaps even a
CHAR(XXX)) column. Having a max length at all is wrong according to the
spec (which says it's just a string).
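
To make the mismatch concrete, here is a small Python sketch that applies the pattern quoted above to both ID styles. The helper name and the sample short ID are made up for illustration; only the regular expression comes from this thread.

```python
import re
import uuid

# Pattern quoted in this thread as what ElasticFox expects for an AMI ID.
AMI_ID_PATTERN = re.compile(r"^ami-[0-9a-f]{8}$")


def looks_like_amazon_ami_id(candidate):
    """Return True if candidate matches the strict ami-<8 hex chars> form."""
    return AMI_ID_PATTERN.match(candidate) is not None


short_id = "ami-1a2b3c4d"             # made-up ID in Amazon's current format
uuid_id = "ami-" + str(uuid.uuid4())  # the "ami-<uuid>" proposal

print(short_id, looks_like_amazon_ami_id(short_id))
print(uuid_id, looks_like_amazon_ami_id(uuid_id))
```

The short ID passes and the ami-<uuid> form does not, since a UUID string is 36 characters long and contains hyphens. A client validating this way would reject the longer ID even though the spec only promises a string.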

The whole point of supporting the EC2 API is to support people's
existing tools and whatnot. If we follow the spec, but the clients
don't work, we're doing it wrong.


--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


Ewan.Mellor at eu

Jul 8, 2011, 5:20 AM

Post #15 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

> [Snip]
>
> The whole point of supporting the EC2 API is to support people's
> existing tools and whatnot. If we follow the spec, but the clients
> don't work, we're doing it wrong.

True enough. However, in the case where we've got a demonstrated divergence from the spec, we should report that against the client. I agree that we want to support pre-existing tools, but it's less clear that we want to support _buggy_ tools.

If Amazon turn out to be resistant to fixing that problem, then we'll obviously have to accept that and move on, but we should at least give them a chance to respond on that.

Cheers,

Ewan.


soren at linux2go

Jul 8, 2011, 5:39 AM

Post #16 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/8 Ewan Mellor <Ewan.Mellor [at] eu>:
>> The whole point of supporting the EC2 API is to support people's
>> existing tools and whatnot. If we follow the spec, but the clients
>> don't work, we're doing it wrong.
> True enough.  However, in the case where we've got a demonstrated divergence from the spec, we should report that against the client.  I agree that we want to support pre-existing tools, but it's less clear that we want to support _buggy_ tools.

We do. We have to. We have no way to know what sort of clients people
are using. We can only attempt to check the open source ones, but
there's likely loads of other ones that people have built themselves
and never shared. Not only do people have to be able, motivated and
allowed to change their tools to work with OpenStack, they also need
to *realise* that this is something that needs to happen. We can't
assume the symptoms they'll experience even give as much as a hint
that the IDs they're getting back are too long. They may just get a
general error of some sort.

If we a) expect people to consume the EC2 API we expose, and (more
importantly) b) expect ISPs to offer this API to their customers, it
needs to be as close to "just another EC2 region" as possible.

> If Amazon turn out to be resistant to fixing that problem, then we'll obviously have to accept that and move on, but we should at least give them a chance to respond on that.

Amazon is not the problem. At least not the only problem. I'm not even
going to begin to guess how many different tools exist to talk to the
EC2 API.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


soren at linux2go

Jul 8, 2011, 6:06 AM

Post #17 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

2011/7/7 Vishvananda Ishaya <vishvananda [at] gmail>:
> On Jul 7, 2011, at 11:35 AM, Soren Hansen wrote:
>> 2011/7/7 Trey Morris <trey.morris [at] rackspace>:
>>> The goal isn't for ec2 api to be a "second class citizen", but to keep it
>>> from being a limiting factor since we don't have control over it. How does
>>> the compatibility layer make it second class?
>> Well, for one thing because you'll be limiting the EC2 API to what the
>> OpenStack API is capable of doing and/or expressing.
> Actually, we can add any code that is necessary to make the ec2 api work properly as extensions to the os api.

Maybe it's functionality that we don't want anywhere near the OpenStack API?

> My main reason for suggesting the switch over to a compatibility layer is so that we don't have multiple code paths into the heart of nova.

Both frontend types will be speaking to the same backend API. That's
where we translate the user's requests into actions that the backend
can perform. This sounds like a very reasonable split to me. I don't
understand why we'd change that. Building a backend that is reasonably
frontend-API-agnostic is a good thing. It keeps us honest and helps
maintain a clear separation of the various components. Eric did a lot
of excellent work to make this split cleaner. I'd hate it if we
blurred the lines between frontend and backend again.

> This has been a pain point already, and it will only get more painful as we move forward.

IMO, reworking things to achieve looser coupling is tricky and
painful, but time well spent.

> If all the stuff that makes ec2 compatibility work is in one place at the top layer, it is easier to maintain.

I'll see your unsubstantiated claim, and raise you another one: It'll
simply make other parts harder to maintain.

> If we need to maintain entirely disparate code paths from the api all the way down to the hypervisor, things will continue to be very fragile.

It's not all the way down to the hypervisor. Not even close. The (currently)
two different API frontends will interact with the relevant services
(network, compute or scheduler), all of which should have a
well-defined, flexible API. So far, these API's have been extended on
a very ad-hoc basis to support some new functionality in either of the
frontend API's. Just to be clear: I'm guilty of this too! What we
should be doing is defining sensible, coherent API's for these backend
services, documenting their interfaces, treating them with respect,
staying aware of when and how they change, and attempting to maintain
backwards/forwards compatibility. When we start to think about how to upgrade Nova
deployments, this will be very helpful. Having different things
talking to the same API's keeps us more honest in this regard.

So, I feel the exact opposite. A lot of the current fragility of Nova
stems from the fact that we've focused a lot on specific features and
not very much on these grander, more holistic things. When something
is tightly coupled, and you try to bolt something else onto it, yes,
it becomes fragile. Not because you're bolting things on, but because
you didn't consider early enough that other things might be bolted on
further down the road.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/


jaypipes at gmail

Jul 8, 2011, 6:58 AM

Post #18 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

On Fri, Jul 8, 2011 at 9:54 AM, Soren Hansen <soren [at] linux2go> wrote:
> 2011/7/8 Ed Leafe <ed.leafe [at] rackspace>:
>>        No, it would work more like: a new instance is requested, and the host selected. A candidate UUID would be generated and checked for "first 8" uniqueness (I had already added a db method to locate by the first 8 chars of a UUID across nested zones). When an acceptable UUID was generated, it would be passed to the selected host along with the create request. The instance would only have to be created once.
>
> If we're doing collision checking anyway, using UUID's to begin with
> is pointless. We're effectively reduced to a 32 bit key space and,
> even worse, we're not being smart enough about it to actually do
> without the extra DB roundtrip to check for collisions.
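
For reference, the flow Ed describes in the quote above (generate a candidate UUID, check its first 8 hex characters for uniqueness, retry on a clash, and only then send the create request) might look roughly like this Python sketch. The function names are hypothetical, and prefix_in_use is a stand-in for the "first 8" DB lookup across nested zones; this is not the code that was written for Nova.

```python
import uuid


def first8(instance_uuid):
    """First 8 hex characters of a UUID, i.e. its top 32 bits."""
    return instance_uuid.hex[:8]


def generate_instance_uuid(prefix_in_use, new_uuid=uuid.uuid4, max_attempts=10):
    """Return a UUID whose first 8 hex chars are not already in use.

    prefix_in_use is a hypothetical callable standing in for the DB lookup.
    """
    for _ in range(max_attempts):
        candidate = new_uuid()
        if not prefix_in_use(first8(candidate)):
            return candidate
    raise RuntimeError("could not find a free 8-character prefix")


# Toy stand-in for the database: a set of prefixes already taken.
taken = set()
for _ in range(5):
    u = generate_instance_uuid(lambda p: p in taken)
    taken.add(first8(u))
    print("i-" + first8(u), u)
```

Note that every create pays for at least one lookup before the request goes out, which is the extra DB roundtrip Soren objects to, and the sketch does nothing about two requests racing between the check and the insert.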

That may be true, but only for the EC2 API, not the OpenStack API,
right? The question remains from my original post: do we really care
enough about this?

Put another way: do we want to expend more resources on an API we don't control?

-jay


sandy.walsh at RACKSPACE

Jul 8, 2011, 7:36 AM

Post #19 of 97
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort?

+1 to Soren's argument that ec2 is the 1000lb gorilla and should be central to nova. We definitely need to support it with as close to 100% compatibility as we can.

Sounds like the only option is to "embrace and extend" it. Do everything it can do, and layer on what we need provided it doesn't break the core EC2 commands. If the customer wants pure EC2, they'll have to live with the limitations.

That said, is this the proposal I'm hearing? ...

Since our separation is done at the service.api layer, the service.api's get pulled in two directions with each change in ec2/os. The idea is to have ec2 be a translation layer to os api? Preventing ec2 api from calling [service].api directly?

So, instead of

EC2 Client   OS Client
  |             |
EC2 API        OS API
   \           /
  [Service] API

We'd be shifting to:

EC2 Client ---- EC2 API
                   |
OS Client ------ OS API
                   |
             [Service] API

I need to think more about this, but at first blush, it doesn't seem like such a thing? At some point the abstraction layer needs to be locked down doesn't it?
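
A minimal sketch of the second arrangement, with the EC2 layer acting purely as a translator over the OS API. The class names, method names, state mapping and ID format are all assumptions made for illustration; none of this is taken from the Nova tree.

```python
class OSAPI:
    """Hypothetical stand-in for the OpenStack API layer."""

    def __init__(self):
        self._servers = {}
        self._next_id = 1

    def create_server(self, name, image_ref, flavor_ref):
        server_id = self._next_id
        self._next_id += 1
        server = {"id": server_id, "name": name, "imageRef": image_ref,
                  "flavorRef": flavor_ref, "status": "BUILD"}
        self._servers[server_id] = server
        return server


class EC2API:
    """Translates EC2-style calls into OS API calls.

    It never calls a [service].api directly; everything goes through
    the OS API object it was given.
    """

    # Illustrative status mapping only.
    STATE_MAP = {"BUILD": "pending", "ACTIVE": "running"}

    def __init__(self, os_api):
        self._os_api = os_api

    def run_instances(self, image_id, instance_type):
        server = self._os_api.create_server(
            name="ec2-" + image_id,
            image_ref=image_id,
            flavor_ref=instance_type)
        return {
            "instanceId": "i-%08x" % server["id"],
            "imageId": server["imageRef"],
            "instanceType": server["flavorRef"],
            "instanceState": self.STATE_MAP.get(server["status"], "unknown"),
        }
```

The trade-off is visible even at this size: the EC2 layer can only express what the OS API exposes, so any EC2 behaviour without an OS API equivalent has to be emulated or dropped.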

-S

> Put another way: do we want to expend more resources on an API we don't control?
>
> -jay
This email may include confidential information. If you received it in error, please delete it.


sandy.walsh at RACKSPACE

Jul 8, 2011, 7:53 AM

Post #20 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

Ugh ...

"... but at first blush, it doesn't seem like such a *bad* thing?"


jorge.williams at rackspace

Jul 8, 2011, 8:01 AM

Post #21 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

I'm with Ewan on this point: one of the nice things about having a contract is that it clearly designates what's a bug and what isn't. If the spec says the ID is a string and the client assumes it's an integer, then the client is at fault. End of story. It would be a different issue if the contract didn't specify what an ID was or if the contract only allowed for integers.
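
To make the distinction concrete, here is a small sketch of the two client behaviours. Both functions are hypothetical and are not drawn from any real EC2 or OpenStack client library.

```python
def parse_instance_id_strict(raw_id):
    """Buggy client: assumes the ID is always an integer."""
    return int(raw_id)


def parse_instance_id_opaque(raw_id):
    """Spec-following client: treats the ID as an opaque string."""
    if not isinstance(raw_id, str) or not raw_id:
        raise ValueError("instance ID must be a non-empty string")
    return raw_id
```

Given a UUID-style ID, the opaque version passes it through unchanged while the strict version raises `ValueError`, which is the kind of failure that would be reported against the client rather than the server.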

It's bad enough that we are spending resources trying to support an API which isn't open and which we don't control, now on top of that we want to support buggy clients that don't follow the spec? Where do we draw the line? I'm all for being flexible and forgiving in what we expect from clients, but I don't think we should be making serious engineering decisions based on the fact that a client developer made a bad assumption or didn't read the spec.

If we know that there are clients out there that make these assumptions, then contact the folks that maintain the client and ask them to adjust their code. If they give you grief, point to the contract and that should settle the issue. It's in their interest to support as many deployments of the API as possible. It's not our responsibility to support their buggy code.

Though I have some reservations about it, I'm okay offering some support for the EC2 contract. What I'm not okay with is being in the business of reverse engineering Amazon's EC2 implementation. Those are two very different things and I think the latter is orders of magnitude more difficult. In fact I would argue that reverse engineering EC2 is a project unto itself. That means that when EC2 has a bug, we need to replicate it, etc. That's almost impossible and it makes it really easy for Amazon to disrupt our efforts if they so wish. What's more, it gets in the way of our ability to innovate and break new ground.

-jOrGe W.

On Jul 8, 2011, at 7:39 AM, Soren Hansen wrote:

> 2011/7/8 Ewan Mellor <Ewan.Mellor [at] eu>:
>>> The whole point of supporting the EC2 API is to support people's
>>> existing tools and whatnot. If we follow the spec, but the clients
>>> don't work, we're doing it wrong.
>> True enough. However, in the case where we've got a demonstrated divergence from the spec, we should report that against the client. I agree that we want to support pre-existing tools, but it's less clear that we want to support _buggy_ tools.
>
> We do. We have to. We have no way to know what sort of clients people
> are using. We can only attempt to check the open source ones, but
> there's likely loads of other ones that people have built themselves
> and never shared. Not only do people have to be able, motivated and
> allowed to change their tools to work with OpenStack, they also need
> to *realise* that this is something that needs to happen. We can't
> assume the symptoms they'll experience even give as much as a hint
> that the IDs they're getting back are too long. They may just get a
> general error of some sort.
>
> If we a) expect people to consume the EC2 API we expose, and (more
> importantly) b) expect ISPs to offer this API to their customers, it
> needs to be as close to "just another EC2 region" as possible.
>
>> If Amazon turn out to be resistant to fixing that problem, then we'll obviously have to accept that and move on, but we should at least give them a chance to respond on that.
>
> Amazon is not the problem. At least not the only problem. I'm not even
> going to begin to guess how many different tools exist to talk to the
> EC2 API.
>
> --
> Soren Hansen | http://linux2go.dk/
> Ubuntu Developer | http://www.ubuntu.com/
> OpenStack Developer | http://www.openstack.org/
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack [at] lists
> Unsubscribe : https://launchpad.net/~openstack
> More help : https://help.launchpad.net/ListHelp



jaypipes at gmail

Jul 8, 2011, 8:01 AM

Post #22 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

On Fri, Jul 8, 2011 at 10:36 AM, Sandy Walsh <sandy.walsh [at] rackspace> wrote:
> So, instead of
>
> EC2 Client   OS Client
>   |             |
> EC2 API        OS API
>    \           /
>   [Service] API
>
> We'd be shifting to:
>
> EC2 Client ---- EC2 API
>                        |
> OS Client ------ OS API
>                        |
>                 [Service] API

Good way of thinking about the proposed solution from Vish, yes.

-jay


sandy.walsh at RACKSPACE

Jul 8, 2011, 8:23 AM

Post #23 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

I don't think this is a technical issue, it's a business issue. If we want adoption, we have to reduce switching friction. Sadly, this means EC2 bugs/nuances and all.

The better a job we do of this, the easier it will be for users to transition from EC2 to OpenStack and benefit from all the chocolatey goodness we're baking into it.

$0.02




jorge.williams at rackspace

Jul 8, 2011, 9:15 AM

Post #24 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

My $0.02 is that this puts us in a very vulnerable position. Amazon can introduce bugs/nuances at will which can have serious implications for us -- if they want to disrupt our efforts it's very easy for them to do so. Reverse engineering Amazon's EC2 can also take a serious amount of effort -- we need to be continuously testing against it etc. Are we doing this now?

When Rackspace makes the switch you will automagically have a lot of clients with very good incentive to use our API and to give local deployments of OpenStack a try. I'm not convinced that we *need* EC2 in the long run precisely because of this.

-jOrGe W.

On Jul 8, 2011, at 10:23 AM, Sandy Walsh wrote:

> I don't think this is a technical issue, it's a business issue. If we want adoption, we have to reduce switching friction. Sadly, this means EC2 bugs/nuances and all.
>
> The better a job we do of this, the easier it will be for users to transition from EC2 to OpenStack and benefit from all the chocolatey goodness we're baking into it.
>
> $0.02


lorin at isi

Jul 8, 2011, 10:29 AM

Post #25 of 97 (37 views)
Permalink
Re: Cross-zone instance identifiers in EC2 API - Is it worth the effort? [In reply to]

On Jul 8, 2011, at 10:36 AM, Sandy Walsh wrote:

> +1 to Soren's argument that ec2 is the 1000lb gorilla and should be central to nova. We definitely need to support it with as close to 100% compatibility as we can.
>
> Sounds like the only option is to "embrace and extend" it. Do everything it can do, and layer on what we need provided it doesn't break the core EC2 commands. If the customer wants pure EC2, they'll have to live with the limitations.
>
> That said, is this the proposal I'm hearing? ...
>
> Since our separation is done at the service.api layer, the service.api's get pulled in two directions with each change in ec2/os. The idea is to have ec2 be a translation layer to os api? Preventing ec2 api from calling [service].api directly?
>
> So, instead of
>
> EC2 Client   OS Client
>   |             |
> EC2 API        OS API
>    \           /
>   [Service] API
>
> We'd be shifting to:
>
> EC2 Client ---- EC2 API
>                    |
> OS Client ------ OS API
>                    |
>              [Service] API
>
> I need to think more about this, but at first blush, it doesn't seem like such a [bad] thing? At some point the abstraction layer needs to be locked down doesn't it?
>

I think it actually looks more like this right now:


EC2 Client   OS Client
  |             |
EC2 API        OS API
   \           /
[nova-*] service APIs

There isn't really a single back-end API for the front-end APIs to call into. Instead, each of them makes calls to the multiple service APIs (e.g., scheduler, network, compute).

I would advocate for something more like this:


EC2 Client   OS Client
  |             |
EC2 API        OS API
   \           /
  internal nova API
        |
[nova-*] service APIs


This is a single, unified API that is meant only for internal use. This would reduce the coupling between front-end and back-end. It would make it easier for someone with less expertise in the code (hello!) to find the location in the code that answers questions like: "What does nova do when a user requests that an instance is launched?" They would just look at the internal API and find the appropriate method. It would also make it easier to add additional front-ends, if there's ever any interest in that.
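
As a rough illustration of what I mean, here is a sketch of such an internal API as a thin facade. Every class and method name is invented for the example, and the service stubs return placeholder values; the real scheduler, network and compute services look nothing like this.

```python
class SchedulerAPI:
    """Hypothetical stub for the scheduler service."""

    def select_host(self, instance_type):
        return "host-1"  # placeholder decision


class NetworkAPI:
    """Hypothetical stub for the network service."""

    def allocate_ip(self, instance_id):
        return "10.0.0.%d" % instance_id


class ComputeAPI:
    """Hypothetical stub for the compute service."""

    def spawn(self, instance_id, host, image):
        return {"id": instance_id, "host": host, "image": image}


class InternalNovaAPI:
    """Single entry point that both front-end APIs would call."""

    def __init__(self, scheduler, network, compute):
        self._scheduler = scheduler
        self._network = network
        self._compute = compute
        self._next_id = 1

    def launch_instance(self, image, instance_type):
        # The whole launch sequence lives in one place.
        instance_id = self._next_id
        self._next_id += 1
        host = self._scheduler.select_host(instance_type)
        instance = self._compute.spawn(instance_id, host, image)
        instance["ip"] = self._network.allocate_ip(instance_id)
        return instance
```

Someone asking "what happens on launch?" would read `launch_instance` and nothing else, and a new front end would only need to translate its own request format into that one call.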


Lorin
--
Lorin Hochstein, Computer Scientist
USC Information Sciences Institute
703.812.3710
http://www.east.isi.edu/~lorin
