
Mailing List Archive: OpenStack: Dev

[Metering] schema and counter definitions

 

 



loic at enovance

Apr 30, 2012, 3:15 AM

Post #1 of 38 (286 views)
Permalink
[Metering] schema and counter definitions

Hi,

To prepare for the next meeting (Thursday, May 3rd, 2012, http://wiki.openstack.org/Meetings/MeteringAgenda) I cleaned up and reorganized the Metering blueprint so that it (hopefully) incorporates all the information temporarily stored in the etherpad (http://etherpad.openstack.org/EfficientMetering, revision 67 in case it is vandalized).

We could start a discussion from the content of the following sections:

http://wiki.openstack.org/EfficientMetering#Counters
http://wiki.openstack.org/EfficientMetering#Storage

and come up with a list of the counters that should exist by default and how they should be stored.

This morning we had a discussion with Zhongyue Luo on irc.freenode.net#openstack-metering about how Dough could use the metering service. Since Dough already knows about instance creations, counter c1, which records how long a given instance was up, is of no interest to it. Other counters, such as the external bandwidth used, would be useful. I argued that one advantage of relying on metering to collect counters is that Dough would not need to know about each OpenStack component: metering would figure out how to extract the counters from nova-compute, nova-network (soon to be quantum), nova-volume (soon to be cinder), swift and glance, and would free Dough from the burden of tracking structural changes.

Cheers

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack [at] lists
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp


loic at enovance

Apr 30, 2012, 3:46 AM

Post #2 of 38 (276 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 04/30/2012 12:15 PM, Loic Dachary wrote:
> We could start a discussion from the content of the following sections:
>
> http://wiki.openstack.org/EfficientMetering#Counters
I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver information such as: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice (10 IPv4, $N each = 10 x $N) because it has been configured to invoice each IPv4 leased from provider NNN for $N.

It is not the purpose of the metering system to display each IPv4 used; therefore it only exposes the aggregated information. The counters define how the information should be aggregated. If the idea were to expose each resource usage individually, defining counters would be meaningless, as they would duplicate the activity log from each OpenStack component.
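
To make the invoice example concrete, here is a minimal Python sketch of how a billing system might turn one aggregated counter into an invoice line. The names `AggregatedCounter`, `invoice_line` and `PRICES`, the field layout and the price of 2.50 are all assumptions made for illustration; none of them come from the blueprint.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class AggregatedCounter:
    """Hypothetical shape of what metering exposes: a total, no per-IP detail."""
    tenant: str
    kind: str        # e.g. "floating_ipv4"
    provider: str    # e.g. "NNN"
    quantity: int    # number of resources aggregated over the period


# Pricing lives in the billing system, not in metering (assumed value).
PRICES = {("floating_ipv4", "NNN"): Decimal("2.50")}


def invoice_line(counter: AggregatedCounter) -> Tuple[str, Decimal]:
    """Return the invoice label and the amount due for one aggregated counter."""
    unit_price = PRICES[(counter.kind, counter.provider)]
    total = unit_price * counter.quantity
    label = f"{counter.quantity} {counter.kind}, ${unit_price} each"
    return label, total


line = invoice_line(AggregatedCounter("tenant-a", "floating_ipv4", "NNN", 10))
```

In this sketch the metering side never sees a price; it only reports that 10 addresses were leased from provider NNN.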

What do you think ?

Cheers

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




doug.hellmann at dreamhost

Apr 30, 2012, 6:49 AM

Post #3 of 38 (282 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary <loic [at] enovance> wrote:

> On 04/30/2012 12:15 PM, Loic Dachary wrote:
> > We could start a discussion from the content of the following sections:
> >
> > http://wiki.openstack.org/EfficientMetering#Counters
> I think the rationale of the counter aggregation needs to be explained. My
> understanding is that the metering system will be able to deliver the
> following information: 10 floating IPv4 addresses were allocated to the
> tenant during three months and were leased from provider NNN. From this,
> the billing system could add a line to the invoice : 10 IPv4, $N each =
> $10xN because it has been configured to invoice each IPv4 leased from
> provider NNN for $N.
>
> It is not the purpose of the metering system to display each IPv4 used,
> therefore it only exposes the aggregated information. The counters define
> how the information should be aggregated. If the idea was to expose each
> resource usage individually, defining counters would be meaningless as they
> would duplicate the activity log from each OpenStack component.
>
> What do you think ?
>

At DreamHost we are going to want to show each individual resource (the
IPv4 address, the instance, etc.) along with the charge information. Having
the metering system aggregate that data will make it difficult/impossible
to present the bill summary and detail views that we want. It would be much
more useful for us if it tracked the usage details for each resource, and
let us aggregate the data ourselves.

If other vendors want to show the data differently, perhaps we should
provide separate APIs for retrieving the detailed and aggregate data.
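
As a rough illustration of that idea, the Python sketch below keeps one usage record per resource and derives the aggregate view from the same records. The `UsageRecord` fields and the function names `detailed` and `aggregated` are hypothetical; this is not an existing metering API.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class UsageRecord(NamedTuple):
    tenant: str
    resource_id: str   # e.g. one floating IP or one instance
    kind: str
    amount: float      # unit depends on the kind (hours, bytes, ...)


def detailed(records: List[UsageRecord], tenant: str) -> List[UsageRecord]:
    """Detail view: every record for the tenant, one line per resource."""
    return [r for r in records if r.tenant == tenant]


def aggregated(records: List[UsageRecord], tenant: str) -> Dict[str, float]:
    """Summary view: totals per kind, derived from the same records."""
    totals: Dict[str, float] = defaultdict(float)
    for r in detailed(records, tenant):
        totals[r.kind] += r.amount
    return dict(totals)


records = [
    UsageRecord("t1", "ip-1", "floating_ipv4_hours", 24.0),
    UsageRecord("t1", "ip-2", "floating_ipv4_hours", 12.0),
    UsageRecord("t1", "vm-1", "instance_hours", 24.0),
    UsageRecord("t2", "ip-9", "floating_ipv4_hours", 5.0),
]
```

Because the summary is computed from the detailed records, a vendor who only wants totals loses nothing, while a vendor who wants a per-resource bill still has the data.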

Doug


loic at enovance

Apr 30, 2012, 8:43 AM

Post #4 of 38 (270 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 04/30/2012 03:49 PM, Doug Hellmann wrote:
> At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.
>
> If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.
>
> Doug
>
Hi,

For the record, here is the unfinished conversation we had on IRC

(04:29:06 PM) dhellmann: dachary, did you see my reply about counter definitions on the list today?
(04:39:05 PM) dachary: It means some counters must not be aggregated. Only the amount associated with it is but there is one counter per IP.
(04:55:01 PM) dachary: dhellmann: what about this: the id of the resource controls the aggregation of all counters: if it is missing, all resources of the same kind and their measures are aggregated. Otherwise only the measures are aggregated. http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
(04:55:58 PM) dachary: it makes me a little uncomfortable to define such an "ad-hoc" grouping
(04:56:53 PM) dachary: i.e. you actuall control the aggregation by chosing which value to put in the id column
(04:58:43 PM) dachary: s/actuall/actually/
(05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
(05:05:54 PM) dachary: I feel like we're trying to resolve a non problem here
(05:08:42 PM) dachary: values need to be aggregated. The raw input is a full description of the resource and a value ( gauge ). The question is how to control the aggregation in a reasonably flexible way.
(05:11:34 PM) dachary: The definition of a counter could probably be described as : the id of a resource and code to fill each column associated with it.

I tried to append the following, but the wiki kept failing.

I propose that the counters be defined by a function instead of being fixed. That helps address the issue of aggregating the bandwidth associated with a given IP into a single counter.

Alternate idea:
* a counter is defined by
  * a name (o1, n2, etc.) that uniquely identifies the nature of the measure (outbound internet transit, amount of RAM, etc.)
  * the component in which it can be found (nova, swift, etc.)
  * and by columns, each one set with the result of aggregate(find(record), record), where
    * find() looks up the existing row by selecting with the unique key (maybe the name and the resource id)
    * record is a detailed description of the metering event to be aggregated ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
    * the aggregate() function returns the updated row. By default it just adds (+=) the counter value to the old row returned by find()
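
The aggregate(find(record), record) idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory `table` dictionary stands in for the metering database, and the record fields (`name`, `component`, `resource_id`, `value`), the `per_resource` switch and the helper names `key_for` and `update` are invented here.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, Optional[str]]   # (counter name, resource id or None)
Row = Dict[str, object]

table: Dict[Key, Row] = {}        # stands in for the metering database


def key_for(record: dict, per_resource: bool) -> Key:
    # Leaving the resource id out merges all resources of the same kind.
    return (record["name"], record["resource_id"] if per_resource else None)


def find(record: dict, per_resource: bool = True) -> Optional[Row]:
    """Look up the existing row for this record, if any."""
    return table.get(key_for(record, per_resource))


def aggregate(row: Optional[Row], record: dict) -> Row:
    """Default aggregation: add the new value to the stored one."""
    if row is None:
        row = {"name": record["name"], "component": record["component"], "value": 0}
    return {**row, "value": row["value"] + record["value"]}


def update(record: dict, per_resource: bool = True) -> None:
    """Store the row produced by aggregate(find(record), record)."""
    table[key_for(record, per_resource)] = aggregate(find(record, per_resource), record)


for value in (100, 250):
    update({"name": "o1", "component": "nova", "resource_id": "ip-1", "value": value})
update({"name": "o1", "component": "nova", "resource_id": "ip-2", "value": 40})
```

With `per_resource=True` each IP keeps its own total; passing `per_resource=False` would collapse both IPs into one row keyed by the counter name alone, which is the ad-hoc grouping discussed on IRC.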

Cheers

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


doug.hellmann at dreamhost

Apr 30, 2012, 11:03 AM

Post #5 of 38 (275 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic [at] enovance> wrote:

> Propose that the counters are defined by a function instead of being
> fixed. That helps addressing the issue of aggregating the bandwidth
> associated to a given IP into a single counter.
>
> Alternate idea :
> * a counter is defined by
> * a name ( o1, n2, etc. ) that uniquely identifies the nature of the
> measure ( outbound internet transit, amount of RAM, etc. )
> * the component in which it can be found ( nova, swift etc.)
> * and by columns, each one is set with the result of
> aggregate(find(record),record) where
> * find() looks for the existing column as found by selecting with the
> unique key ( maybe the name and the resource id )
> * record is a detailed description of the metering event to be
> aggregated (
> http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
> * the aggregate() function returns the updated row. By default it just
> += the counter value with the old row returned by find()
>

Would we want aggregation to occur within the database where we are
collecting events, or should that move somewhere else?




loic at enovance

Apr 30, 2012, 12:43 PM

Post #6 of 38 (270 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 04/30/2012 08:03 PM, Doug Hellmann wrote:
> Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?
I assume the events collected by the metering agents will all be archived for auditing (or re-building the database) http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44

Therefore the aggregation should occur when the database is updated to account for a new event.
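
A minimal Python sketch of that flow, assuming an append-only archive and aggregates that are updated as each event arrives. The `Meter` class, its method names and the event fields (`counter`, `value`) are hypothetical and only illustrate the ordering.

```python
import json
from collections import defaultdict
from typing import Dict, List


class Meter:
    def __init__(self) -> None:
        self.log: List[str] = []                         # archived events, never modified
        self.totals: Dict[str, float] = defaultdict(float)

    def record(self, event: dict) -> None:
        self.log.append(json.dumps(event, sort_keys=True))  # archive first
        self._apply(event)                                   # then aggregate

    def _apply(self, event: dict) -> None:
        self.totals[event["counter"]] += event["value"]

    def rebuild(self) -> None:
        """Recompute the aggregates from the archive alone."""
        self.totals = defaultdict(float)
        for line in self.log:
            self._apply(json.loads(line))


meter = Meter()
meter.record({"counter": "n2", "value": 512})
meter.record({"counter": "n2", "value": 256})
before = dict(meter.totals)
meter.rebuild()
```

Since every event is archived before it is aggregated, the aggregates can be thrown away and recomputed for auditing or after a schema change.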

Does this make sense ? I may have misunderstood part of your question.

Cheers


harlowja at yahoo-inc

Apr 30, 2012, 1:03 PM

Post #7 of 38 (269 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Agreed, I would get as much low-level data as possible and let other systems combine that as they want to form whatever billing model they choose.

On 4/30/12 6:49 AM, "Doug Hellmann" <doug.hellmann [at] dreamhost> wrote:



At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.

If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.

Doug


doug.hellmann at dreamhost

Apr 30, 2012, 2:39 PM

Post #8 of 38 (268 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic [at] enovance> wrote:

> Would we want aggregation to occur within the database where we are
> collecting events, or should that move somewhere else?
>
> I assume the events collected by the metering agents will all be archived
> for auditing (or re-building the database)
> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
>
> Therefore the aggregation should occur when the database is updated to
> account for a new event.
>
> Does this make sense ? I may have misunderstood part of your question.
>

I guess what I don't understand is why the aggregated data is written back
to the metering database at all. If it's in the same database, it seems
like it should be in a different "table" (or equivalent) so the original
data is left alone.
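
One way to picture the separate-table layout is the following Python sketch using the standard sqlite3 module. The table and column names (`event`, `aggregate`, `counter`, `resource_id`, `value`, `total`) are assumptions chosen for the example, not a proposed schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE event (            -- raw data, insert-only
        id INTEGER PRIMARY KEY,
        counter TEXT NOT NULL,
        resource_id TEXT NOT NULL,
        value REAL NOT NULL
    );
    CREATE TABLE aggregate (        -- derived data, safe to drop and rebuild
        counter TEXT NOT NULL,
        resource_id TEXT NOT NULL,
        total REAL NOT NULL,
        PRIMARY KEY (counter, resource_id)
    );
""")

events = [("o1", "ip-1", 100.0), ("o1", "ip-1", 250.0), ("o1", "ip-2", 40.0)]
db.executemany(
    "INSERT INTO event (counter, resource_id, value) VALUES (?, ?, ?)", events
)

# Rebuild the aggregate table from the untouched events.
db.execute("DELETE FROM aggregate")
db.execute("""
    INSERT INTO aggregate (counter, resource_id, total)
    SELECT counter, resource_id, SUM(value) FROM event
    GROUP BY counter, resource_id
""")

totals = db.execute(
    "SELECT resource_id, total FROM aggregate ORDER BY resource_id"
).fetchall()
raw_count = db.execute("SELECT COUNT(*) FROM event").fetchone()[0]
```

The aggregation only reads from `event`, so the original rows stay available for a detailed bill or a later re-aggregation with different rules.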

Maybe it's time to start focusing these discussions on user stories?


loic at enovance

May 1, 2012, 2:23 AM

Post #9 of 38 (267 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 04/30/2012 11:39 PM, Doug Hellmann wrote:
> I guess what I don't understand is why the aggregated data is written back to the metering database at all. If it's in the same database, it seems like it should be in a different "table" (or equivalent) so the original data is left alone.
In my view the events are not stored in a database, they are merely appended to a log file. The database is built from the events with aggregated data. I now understand that you (and Joshua Harlow) think it's better to not aggregate the data and let the billing system do this job.
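
As a rough illustration of the aggregate(find(record), record) idea quoted above, applied to an append-only event log, here is a minimal Python sketch. The record fields ("name", "resource_id", "value") and the rebuild() helper are assumptions made for the example and are not taken from the blueprint.

```python
def find(table, record):
    """Look up the existing row by unique key (counter name, resource id)."""
    return table.get((record["name"], record.get("resource_id")))


def aggregate(row, record):
    """Default aggregation: add the record's value to the existing row."""
    previous = row["value"] if row is not None else 0
    return {
        "name": record["name"],
        "resource_id": record.get("resource_id"),
        "value": previous + record["value"],
    }


def rebuild(event_log):
    """Rebuild the aggregated table by replaying the archived events in order."""
    table = {}
    for record in event_log:
        key = (record["name"], record.get("resource_id"))
        table[key] = aggregate(find(table, record), record)
    return table


# Hypothetical events: outbound transit (o1) measured for two floating IPs.
event_log = [
    {"name": "o1", "resource_id": "ip-a", "value": 10},
    {"name": "o1", "resource_id": "ip-a", "value": 5},
    {"name": "o1", "resource_id": "ip-b", "value": 7},
]
table = rebuild(event_log)
```

Leaving "resource_id" out of a record makes every event of that counter collapse into one row, which is the "id controls the aggregation" behaviour discussed on IRC.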
>
> Maybe it's time to start focusing these discussions on user stories?
>
I agree. Would you like to go first ?

Cheers

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


nick.barcet at canonical

May 1, 2012, 7:38 AM

Post #10 of 38 (261 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/01/2012 02:23 AM, Loic Dachary wrote:
> On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>> I guess what I don't understand is why the aggregated data is written
>> back to the metering database at all. If it's in the same database, it
>> seems like it should be in a different "table" (or equivalent) so the
>> original data is left alone.
> In my view the events are not stored in a database, they are merely
> appended to a log file. The database is built from the events with
> aggregated data. I now understand that you (and Joshua Harlow) think
> it's better to not aggregate the data and let the billing system do this
> job.

My intent when writing the blueprint was that each event would be
recorded atomically in the database, as it is the only way to verify
that we have not missed any. Aggregation should be done at the external
API level if the request is to get the sum of a given counter.

What I missed in the blueprint, and what seems to be appearing clearly now, is
that an event needs to be able to carry the "object-reference" for which
it was collected, and this would seem highly necessary looking at the
messages in this thread. A metering event would essentially be defined
by (who, what, which) instead of a simple (who, what). As a consequence
we would need to extend the DB schema to add this [which/object
reference], and make sure that we carry it as well when we work on
the message API format definition.
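
To make the (who, what, which) idea concrete, here is a minimal Python sketch. The class name, the field meanings (who as the tenant, what as the counter name, which as the object reference) and the counter_total() helper are assumptions for illustration only, not the blueprint's schema.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MeteringEvent:
    who: str      # assumed: the tenant the usage is attributed to
    what: str     # assumed: the counter name, e.g. "o1"
    which: str    # the object reference the event was collected for
    value: float  # the measured amount


def counter_total(events: Iterable[MeteringEvent], who: str, what: str,
                  which: Optional[str] = None) -> float:
    """Sum a counter at query time; optionally restrict to one object."""
    return sum(
        e.value
        for e in events
        if e.who == who and e.what == what
        and (which is None or e.which == which)
    )


# Hypothetical events stored one per measurement, never pre-aggregated.
events = [
    MeteringEvent("tenant-1", "o1", "ip-a", 10.0),
    MeteringEvent("tenant-1", "o1", "ip-b", 7.0),
    MeteringEvent("tenant-2", "o1", "ip-c", 3.0),
]
```

Because every event keeps its object reference, the same stored data can answer both the per-tenant sum and the per-resource detail.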

How does this sound?

Nick

>> Maybe it's time to start focusing these discussions on user stories?
>>
> I agree. Would you like to go first ?
>
> Cheers
>
> --
> Loïc Dachary Chief Research Officer
> // eNovance labs http://labs.enovance.com
> // ✉ loic [at] enovance ☎ +33 1 49 70 99 82


loic at enovance

May 1, 2012, 8:48 AM

Post #11 of 38 (262 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/01/2012 04:38 PM, Nick Barcet wrote:
> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> In my view the events are not stored in a database, they are merely
>> appended to a log file. The database is built from the events with
>> aggregated data. I now understand that you (and Joshua Harlow) think
>> it's better to not aggregate the data and let the billing system do this
>> job.
> My intent when writing the blueprint was that each event would be
> recorded atomically in the database, as it is the only way to verify
> that we have not missed any. Aggregation should be done at the external
> API level if the request is to get the sum of a given counter.
>
> What I missed in the blueprint, and what seems to be appearing clearly now, is
> that an event needs to be able to carry the "object-reference" for which
> it was collected, and this would seem highly necessary looking at the
> messages in this thread. A metering event would essentially be defined
> by (who, what, which) instead of a simple (who, what). As a consequence
> we would need to extend the DB schema to add this [which/object
> reference], and make sure that we carry it as well when we work on
> the message API format definition.
>
> How does this sound?
Hi,

I agree, and I think it makes the blueprint simpler while addressing the concerns expressed in this thread. The database will have to store a lot more events, so we will have to be careful to make sure it scales. I translated your suggestion into the blueprint:

http://wiki.openstack.org/EfficientMetering?action=diff&rev2=46&rev1=45

Feel free to fix the blueprint if I misrepresented it.

Cheers
>
> Nick
>
>>> Maybe it's time to start focusing these discussions on user stories?
>>>
>> I agree. Would you like to go first ?
>>
>> Cheers
>>
>> --
>> Loïc Dachary Chief Research Officer
>> // eNovance labs http://labs.enovance.com
>> // ✉ loic [at] enovance ☎ +33 1 49 70 99 82


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


doug.hellmann at dreamhost

May 1, 2012, 8:49 AM

Post #12 of 38 (258 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.barcet [at] canonical> wrote:

> On 05/01/2012 02:23 AM, Loic Dachary wrote:
> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
> >> I guess what I don't understand is why the aggregated data is written
> >> back to the metering database at all. If it's in the same database, it
> >> seems like it should be in a different "table" (or equivalent) so the
> >> original data is left alone.
> > In my view the events are not stored in a database, they are merely
> > appended to a log file. The database is built from the events with
> > aggregated data. I now understand that you (and Joshua Harlow) think
> > it's better to not aggregate the data and let the billing system do this
> > job.
>
> My intent when writing the blueprint was that each event would be
> recorded atomically in the database, as it is the only way to verify
> that we have not missed any. Aggregation should be done at the external
> API level if the request is to get the sum of a given counter.
>

That matches what I was thinking. The "log file" that Loic mentioned would
in fact be a database that can handle a lot of writes. We could use some
sort of simple file format, but since we're going to have to read and parse
the log anyway, we might as well use a tool that makes that easy.

Aggregation could happen either in a metering API based on the query, or an
external app could retrieve a large dataset and manage the aggregation
itself.
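
A small sketch of those two options, using Python's standard sqlite3 module purely as a stand-in for whatever write-friendly store is eventually chosen. The table layout and the function names (open_store, record_event, total, details) are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory event store; a real deployment would differ."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " tenant TEXT, counter TEXT, resource_id TEXT,"
        " ts TEXT, value REAL)"
    )
    return conn


def record_event(conn, tenant, counter, resource_id, ts, value):
    """Write one event in its own transaction, without aggregating."""
    with conn:
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
            (tenant, counter, resource_id, ts, value),
        )


def total(conn, tenant, counter):
    """Option 1: the metering API aggregates according to the query."""
    row = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM events"
        " WHERE tenant = ? AND counter = ?",
        (tenant, counter),
    ).fetchone()
    return row[0]


def details(conn, tenant, counter):
    """Option 2: return raw rows so an external app can aggregate itself."""
    return conn.execute(
        "SELECT resource_id, ts, value FROM events"
        " WHERE tenant = ? AND counter = ? ORDER BY ts",
        (tenant, counter),
    ).fetchall()


conn = open_store()
record_event(conn, "tenant-1", "o1", "ip-a", "2012-05-01T10:00:00", 10.0)
record_event(conn, "tenant-1", "o1", "ip-b", "2012-05-01T11:00:00", 5.0)
```

Both paths read the same untouched event rows, so the original data is left alone whichever side does the summing.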


> What I missed in the blueprint, and what seems to be appearing clearly now, is
> that an event needs to be able to carry the "object-reference" for which
> it was collected, and this would seem highly necessary looking at the
> messages in this thread. A metering event would essentially be defined
> by (who, what, which) instead of a simple (who, what). As a consequence
> we would need to extend the DB schema to add this [which/object
> reference], and make sure that we carry it as well when we work on
> the message API format definition.
>
> How does this sound?
>

I think so. A lot of these sorts of issues can probably be fixed by being
careful about how we define the measurements. For example, I may want to be
able to show a customer the network bandwidth used per server, not just per
network. If we measure the bandwidth consumed by each VIF, the aggregation
code can take care of summarizing by network (because we know where the VIF
is) and/or server (because we know which server has the VIF).
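
For example, assuming each bandwidth sample records the VIF together with the network and server it belongs to (the field names here are made up), the same samples can be summarized either way with a small Python sketch like this:

```python
from collections import defaultdict


def summarize(samples, key):
    """Total the bytes of per-VIF samples, grouped by the given field."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[key]] += sample["bytes"]
    return dict(totals)


# Hypothetical per-VIF measurements.
samples = [
    {"vif": "vif-1", "network": "net-a", "server": "srv-1", "bytes": 100},
    {"vif": "vif-2", "network": "net-a", "server": "srv-2", "bytes": 250},
    {"vif": "vif-3", "network": "net-b", "server": "srv-1", "bytes": 40},
]
by_network = summarize(samples, "network")
by_server = summarize(samples, "server")
```

Measuring at the finest level (the VIF) keeps both groupings available without collecting anything twice.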

We may need to record more detail than a simple "which," though, because it
may be possible to change some information relevant for calculating the
billing rate later. For example, a tenant can resize an instance, which
would usually cause a change in the billing rate. Some of the relationships
might change, too (Is it possible to move a VIF between networks?).

At first I thought this might require separate table definitions per
resource type (instance, network, etc.) but re-reading the table of
counters in EfficientMetering I guess this is handled by measuring things
like CPU, RAM, and block storage as separate counters? So a single event
for creating a new instance might result in several records being written
to the database, with the "which" set to the instance identifier. The data
could then be presented as a unified "resource usage" report for that
server.

I think that works, but it may make the job of calculating the bill harder.
We are planning to follow the model of specifying rates per size, so we
would have to figure out which combination of CPU, RAM, and root volume
storage matches up with a given size to determine the rate.
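
A sketch of what that lookup might involve, in Python. The size names, the resource combinations and the hourly rates are all invented for illustration; a real catalogue would come from the provider's configuration.

```python
# Hypothetical catalogue: (vcpus, ram_mb, root_gb) -> size name.
SIZES = {
    (1, 512, 10): "small",
    (2, 2048, 20): "medium",
}

# Hypothetical hourly rate per size.
HOURLY_RATE = {"small": 0.02, "medium": 0.08}


def size_for(counters):
    """Map the separately metered counters of one instance back to a size."""
    key = (counters["vcpus"], counters["ram_mb"], counters["root_gb"])
    return SIZES.get(key)


def hourly_rate(counters):
    """Return the rate for the matching size, or fail if nothing matches."""
    size = size_for(counters)
    if size is None:
        raise LookupError("no size matches counters %r" % (counters,))
    return HOURLY_RATE[size]


usage = {"vcpus": 2, "ram_mb": 2048, "root_gb": 20}
```

The failure branch is the awkward part: any combination that does not correspond to a known size needs an explicit policy in the billing code.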

Another piece I've been thinking about is handling boundary conditions when
resource create and delete events don't both fall inside a billing cycle
(or within the granularity of the metering system). That shouldn't be part
of logging the events, necessarily, but it could be a reusable component
that feeds into producing the aggregated data (either through the API, or
as a way of processing the results returned by the API).
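
One possible shape for such a component, as a Python sketch: clip a resource's lifetime to the billing cycle before computing its duration. The function name and the convention that a missing delete event means "still running" are assumptions.

```python
from datetime import datetime


def billable_seconds(created, deleted, cycle_start, cycle_end):
    """Seconds a resource existed inside [cycle_start, cycle_end).

    `deleted` is None when no delete event has been seen yet.
    """
    begin = max(created, cycle_start)
    finish = cycle_end if deleted is None else min(deleted, cycle_end)
    return max(0.0, (finish - begin).total_seconds())


cycle_start = datetime(2012, 5, 1)
cycle_end = datetime(2012, 6, 1)

# Created just before the cycle, deleted one hour into it.
straddling = billable_seconds(
    datetime(2012, 4, 30, 23), datetime(2012, 5, 1, 1), cycle_start, cycle_end
)
# Created mid-cycle and never deleted.
still_running = billable_seconds(
    datetime(2012, 5, 10), None, cycle_start, cycle_end
)
```

The same function works for a metering granularity window by passing the window bounds instead of the billing cycle bounds.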

> >> Maybe it's time to start focusing these discussions on user stories?
> >>
> > I agree. Would you like to go first ?
>

These are "things that might happen" use cases rather than "user stories,"
but let's see where they take us:

1. User creates an instance, waits some period of time, then terminates it.
   - Vary the period of time to allow the events to both fall within the
     metering granularity window, to overlap an entire window, to start in one
     window and end in another.
   - The same variations for "billing cycle" instead of "metering granularity
     window."
2. User creates an instance, waits some period of time, then resizes it.
   - Vary the period of time as above.
   - Do we need variations for resizing up and down?
3. User creates an instance but it fails to create properly (provider
   issue).
4. User creates an instance but it fails to boot after creation (bad image).
5. User creates volume storage, adds it to an existing instance, waits a
   period of time, then deletes the volume.
   - Vary the period of time as above.
6. User creates volume storage, adds it to an existing instance, waits a
   period of time, then terminates the instance (I'm not sure what happens to
   the volume in that case, maybe it still exists?)

A provider-related story might be:

1. As a provider, I can query the metering API to determine the activity
for a tenant within a given period of time.

Although that's pretty vague. :-)

Doug


doug.hellmann at dreamhost

May 1, 2012, 8:52 AM

Post #13 of 38 (260 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Tue, May 1, 2012 at 11:49 AM, Doug Hellmann
<doug.hellmann [at] dreamhost> wrote:

>
>
> On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.barcet [at] canonical> wrote:
>
>> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>> >>
>> >>
>> >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic [at] enovance
>> >> <mailto:loic [at] enovance>> wrote:
>> >>
>> >> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>> >>>
>> >>>
>> >>> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic [at] enovance
>> >>> <mailto:loic [at] enovance>> wrote:
>> >>>
>> >>> On 04/30/2012 03:49 PM, Doug Hellmann wrote:
>> >>>>
>> >>>>
>> >>>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary
>> >>>> <loic [at] enovance <mailto:loic [at] enovance>> wrote:
>> >>>>
>> >>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>> >>>> > We could start a discussion from the content of the
>> >>>> following sections:
>> >>>> >
>> >>>> > http://wiki.openstack.org/EfficientMetering#Counters
>> >>>> I think the rationale of the counter aggregation needs
>> >>>> to be explained. My understanding is that the metering
>> >>>> system will be able to deliver the following
>> >>>> information: 10 floating IPv4 addresses were allocated
>> >>>> to the tenant during three months and were leased from
>> >>>> provider NNN. From this, the billing system could add a
>> >>>> line to the invoice : 10 IPv4, $N each = $10xN because
>> >>>> it has been configured to invoice each IPv4 leased from
>> >>>> provider NNN for $N.
>> >>>>
>> >>>> It is not the purpose of the metering system to display
>> >>>> each IPv4 used, therefore it only exposes the aggregated
>> >>>> information. The counters define how the information
>> >>>> should be aggregated. If the idea was to expose each
>> >>>> resource usage individually, defining counters would be
>> >>>> meaningless as they would duplicate the activity log
>> >>>> from each OpenStack component.
>> >>>>
>> >>>> What do you think ?
>> >>>>
>> >>>>
>> >>>> At DreamHost we are going to want to show each individual
>> >>>> resource (the IPv4 address, the instance, etc.) along with
>> >>>> the charge information. Having the metering system aggregate
>> >>>> that data will make it difficult/impossible to present the
>> >>>> bill summary and detail views that we want. It would be much
>> >>>> more useful for us if it tracked the usage details for each
>> >>>> resource, and let us aggregate the data ourselves.
>> >>>>
>> >>>> If other vendors want to show the data differently, perhaps
>> >>>> we should provide separate APIs for retrieving the detailed
>> >>>> and aggregate data.
>> >>>>
>> >>>> Doug
>> >>>>
>> >>> Hi,
>> >>>
>> >>> For the record, here is the unfinished conversation we had on
>> IRC
>> >>>
>> >>> (04:29:06 PM) dhellmann: dachary, did you see my reply about
>> >>> counter definitions on the list today?
>> >>> (04:39:05 PM) dachary: It means some counters must not be
>> >>> aggregated. Only the amount associated with it is but there
>> >>> is one counter per IP.
>> >>> (04:55:01 PM) dachary: dhellmann: what about this :the id of
>> >>> the ressource controls the agregation of all counters : if it
>> >>> is missing, all resources of the same kind and their measures
>> >>> are aggregated. Otherwise only the measures are agreggated.
>> >>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
>> >>> (04:55:58 PM) dachary: it makes me a little unconfortable to
>> >>> define such an "ad-hoc" grouping
>> >>> (04:56:53 PM) dachary: i.e. you actuall control the
>> >>> aggregation by chosing which value to put in the id column
>> >>> (04:58:43 PM) dachary: s/actuall/actually/
>> >>> (05:05:38 PM) ***dachary reading
>> >>> http://www.ogf.org/documents/GFD.98.pdf
>> >>> (05:05:54 PM) dachary: I feel like we're trying to resolve a
>> >>> non problem here
>> >>> (05:08:42 PM) dachary: values need to be aggregated. The raw
>> >>> input is a full description of the resource and a value (
>> >>> gauge ). The question is how to control the aggregation in a
>> >>> reasonably flexible way.
>> >>> (05:11:34 PM) dachary: The definition of a counter could
>> >>> probably be described as : the id of a resource and code to
>> >>> fill each column associated with it.
>> >>>
>> >>> I tried to append the following, but the wiki kept failing.
>> >>>
>> >>> Propose that the counters are defined by a function instead
>> >>> of being fixed. That helps addressing the issue of
>> >>> aggregating the bandwidth associated to a given IP into a
>> >>> single counter.
>> >>>
>> >>> Alternate idea :
>> >>> * a counter is defined by
>> >>> * a name ( o1, n2, etc. ) that uniquely identifies the
>> >>> nature of the measure ( outbound internet transit, amount of
>> >>> RAM, etc. )
>> >>> * the component in which it can be found ( nova, swift etc.)
>> >>> * and by columns, each one is set with the result of
>> >>> aggregate(find(record),record) where
>> >>> * find() looks for the existing column as found by
>> >>> selecting with the unique key ( maybe the name and the
>> >>> resource id )
>> >>> * record is a detailed description of the metering event to
>> >>> be aggregated (
>> >>> http://wiki.openstack.org/SystemUsageData#compute.instance.exists:
>> >>> )
>> >>> * the aggregate() function returns the updated row. By
>> >>> default it just += the counter value with the old row
>> >>> returned by find()
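The "alternate idea" above can be sketched in a few lines of Python. This is only an illustration of the aggregate(find(record), record) pattern: the Record fields, the in-memory `rows` dict standing in for the counter table, and the `account()` helper are assumptions made for the example, not part of the blueprint.

```python
from collections import namedtuple

# Hypothetical record shape: the thread only says a record is "a detailed
# description of the metering event"; these field names are assumptions.
Record = namedtuple("Record", ["name", "component", "resource_id", "value"])

rows = {}  # stands in for the counter table, keyed by the unique key


def key_for(record):
    # "maybe the name and the resource id", as suggested in the thread
    return (record.name, record.resource_id)


def find(record):
    """Return the existing row for this record's key, or None."""
    return rows.get(key_for(record))


def aggregate(old_row, record):
    """Default aggregation: add the record's value to the old row."""
    previous = old_row["value"] if old_row is not None else 0
    return {
        "name": record.name,
        "component": record.component,
        "resource_id": record.resource_id,
        "value": previous + record.value,
    }


def account(record):
    """Apply aggregate(find(record), record) and store the updated row."""
    rows[key_for(record)] = aggregate(find(record), record)


account(Record("o1", "nova", "ip-1", 100))
account(Record("o1", "nova", "ip-1", 400))
account(Record("o1", "nova", "ip-2", 50))
```

Because the resource id is part of the key, the two IPs end up in separate rows; dropping it from `key_for()` would merge all resources of the same kind, which is the behaviour discussed in the IRC log.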
>> >>>
>> >>>
>> >>> Would we want aggregation to occur within the database where we
>> >>> are collecting events, or should that move somewhere else?
>> >> I assume the events collected by the metering agents will all be
>> >> archived for auditing (or re-building the database)
>> >> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
>> >>
>> >> Therefore the aggregation should occur when the database is
>> >> updated to account for a new event.
>> >>
>> >> Does this make sense ? I may have misunderstood part of your question.
>> >>
>> >>
>> >> I guess what I don't understand is why the aggregated data is written
>> >> back to the metering database at all. If it's in the same database, it
>> >> seems like it should be in a different "table" (or equivalent) so the
>> >> original data is left alone.
>> > In my view the events are not stored in a database, they are merely
>> > appended to a log file. The database is built from the events with
>> > aggregated data. I now understand that you (and Joshua Harlow) think
>> > it's better to not aggregate the data and let the billing system do this
>> > job.
>>
>> My intent when writing the blueprint was that each event would be
>> recorded atomically in the database, as it is the only way to control
>> that we have not missed any. Aggregation should be done at the external
>> API level if the request is to get the sum of a given counter.
>>
>
> That matches what I was thinking. The "log file" that Loic mentioned would
> in fact be a database that can handle a lot of writes. We could use some
> sort of simple file format, but since we're going to have to read and parse
> the log anyway, we might as well use a tool that makes that easy.
>
> Aggregation could happen either in a metering API based on the query, or
> an external app could retrieve a large dataset and manage the aggregation
> itself.
>
>
>> What I missed in the blueprint and seems to be appearing clearly now, is
>> that an event needs to be able to carry the "object-reference" for which
>> it was collected, and this would seem highly necessary looking at the
>> messages in this thread. A metering event would essentially be defined
>> by (who, what, which) instead of a simple (who, what). As a consequence
>> we would need to extend the DB schema to add this [which/object
>> reference], and make sure that we carry it as well when we will work on
>> the message API format definition.
>>
>> How does this sound?
>>
>
> I think so. A lot of these sorts of issues can probably be fixed by being
> careful about how we define the measurements. For example, I may want to be
> able to show a customer the network bandwidth used per server, not just per
> network. If we measure the bandwidth consumed by each VIF, the aggregation
> code can take care of summarizing by network (because we know where the VIF
> is) and/or server (because we know which server has the VIF).
>
> We may need to record more detail than a simple "which," though, because
> it may be possible to change some information relevant for calculating the
> billing rate later. For example, a tenant can resize an instance, which
> would usually cause a change in the billing rate. Some of the relationships
> might change, too (Is it possible to move a VIF between networks?).
>
> At first I thought this might require separate table definitions per
> resource type (instance, network, etc.) but re-reading the table of
> counters in EfficientMetering I guess this is handled by measuring things
> like CPU, RAM, and block storage as separate counters? So a single event
> for creating a new instance might result in several records being written
> to the database, with the "which" set to the instance identifier. The data
> could then be presented as a unified "resource usage" report for that
> server.
>
> I think that works, but it may make the job of calculating the bill
> harder. We are planning to follow the model of specifying rates per size,
> so we would have to figure out which combination of CPU, RAM, and root
> volume storage matches up with a given size to determine the rate.
>
> Another piece I've been thinking about is handling boundary conditions
> when resource create and delete events don't both fall inside a billing
> cycle (or within the granularity of the metering system). That shouldn't be
> part of logging the events, necessarily, but it could be a reusable
> component that feeds into producing the aggregated data (either through the
> API, or as a way of processing the results returned by the API).
>
>> >>
>> > I agree. Would you like to go first ?
>>
>
> These are "things that might happen" use cases rather than "user stories,"
> but let's see where they take us:
>
> 1. User creates an instance, waits some period of time, then terminates it.
> - Vary the period of time to allow the events to both fall within the
> metering granularity window, to overlap an entire window, to start in one
> window and end in another.
> - The same variations for "billing cycle" instead of "metering
> granularity window."
> 2. User creates an instance, waits some period of time, then resizes it.
> - Vary the period of time as above.
> - Do we need variations for resizing up and down?
> 3. User creates an instance but it fails to create properly (provider
> issue).
> 4. User creates an instance but it fails to boot after creation (bad
> image).
> 5. User creates volume storage, adds it to an existing instance, waits a
> period of time, then deletes the volume.
> - Vary the period of time as above.
> 6. User creates volume storage, adds it to an existing instance, waits a
> period of time, then terminates the instance (I'm not sure what happens to
> the volume in that case, maybe it still exists?)
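The "vary the period of time" cases in the list above come down to splitting a resource's lifetime across fixed windows. A minimal sketch, assuming hourly windows aligned on an arbitrary origin; the function names and the dates are made up for illustration and nothing here is taken from the blueprint:

```python
from datetime import datetime, timedelta


def overlap_seconds(start, end, window_start, window_end):
    """Seconds of [start, end) that fall inside [window_start, window_end)."""
    latest_start = max(start, window_start)
    earliest_end = min(end, window_end)
    return max((earliest_end - latest_start).total_seconds(), 0.0)


def split_by_window(start, end, first_window_start, window):
    """Yield (window_start, seconds) for every window the interval touches."""
    # Jump to the window that contains `start`.
    index = (start - first_window_start) // window  # timedelta // timedelta -> int
    current = first_window_start + index * window
    while current < end:
        seconds = overlap_seconds(start, end, current, current + window)
        if seconds > 0:
            yield current, seconds
        current += window


hour = timedelta(hours=1)
origin = datetime(2012, 5, 1, 0, 0)
created = datetime(2012, 5, 1, 0, 45)      # instance created
terminated = datetime(2012, 5, 1, 2, 15)   # instance terminated
usage = list(split_by_window(created, terminated, origin, hour))
```

The same helper works for a billing cycle by passing a longer `window`; an instance that starts in one window and ends in another simply yields one entry per window.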
>
> A provider-related story might be:
>
> 1. As a provider, I can query the metering API to determine the activity
> for a tenant within a given period of time.
>
> Although that's pretty vague. :-)
>

I thought of another provider story:

2. As a provider, I can install a metering plugin to start collecting data
about events not handled by the core metering app.
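To make the plugin story a little more concrete, here is a rough sketch of what registering a collector for an extra event type could look like. The `register` and `dispatch` names, the event dictionary layout and the "dns.record.create" event type are all hypothetical; no such API exists in the blueprint yet.

```python
handlers = {}  # event type -> list of callables


def register(event_type):
    """Decorator that subscribes a function to one event type."""
    def decorator(func):
        handlers.setdefault(event_type, []).append(func)
        return func
    return decorator


def dispatch(event):
    """Pass an event to every plugin registered for its type.

    Returns the measurements produced; unknown event types yield nothing.
    """
    results = []
    for handler in handlers.get(event["event_type"], []):
        results.extend(handler(event))
    return results


@register("dns.record.create")  # made-up event type, not an OpenStack one
def count_dns_records(event):
    """Turn one event into a list of counter measurements."""
    return [{"counter": "dns.records",
             "resource_id": event["resource_id"],
             "value": 1}]


measurements = dispatch({"event_type": "dns.record.create", "resource_id": "r1"})
ignored = dispatch({"event_type": "something.else", "resource_id": "r2"})
```

The point of the sketch is only that the core app would not need to know about the new event type; the provider installs the handler and the rest of the pipeline stays the same.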


markmc at redhat

May 1, 2012, 9:13 AM

Post #14 of 38 (259 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Hi Loic,

On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:

> To prepare for the next meeting ( thursday 3rd, may 2012
> http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and
> reorganized the Metering blueprint so that it ( hopefully )
> incorporates all the information temporarily stored in the etherpad
> ( http://etherpad.openstack.org/EfficientMetering revision 67 in case
> it is vandalized ).

I'm a bit late to the discussion, but some brief comments after reading
up on what you guys have done so far:

- big +1 on separating billing from metering; there's no need to
conflate the two problems and doing it this way will allow for a
bunch of different ideas to be tried around billing

- I'd prefer to avoid adding new node agents, so +1 on building on
the notifications system

- I agree that we don't want to go too far with aggregation and lose
useful data like which instances have been running as opposed to
just how many instance minutes a given tenant has consumed

Another aspect of aggregation to think about is aggregation over
time - e.g. I might like to see how my hourly network usage has varied
over the last week, or how my daily usage has varied over the last
month, but I probably don't care so much about my hourly usage on a
specific day 3 months ago

oVirt's equivalent of a metering service does this kind of
aggregation as follows:

http://www.ovirt.org/wiki/Ovirt_DWH

* Sample data is collected at the end of every minute and is
kept for up to 48 hours.
* Hourly level is aggregated every hour for the hour before
last and is kept for 2 months.
* Daily level is aggregated every day for the day before last
and is kept for 5 years.

- Lastly, bikeshed mode - since we're calling this "metering" and not
"counting", how about just using the term "meters" rather than
"counters"?
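The tiered aggregation over time described for oVirt above could be sketched roughly as follows. The retention periods are the ones quoted from the oVirt page; everything else (summing values when rolling up, approximating "2 months" as 60 days, the function names) is an assumption for illustration and not a description of how oVirt actually implements it.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Retention periods quoted in the thread; "2 months" and "5 years" are
# approximated in days here.
RETENTION = {
    "sample": timedelta(hours=48),
    "hourly": timedelta(days=60),
    "daily": timedelta(days=5 * 365),
}


def roll_up(points, truncate):
    """Sum (timestamp, value) points into buckets chosen by `truncate`."""
    buckets = defaultdict(float)
    for timestamp, value in points:
        buckets[truncate(timestamp)] += value
    return sorted(buckets.items())


def to_hour(ts):
    return ts.replace(minute=0, second=0, microsecond=0)


def to_day(ts):
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def prune(points, level, now):
    """Drop points older than the retention period of their level."""
    cutoff = now - RETENTION[level]
    return [(ts, value) for ts, value in points if ts >= cutoff]


samples = [
    (datetime(2012, 5, 1, 10, 0), 1.0),
    (datetime(2012, 5, 1, 10, 30), 2.0),
    (datetime(2012, 5, 1, 11, 15), 4.0),
]
hourly = roll_up(samples, to_hour)
daily = roll_up(hourly, to_day)
recent = prune(samples, "sample", now=datetime(2012, 5, 3, 10, 30))
```

Each level is derived from the one below it, so the fine-grained samples can be discarded once the coarser level has been computed.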

Cheers,
Mark.




acs at parvuscaptus

May 1, 2012, 12:57 PM

Post #15 of 38 (256 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

I'm glad to see people championing the effort to implement metering. Is
there some way to refocus the enthusiasm for solving the metering problem
into engineering a general solution in OpenStack?

I'm just going to apologize in advance, but I don't think this project is
headed in the right direction.

I believe metering should be a first class concern of OpenStack and the way
this project is starting is almost exactly backwards from what I think a
solution to metering should look like.

The last thing I want to see right now is a blessed OpenStack metering
project adding more agents, coupled to a particular db and making policy
decisions about what is quantifiable.

I think there are really three problems that need to be solved to do
metering: what data to get, getting the data, and doing things with the data.

From my perspective, a lot if not all of the data events should be coming
out of the services themselves. There is already a service that should know
when an instance gets started by what tenant. A cross cutting system for
publishing those events and a service definition for collecting them seems
like a reasonable place to start. To me that should look an awful lot like a
message queue or centralized logging. Once the first two problems are
solved well, if you are so inclined to collect the data into a relational
model, the schema will be obvious.

If the first two problems are solved well, then I could be persuaded that a
service that provides some of the aggregation functionality is a great idea
and a reference implementation on a relational database isn't the worst
thing in the world.

Without a general solution for the first two problems, I believe the
primary focus on a schema and db is premature and sub-optimal. I also
believe the current approach likely results in a project that is generally
unusable.

Does anyone else share my perspective?

Maybe I'm the crazy one...

Andrew


loic at enovance

May 1, 2012, 1:41 PM

Post #16 of 38 (258 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/01/2012 05:49 PM, Doug Hellmann wrote:
>
>
> On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.barcet [at] canonical> wrote:
>
> On 05/01/2012 02:23 AM, Loic Dachary wrote:
> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
> >>
> >>
> >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic [at] enovance> wrote:
> >>
> >> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
> >>>
> >>>
> >>> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic [at] enovance> wrote:
> >>>
> >>> On 04/30/2012 03:49 PM, Doug Hellmann wrote:
> >>>>
> >>>>
> >>>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary
> >>>> <loic [at] enovance> wrote:
> >>>>
> >>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
> >>>> > We could start a discussion from the content of the
> >>>> following sections:
> >>>> >
> >>>> > http://wiki.openstack.org/EfficientMetering#Counters
> >>>> I think the rationale of the counter aggregation needs
> >>>> to be explained. My understanding is that the metering
> >>>> system will be able to deliver the following
> >>>> information: 10 floating IPv4 addresses were allocated
> >>>> to the tenant during three months and were leased from
> >>>> provider NNN. From this, the billing system could add a
> >>>> line to the invoice : 10 IPv4, $N each = $10xN because
> >>>> it has been configured to invoice each IPv4 leased from
> >>>> provider NNN for $N.
> >>>>
> >>>> It is not the purpose of the metering system to display
> >>>> each IPv4 used, therefore it only exposes the aggregated
> >>>> information. The counters define how the information
> >>>> should be aggregated. If the idea was to expose each
> >>>> resource usage individually, defining counters would be
> >>>> meaningless as they would duplicate the activity log
> >>>> from each OpenStack component.
> >>>>
> >>>> What do you think ?
> >>>>
> >>>>
> >>>> At DreamHost we are going to want to show each individual
> >>>> resource (the IPv4 address, the instance, etc.) along with
> >>>> the charge information. Having the metering system aggregate
> >>>> that data will make it difficult/impossible to present the
> >>>> bill summary and detail views that we want. It would be much
> >>>> more useful for us if it tracked the usage details for each
> >>>> resource, and let us aggregate the data ourselves.
> >>>>
> >>>> If other vendors want to show the data differently, perhaps
> >>>> we should provide separate APIs for retrieving the detailed
> >>>> and aggregate data.
> >>>>
> >>>> Doug
> >>>>
> >>> Hi,
> >>>
> >>> For the record, here is the unfinished conversation we had on IRC
> >>>
> >>> (04:29:06 PM) dhellmann: dachary, did you see my reply about
> >>> counter definitions on the list today?
> >>> (04:39:05 PM) dachary: It means some counters must not be
> >>> aggregated. Only the amount associated with it is but there
> >>> is one counter per IP.
> >>> (04:55:01 PM) dachary: dhellmann: what about this :the id of
> >>> the ressource controls the agregation of all counters : if it
> >>> is missing, all resources of the same kind and their measures
> >>> are aggregated. Otherwise only the measures are agreggated.
> >>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
> >>> (04:55:58 PM) dachary: it makes me a little unconfortable to
> >>> define such an "ad-hoc" grouping
> >>> (04:56:53 PM) dachary: i.e. you actuall control the
> >>> aggregation by chosing which value to put in the id column
> >>> (04:58:43 PM) dachary: s/actuall/actually/
> >>> (05:05:38 PM) ***dachary reading
> >>> http://www.ogf.org/documents/GFD.98.pdf
> >>> (05:05:54 PM) dachary: I feel like we're trying to resolve a
> >>> non problem here
> >>> (05:08:42 PM) dachary: values need to be aggregated. The raw
> >>> input is a full description of the resource and a value (
> >>> gauge ). The question is how to control the aggregation in a
> >>> reasonably flexible way.
> >>> (05:11:34 PM) dachary: The definition of a counter could
> >>> probably be described as : the id of a resource and code to
> >>> fill each column associated with it.
> >>>
> >>> I tried to append the following, but the wiki kept failing.
> >>>
> >>> Propose that the counters are defined by a function instead
> >>> of being fixed. That helps addressing the issue of
> >>> aggregating the bandwidth associated to a given IP into a
> >>> single counter.
> >>>
> >>> Alternate idea :
> >>> * a counter is defined by
> >>> * a name ( o1, n2, etc. ) that uniquely identifies the
> >>> nature of the measure ( outbound internet transit, amount of
> >>> RAM, etc. )
> >>> * the component in which it can be found ( nova, swift etc.)
> >>> * and by columns, each one is set with the result of
> >>> aggregate(find(record),record) where
> >>> * find() looks for the existing column as found by
> >>> selecting with the unique key ( maybe the name and the
> >>> resource id )
> >>> * record is a detailed description of the metering event to
> >>> be aggregated (
> >>> http://wiki.openstack.org/SystemUsageData#compute.instance.exists:
> >>> )
> >>> * the aggregate() function returns the updated row. By
> >>> default it just += the counter value with the old row
> >>> returned by find()
> >>>
> >>>
> >>> Would we want aggregation to occur within the database where we
> >>> are collecting events, or should that move somewhere else?
> >> I assume the events collected by the metering agents will all be
> >> archived for auditing (or re-building the database)
> >> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
> >>
> >> Therefore the aggregation should occur when the database is
> >> updated to account for a new event.
> >>
> >> Does this make sense ? I may have misunderstood part of your question.
> >>
> >>
> >> I guess what I don't understand is why the aggregated data is written
> >> back to the metering database at all. If it's in the same database, it
> >> seems like it should be in a different "table" (or equivalent) so the
> >> original data is left alone.
> > In my view the events are not stored in a database, they are merely
> > appended to a log file. The database is built from the events with
> > aggregated data. I now understand that you (and Joshua Harlow) think
> > it's better to not aggregate the data and let the billing system do this
> > job.
>
> My intent when writing the blueprint was that each event would be
> recorded atomically in the database, as it is the only way to control
> that we have not missed any. Aggregation should be done at the external
> API level if the request is to get the sum of a given counter.
>
>
> That matches what I was thinking. The "log file" that Loic mentioned would in fact be a database that can handle a lot of writes. We could use some sort of simple file format, but since we're going to have to read and parse the log anyway, we might as well use a tool that makes that easy.
>
> Aggregation could happen either in a metering API based on the query, or an external app could retrieve a large dataset and manage the aggregation itself.
>
>
> What I missed in the blueprint and seems to be appearing clearly now, is
> that an event needs to be able to carry the "object-reference" for which
> it was collected, and this would seem highly necessary looking at the
> messages in this thread. A metering event would essentially be defined
> by (who, what, which) instead of a simple (who, what). As a consequence
> we would need to extend the DB schema to add this [which/object
> reference], and make sure that we carry it as well when we will work on
> the message API format definition.
>
> How does this sound?
>
>
> I think so. A lot of these sorts of issues can probably be fixed by being careful about how we define the measurements. For example, I may want to be able to show a customer the network bandwidth used per server, not just per network. If we measure the bandwidth consumed by each VIF, the aggregation code can take care of summarizing by network (because we know where the VIF is) and/or server (because we know which server has the VIF).
>
> We may need to record more detail than a simple "which," though, because it may be possible to change some information relevant for calculating the billing rate later. For example, a tenant can resize an instance, which would usually cause a change in the billing rate. Some of the relationships might change, too (Is it possible to move a VIF between networks?).
>
> At first I thought this might require separate table definitions per resource type (instance, network, etc.) but re-reading the table of counters in EfficientMetering I guess this is handled by measuring things like CPU, RAM, and block storage as separate counters? So a single event for creating a new instance might result in several records being written to the database, with the "which" set to the instance identifier. The data could then be presented as a unified "resource usage" report for that server.
>
> I think that works, but it may make the job of calculating the bill harder. We are planning to follow the model of specifying rates per size, so we would have to figure out which combination of CPU, RAM, and root volume storage matches up with a given size to determine the rate.
The counters and storage description currently in the blueprint are easily extensible. Adding a new counter does not require a modification to a database format. This is good. But this simplicity makes it more difficult to link the counters together : it would be much easier if the values related to an instance were in a table with the instance id as a key.
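To illustrate the trade-off, here is a small sketch of flat (who, what, which) counter rows being regrouped per resource at query time. The counter names, tenant name and values are invented for the example; only the instance id format is borrowed from this thread.

```python
from collections import defaultdict

# (who, what, which, value): tenant, counter name, resource id, measurement.
# Counter names and values are illustrative only.
events = [
    ("tenant-a", "cpu", "84f84ea84", 2),
    ("tenant-a", "ram_mb", "84f84ea84", 4096),
    ("tenant-a", "disk_gb", "84f84ea84", 20),
    ("tenant-a", "cpu", "17c3b2d90", 1),
]


def usage_by_resource(rows):
    """Group flat counter rows into one dict of counters per resource id."""
    report = defaultdict(dict)
    for who, what, which, value in rows:
        counters = report[(who, which)]
        counters[what] = counters.get(what, 0) + value
    return dict(report)


def total(rows, what):
    """Sum one counter across all resources, whatever their type."""
    return sum(value for _, name, _, value in rows if name == what)


per_instance = usage_by_resource(events)
cpu_total = total(events, "cpu")
```

The generic `total()` stays trivial because every row has the same shape, while the per-instance view has to be rebuilt by grouping on the resource id, which is the extra work mentioned above.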

The approach of http://wiki.openstack.org/SystemUsageData is to fully describe the instance on each http://wiki.openstack.org/SystemUsageData#compute.instance.exists: event and they could be stored in a database table, almost as described. Aggregating the content of the table would then be more difficult because the structure of the table is specific to the resource and the sum() API function would need to be implemented for each resource type instead of relying on the unified format being presented in the blueprint.

To keep the simplicity of the counters descriptions, we could add a description of the resource to the database. The instance ID 84f84ea84 could be described as name=myinstance, flavor=m1.large etc. This instance description would be valid for a period of time, starting when the instance was created or when it is modified (name, flavor etc.).
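A minimal sketch of such a time-bounded resource description, assuming each change (creation, rename, resize) appends a new entry with the time from which it is valid. The `descriptions` structure, the integer timestamps and the m1.small flavor are assumptions for illustration:

```python
from bisect import bisect_right

# resource id -> list of (valid_from, description), sorted by valid_from.
# Timestamps are plain integers here to keep the sketch short.
descriptions = {
    "84f84ea84": [
        (0, {"name": "myinstance", "flavor": "m1.small"}),
        (100, {"name": "myinstance", "flavor": "m1.large"}),  # after a resize
    ],
}


def describe(resource_id, timestamp):
    """Return the description in force at `timestamp`, or None."""
    history = descriptions.get(resource_id, [])
    starts = [valid_from for valid_from, _ in history]
    index = bisect_right(starts, timestamp) - 1
    return history[index][1] if index >= 0 else None


before_resize = describe("84f84ea84", 50)
after_resize = describe("84f84ea84", 100)
```

A billing system could then join each counter value with the description valid at the time of the measurement, for instance to pick the rate matching the flavor.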

Yet another approach would be to use the Usage Record Format Recommendation from http://www.ogf.org/documents/GFD.98.pdf . The messages collected from nova, swift etc. would be translated into this format. The implementation may not be too complex if it translates well into a document (as defined by MongoDB) or a JSON object. The resulting structure would be more complex than the current counter definition, but the structure definition would probably vary less over time because it is more mature than any structure we could come up with if we start thinking about it just today. However, I'm not sure that it matches our requirements because it is written in the context of a grid (i.e. it's more about distributed computing than cloud). We could, for instance, ignore the parts related to "jobs". And we could also advocate for the addition of a "Differentiated Property" ( chapter 12. ) to account for Disk I/O in addition to Disk Usage.

>
> Another piece I've been thinking about is handling boundary conditions when resource create and delete events don't both fall inside a billing cycle (or within the granularity of the metering system). That shouldn't be part of logging the events, necessarily, but it could be a reusable component that feeds into producing the aggregated data (either through the API, or as a way of processing the results returned by the API).
Could you explain more about this ? I'm assuming you refer to, for instance, the following situation:

a) T : Public IP is allocated
b) T+10 : event reports 100 bytes sent since T
c) T+20 : event reports 500 bytes sent since T
d) T+30 : end billing cycle
e) T+40 : event reports 1000 bytes sent since T

When the billing cycle ends, we should charge 50% of the 500 bytes transferred between T+20 and T+40.
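One possible reading of this example, assuming the traffic is spread evenly between two consecutive reports (the thread does not settle this), is a linear interpolation of the cumulative counter at the end of the billing cycle. The `value_at` helper below is hypothetical:

```python
def value_at(reports, when):
    """Linearly interpolate a cumulative counter at time `when`.

    `reports` is a list of (time, cumulative_value) pairs sorted by time.
    """
    for (t0, v0), (t1, v1) in zip(reports, reports[1:]):
        if t0 <= when <= t1:
            fraction = (when - t0) / (t1 - t0)
            return v0 + fraction * (v1 - v0)
    raise ValueError("time is outside the reported range")


# Offsets from T; the allocation at T counts as a report of 0 bytes.
reports = [(0, 0), (10, 100), (20, 500), (40, 1000)]
charged = value_at(reports, 30)
```

Under that assumption, half of the bytes sent between T+20 and T+40 are attributed to the cycle ending at T+30, on top of the 500 bytes already reported at T+20; the other half goes to the next cycle.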

Your answer will allow me to expand on the "things that might happen" below.

Cheers :-)
>
> >> Maybe it's time to start focusing these discussions on user stories?
> >>
> > I agree. Would you like to go first ?
>
>
> These are "things that might happen" use cases rather than "user stories," but let's see where they take us:
>
> 1. User creates an instance, waits some period of time, then terminates it.
> - Vary the period of time to allow the events to both fall within the metering granularity window, to overlap an entire window, to start in one window and end in another.
> - The same variations for "billing cycle" instead of "metering granularity window."
> 2. User creates an instance, waits some period of time, then resizes it.
> - Vary the period of time as above.
> - Do we need variations for resizing up and down?
> 3. User creates an instance but it fails to create properly (provider issue).
> 4. User creates an instance but it fails to boot after creation (bad image).
> 5. User creates volume storage, adds it to an existing instance, waits a period of time, then deletes the volume.
> - Vary the period of time as above.
> 6. User creates volume storage, adds it to an existing instance, waits a period of time, then terminates the instance (I'm not sure what happens to the volume in that case, maybe it still exists?)
>
> A provider-related story might be:
>
> 1. As a provider, I can query the metering API to determine the activity for a tenant within a given period of time.
>
> Although that's pretty vague. :-)
>
> Doug
>
>
>


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


loic at enovance

May 1, 2012, 2:05 PM

Post #17 of 38 (253 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
> Hi Loic,
>
> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>
>> To prepare for the next meeting ( thursday 3rd, may 2012
>> http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and
>> reorganized the Metering blueprint so that it ( hopefully )
>> incorporates all the information temporarily stored in the etherpad
>> ( http://etherpad.openstack.org/EfficientMetering revision 67 in case
>> it is vandalized ).
> I'm a bit late to the discussion, but some brief comments after reading
> up on what you guys have done so far:
>
> - big +1 on separating billing from metering; there's no need to
> conflate the two problems and doing it this way will allow for a
> bunch of different ideas to be tried around billing
>
> - I'd prefer to avoid adding new node agents, so +1 on building on
> the notifications system
I would also prefer this option. I have a few concerns though:

a) adding too many messages to the existing message queues
b) not all core components provide notifications
c) convincing all components to agree on a unified approach to metering

Instead it might be more practical to implement node agents when necessary to complete a first implementation. That is, taking advice from core component developers and possibly running into problems, as opposed to convincing core component developers to adopt an approach to metering that is not yet implemented anywhere.
>
> - I agree that we don't want to go too far with aggregation and lose
> useful data like which instances have been running as opposed to
> just how many instance minutes a given tenant has consumed
>
> Another aspect of aggregation to think about is aggregation over
> time - e.g. I might like to see how my hourly network usage has varied
> over the last week, or how my daily usage has varied over the last
> month, but I probably don't care so much about my hourly usage on a
> specific day 3 months ago
>
> oVirt's equivalent of a metering service does this kind of
> aggregation as follows:
>
> http://www.ovirt.org/wiki/Ovirt_DWH
>
> * Sample data is collected at the end of every minute and is
> kept for up to 48 hours.
> * Hourly level is aggregated every hour for the hour before
> last and is kept for 2 months.
> * Daily level is aggregated every day for the day before last
> and is kept for 5 years.
Where can I read a description of the corresponding database ?
>
> - Lastly, bikeshed mode - since we're calling this "metering" and not
> "counting", how about just using the term "meters" rather than
> "counters"?
>
+1 ;-)

Cheers

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




markmc at redhat

May 1, 2012, 10:19 PM

Post #18 of 38 (256 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Hey,

On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
> > Hi Loic,
> >
> > On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
> > - I agree that we don't want to go too far with aggregation and lose
> > useful data like which instances have been running as opposed to
> > just how many instance minutes a given tenant has consumed
> >
> > Another aspect of aggregation to think about is aggregation over
> > time - e.g. I might like to see how my hourly network usage has varied
> > over the last week, or how my daily usage has varied over the last
> > month, but I probably don't care so much about my hourly usage on a
> > specific day 3 months ago
> >
> > oVirt's equivalent of a metering service does this kind of
> > aggregation as follows:
> >
> > http://www.ovirt.org/wiki/Ovirt_DWH
> >
> > * Sample data is collected at the end of every minute and is
> > kept for up to 48 hours.
> > * Hourly level is aggregated every hour for the hour before
> > last and is kept for 2 months.
> > * Daily level is aggregated every day for the day before last
> > and is kept for 5 years.
> Where can I read a description of the corresponding database ?

Here FWIW: http://goo.gl/3Bqct
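
As a rough illustration of the aggregation levels quoted above (this is not
oVirt's actual implementation), the hourly rollup and the 48-hour retention
of raw samples could be sketched in Python as follows. The function names
and the choice of summing values are assumptions; a gauge-style meter would
more likely want an average or a maximum.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def rollup_hourly(samples):
    """Sum (timestamp, value) samples into one total per hour.

    Summing is an assumption that suits delta-style meters only.
    """
    totals = defaultdict(int)
    for ts, value in samples:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        totals[hour_start] += value
    return dict(totals)


def prune(samples, now, keep=timedelta(hours=48)):
    """Drop raw samples older than the retention window (48 hours here)."""
    return [(ts, value) for ts, value in samples if now - ts <= keep]
```

The same pattern would apply one level up: roll hourly totals into daily
totals and prune the hourly level after its own retention period.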

Cheers,
Mark.




markmc at redhat

May 1, 2012, 10:39 PM

Post #19 of 38 (253 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
> > Hi Loic,
> >
> > On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
> >
> >> To prepare for the next meeting ( thursday 3rd, may 2012
> >> http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and
> >> reorganized the Metering blueprint so that it ( hopefully )
> >> incorporates all the information temporarily stored in the etherpad
> >> ( http://etherpad.openstack.org/EfficientMetering revision 67 in case
> >> it is vandalized ).
> > I'm a bit late to the discussion, but some brief comments after reading
> > up on what you guys have done so far:
> >
> > - big +1 on separating billing from metering; there's no need to
> > conflate the two problems and doing it this way will allow for a
> > bunch of different ideas to be tried around billing
> >
> > - I'd prefer to avoid adding a new node agents, so +1 on building on
> > the notifications system
> I would also prefer this option. I have a few concerns though:
>
> a) adding too many messages to the existing message queues
> b) not all core components provide notifications
> c) convincing all components to agree on a unified approach to metering
>
> Instead it might be more practical to implement node agents when
> necessary to complete a first implementation. That is, taking advice
> from core component developers and possibly run into problems as
> opposed to convincing core component developers to adopt an approach
> to metering that is not yet implemented anywhere.

I'd start with metering using the notifications which are already there.
I think that will get us a long way.

My impression is that the notifications system is intended to cover all
billable usage in at least Nova and Glance.

Cheers,
Mark.




loic at enovance

May 2, 2012, 12:27 AM

Post #20 of 38 (256 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/02/2012 07:19 AM, Mark McLoughlin wrote:
> Hey,
>
> On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
>> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
>>> Hi Loic,
>>>
>>> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>>> - I agree that we don't want to go too far with aggregation and lose
>>> useful data like which instances have been running as opposed to
>>> just how many instance minutes a given tenant has consumed
>>>
>>> Another aspect of aggregation to think about is aggregation over
>>> time - e.g. I might like to see my hourly network usage has varied
>>> over the last week, or how my daily usage has varied over the last
>>> month, but I probably don't care so much about my hourly usage on a
>>> specific day 3 months ago
>>>
>>> oVirt's equivalent of a metering service does this kind of
>>> aggregation as follows:
>>>
>>> http://www.ovirt.org/wiki/Ovirt_DWH
>>>
>>> * Sample data is collected at the end of every minute and is
>>> kept for up to 48 hours.
>>> * Hourly level is aggregated every hour for the hour before
>>> last and is kept for 2 months.
>>> * Daily level is aggregated every day for the day before last
>>> and is kept for 5 years.
>> Where can I read a description of the corresponding database ?
> Here FWIW: http://goo.gl/3Bqct
Thanks : http://wiki.openstack.org/EfficientMetering?action=diff&rev2=49&rev1=48
>
> Cheers,
> Mark.
>


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




loic at enovance

May 2, 2012, 1:08 AM

Post #21 of 38 (255 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/02/2012 07:39 AM, Mark McLoughlin wrote:
> On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
>> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
>>> Hi Loic,
>>>
>>> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>>>
>>>> To prepare for the next meeting ( thursday 3rd, may 2012
>>>> http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and
>>>> reorganized the Metering blueprint so that it ( hopefully )
>>>> incorporates all the information temporarily stored in the etherpad
>>>> ( http://etherpad.openstack.org/EfficientMetering revision 67 in case
>>>> it is vandalized ).
>>> I'm a bit late to the discussion, but some brief comments after reading
>>> up on what you guys have done so far:
>>>
>>> - big +1 on separating billing from metering; there's no need to
>>> conflate the two problems and doing it this way will allow for a
>>> bunch of different ideas to be tried around billing
>>>
>>> - I'd prefer to avoid adding a new node agents, so +1 on building on
>>> the notifications system
>> I would also prefer this option. I have a few concerns though:
>>
>> a) adding too many messages to the existing message queues
>> b) not all core components provide notifications
>> c) convincing all components to agree on a unified approach to metering
>>
>> Instead it might be more practical to implement node agents when
>> necessary to complete a first implementation. That is, taking advice
>> from core component developers and possibly run into problems as
>> opposed to convincing core component developers to adopt an approach
>> to metering that is not yet implemented anywhere.
> I'd start with metering using the notifications which are already there.
> I think that will get us a long way.
I've started a thread to check that the existing notifications provide everything we need and, if not, to figure out how they can be modified.
>
> My impression is that the notifications system is intended to cover all
> billable usage in at least Nova and Glance.
It's also my understanding. Regarding swift, how would you suggest we approach the problem ? I see two possible courses:

a) directly create something similar to nova http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift blueprint and coding in swift ) so that there is no need to install a metering agent for swift
b) create a swift plugin for a metering agent and when it proves useful, port it to swift so that it is integrated and there is no longer a need for a metering agent plugin

What do you think ?

--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




markmc at redhat

May 2, 2012, 1:19 AM

Post #22 of 38 (257 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Wed, 2012-05-02 at 10:08 +0200, Loic Dachary wrote:
> > My impression is that the notifications system is intended to cover
> all
> > billable usage in at least Nova and Glance.
> It's also my understanding. Regarding swift, how would you suggest we
> approach the problem ? I see two possible courses:
>
> a) directly create something similar to nova
> http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift
> blueprint and coding in swift ) so that there is no need to install a
> metering agent for swift
> b) create a swift plugin for a metering agent and when it proves
> useful, port it to swift so that it is integrated and there is no
> longer a need for a metering agent plugin
>
> What do you think ?

I've no informed opinion on Swift, but I assume Swift is amenable to
work which helps with metering its resources.

Cheers,
Mark.




david.kranz at qrclab

May 2, 2012, 5:13 AM

Post #23 of 38 (255 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

There was a swift talk at the design summit that is related to (a):
http://etherpad.openstack.org/FolsomSwiftStatsd.
There is a good summary in the referenced blog post:
http://www.swiftstack.com/blog/2012/04/11/swift-monitoring-with-statsd/

-David

On 5/2/2012 4:19 AM, Mark McLoughlin wrote:
> On Wed, 2012-05-02 at 10:08 +0200, Loic Dachary wrote:
>>> My impression is that the notifications system is intended to cover
>> all
>>> billable usage in at least Nova and Glance.
>> It's also my understanding. Regarding swift, how would you suggest we
>> approach the problem ? I see two possible courses:
>>
>> a) directly create something similar to nova
>> http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift
>> blueprint and coding in swift ) so that there is no need to install a
>> metering agent for swift
>> b) create a swift plugin for a metering agent and when it proves
>> useful, port it to swift so that it is integrated and there is no
>> longer a need for a metering agent plugin
>>
>> What do you think ?
> I've no informed opinion on Swift, but I assume Swift is amenable to
> work which helps with metering its resources
>
> Cheers,
> Mark.
>
>




Whit.Turner at hp

May 3, 2012, 10:27 AM

Post #24 of 38 (242 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion.



Regards



Whit



From: openstack-bounces+whit.turner=hp.com [at] lists [mailto:openstack-bounces+whit.turner=hp.com [at] lists] On Behalf Of Doug Hellmann
Sent: Monday, April 30, 2012 2:03 PM
To: Loic Dachary
Cc: openstack [at] lists
Subject: Re: [Openstack] [Metering] schema and counter definitions





On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic [at] enovance> wrote:

On 04/30/2012 03:49 PM, Doug Hellmann wrote:



On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary <loic [at] enovance> wrote:

On 04/30/2012 12:15 PM, Loic Dachary wrote:
> We could start a discussion from the content of the following sections:
>
> http://wiki.openstack.org/EfficientMetering#Counters

I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver the following information: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice : 10 IPv4, $N each = $10xN because it has been configured to invoice each IPv4 leased from provider NNN for $N.

It is not the purpose of the metering system to display each IPv4 used, therefore it only exposes the aggregated information. The counters define how the information should be aggregated. If the idea was to expose each resource usage individually, defining counters would be meaningless as they would duplicate the activity log from each OpenStack component.

What do you think ?



At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.



If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.



Doug



Hi,

For the record, here is the unfinished conversation we had on IRC

(04:29:06 PM) dhellmann: dachary, did you see my reply about counter definitions on the list today?
(04:39:05 PM) dachary: It means some counters must not be aggregated. Only the amount associated with it is but there is one counter per IP.
(04:55:01 PM) dachary: dhellmann: what about this: the id of the resource controls the aggregation of all counters: if it is missing, all resources of the same kind and their measures are aggregated. Otherwise only the measures are aggregated. http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
(04:55:58 PM) dachary: it makes me a little uncomfortable to define such an "ad-hoc" grouping
(04:56:53 PM) dachary: i.e. you actuall control the aggregation by chosing which value to put in the id column
(04:58:43 PM) dachary: s/actuall/actually/
(05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
(05:05:54 PM) dachary: I feel like we're trying to resolve a non problem here
(05:08:42 PM) dachary: values need to be aggregated. The raw input is a full description of the resource and a value ( gauge ). The question is how to control the aggregation in a reasonably flexible way.
(05:11:34 PM) dachary: The definition of a counter could probably be described as : the id of a resource and code to fill each column associated with it.

I tried to append the following, but the wiki kept failing.

I propose that counters be defined by a function instead of being fixed. That helps address the issue of aggregating the bandwidth associated with a given IP into a single counter.

Alternate idea:
* a counter is defined by
  * a name ( o1, n2, etc. ) that uniquely identifies the nature of the measure ( outbound internet transit, amount of RAM, etc. )
  * the component in which it can be found ( nova, swift, etc. )
  * and by columns, each one set to the result of aggregate(find(record), record) where
    * find() looks up the existing row by selecting on the unique key ( maybe the name and the resource id )
    * record is a detailed description of the metering event to be aggregated ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
    * the aggregate() function returns the updated row. By default it just adds ( += ) the counter value from the record to the old row returned by find()
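
To make the aggregate(find(record), record) idea above more concrete, here is
a minimal Python sketch. Everything in it is hypothetical: the Record fields,
the in-memory dict standing in for the storage table, and the choice of
( name, resource id ) as the unique key are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical shape of a metering event; field names are illustrative only.
Record = namedtuple("Record", ["name", "component", "resource_id", "value"])

rows = {}  # in-memory stand-in for the storage table, keyed by the unique key


def key_for(record):
    """Unique key: counter name plus resource id (None groups all resources)."""
    return (record.name, record.resource_id)


def find(record):
    """Return the existing row for this record, or None if there is none."""
    return rows.get(key_for(record))


def aggregate(old_row, record):
    """Default aggregation: add the record's value to the old row's total."""
    total = (old_row["value"] if old_row else 0) + record.value
    return {"name": record.name, "component": record.component,
            "resource_id": record.resource_id, "value": total}


def store(record):
    """Apply aggregate(find(record), record) and save the updated row."""
    rows[key_for(record)] = aggregate(find(record), record)
    return rows[key_for(record)]
```

With this shape, putting a resource id in the record keeps one row per
resource, while leaving it as None folds all resources of the same kind into
a single row, which is the behaviour discussed on IRC above.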



Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?





Cheers



--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82


robert.collins at canonical

May 3, 2012, 1:42 PM

Post #25 of 38 (244 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services)
<Whit.Turner [at] hp> wrote:
> Hi - I think a flexible aggregation scheme is needed; the levels of
> aggregation available should be definable in the meter independent of the
> sources of usage data themselves. If invoices need to be very granular down
> to the lowest possible level, then this drives higher data requirements all
> through the processing chain, including the rating engine. Traditional
> systems tend to pass less granular (more highly aggregated) data into the
> rating engine so that bill runs and invoices can be generated efficiently.
> At cloud-scale, this can be problematic. Given some big data approaches,
> though, this could be handled in a more granular and real-time fashion.

Has anyone looked at what statsd does? It has very similar
requirements (simple to use, no hard a-priori definition of things to
count, a few base types to track), and needs to be horizontally
scalable.

We could, as a riff on my prior email, define the statsd (or a similar
thing) as a common substrate, and then let different implementations
discard detail, or preserve it as needed. The key difference I see vs
defining a Python API is that if someone is writing a different
language implementation of an Openstack component, they would have a
common thing to target.

OTOH it should be trivial to write a network component that thunks
*into* the stock Python API, and from there to the configured backend,
so there is no need to pick any specific network protocol up front -
though bearing in mind that we want network handoffs is probably a
good thing when looking at the nitty gritty.
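
For readers who have not looked at statsd, a counter increment on the wire is
just a small text datagram of the form "bucket:value|c". The Python sketch
below shows that format; the bucket name is made up, and the host and port
( 8125 is the usual statsd default ) are assumptions about a local test setup.

```python
import socket


def format_counter(bucket, value):
    """Build a statsd counter datagram such as b'nova.instance.create:1|c'."""
    return "{0}:{1}|c".format(bucket, value).encode("ascii")


def send_counter(bucket, value, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send: there is no acknowledgement or retry."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(bucket, value), (host, port))
    finally:
        sock.close()
```

Because the sender never learns whether the datagram arrived, anything built
on this substrate has to tolerate lost samples.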

-Rob



thierry at openstack

May 4, 2012, 2:50 AM

Post #26 of 38 (182 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Robert Collins wrote:
> On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services)
> <Whit.Turner [at] hp> wrote:
>> Hi - I think a flexible aggregation scheme is needed; the levels of
>> aggregation available should be definable in the meter independent of the
>> sources of usage data themselves. If invoices need to be very granular down
>> to the lowest possible level, then this drives higher data requirements all
>> through the processing chain, including the rating engine. Traditional
>> systems tend to pass less granular (more highly aggregated) data into the
>> rating engine so that bill runs and invoices can be generated efficiently.
>> At cloud-scale, this can be problematic. Given some big data approaches,
>> though, this could be handled in a more granular and real-time fashion.
>
> Has anyone looked at what statsd does? It has very similar
> requirements (simple to use, no hard a-priori definition of things to
> count, a few base types to track), and needs to be horizontally
> scalable.

Also Swift has plans to use statsd for instrumentation/monitoring, so
it's definitely worth a look to see if it could be used here as well.

http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3
http://etherpad.openstack.org/FolsomSwiftStatsd

--
Thierry Carrez (ttx)
Release Manager, OpenStack



nick.barcet at canonical

May 4, 2012, 10:43 AM

Post #27 of 38 (185 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

-----Original Message-----
From: Thierry Carrez <thierry [at] openstack>
To: openstack [at] lists
Sent: Fri, 04 May 2012 2:50
Subject: Re: [Openstack] [Metering] schema and counter definitions

Robert Collins wrote:
> On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services)
> <Whit.Turner [at] hp> wrote:
>> Hi - I think a flexible aggregation scheme is needed; the levels of
>> aggregation available should be definable in the meter independent of the
>> sources of usage data themselves. If invoices need to be very granular down
>> to the lowest possible level, then this drives higher data requirements all
>> through the processing chain, including the rating engine. Traditional
>> systems tend to pass less granular (more highly aggregated) data into the
>> rating engine so that bill runs and invoices can be generated efficiently.
>> At cloud-scale, this can be problematic. Given some “big data” approaches,
>> though, this could be handled in a more granular and real-time fashion.
>
> Has anyone looked at what statsd does? It has very similar
> requirements (simple to use, no hard a-priori definition of things to
> count, a few base types to track), and needs to be horizontally
> scalable.

Also Swift has plans to use statsd for instrumentation/monitoring, so
it's definitely worth a look to see if it could be used here as well.

http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3
http://etherpad.openstack.org/FolsomSwiftStatsd

--
Thierry Carrez (ttx)
Release Manager, OpenStack



nick.barcet at canonical

May 4, 2012, 12:11 PM

Post #28 of 38 (185 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/04/2012 02:50 AM, Thierry Carrez wrote:
> Robert Collins wrote:
>> On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services)
>> <Whit.Turner [at] hp> wrote:
>>> Hi - I think a flexible aggregation scheme is needed; the levels of
>>> aggregation available should be definable in the meter independent of the
>>> sources of usage data themselves. If invoices need to be very granular down
>>> to the lowest possible level, then this drives higher data requirements all
>>> through the processing chain, including the rating engine. Traditional
>>> systems tend to pass less granular (more highly aggregated) data into the
>>> rating engine so that bill runs and invoices can be generated efficiently.
>>> At cloud-scale, this can be problematic. Given some big data approaches,
>>> though, this could be handled in a more granular and real-time fashion.
>>
>> Has anyone looked at what statsd does? It has very similar
>> requirements (simple to use, no hard a-priori definition of things to
>> count, a few base types to track), and needs to be horizontally
>> scalable.
>
> Also Swift has plans to use statsd for instrumentation/monitoring, so
> it's definitely worth a look to see if it could be used here as well.
>
> http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3
> http://etherpad.openstack.org/FolsomSwiftStatsd

I am no statsd expert, but a quick look at the project shows that it is
aimed at data collection for monitoring, and uses UDP as a way to
aggregate vast quantities of data at short intervals. The use of UDP
implies that delivery is not guaranteed, which is fine for the
objectives of monitoring, but conflicts with the requirements of
metering (as a sub-component of a billing system).

statsd does not seem to allow for message signatures or authentication
of collectors either.

Here are the requirements I think we have:
* The data is sent from agents to the storage daemon via a trusted
messaging system
* The messages in queue are signed and non repudiable
* The agents collecting data are authenticated to avoid pollution of
the metering service
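
As a sketch of what signed messages from authenticated agents could look
like, the Python below signs each message with a per-agent shared secret using
HMAC-SHA256. The function names, the message layout and the per-agent secret
table are all assumptions for illustration. Note that a shared-secret HMAC
authenticates the sender and protects integrity but does not by itself give
non-repudiation; that would need asymmetric signatures.

```python
import hashlib
import hmac
import json


def _digest(secret, agent_id, body):
    """HMAC-SHA256 over the agent id and the serialized body."""
    data = (agent_id + "\n" + body).encode("utf-8")
    return hmac.new(secret, data, hashlib.sha256).hexdigest()


def sign_message(payload, agent_id, secret):
    """Serialize the payload and attach the agent id and signature."""
    body = json.dumps(payload, sort_keys=True)
    return {"agent_id": agent_id, "body": body,
            "signature": _digest(secret, agent_id, body)}


def verify_message(message, secrets):
    """Accept only messages from known agents with a valid signature."""
    secret = secrets.get(message["agent_id"])
    if secret is None:
        return False  # unknown agent: reject to avoid polluting the store
    expected = _digest(secret, message["agent_id"], message["body"])
    return hmac.compare_digest(expected, message["signature"])
```

The storage daemon would call verify_message() on everything it takes off the
queue and drop whatever fails the check.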

Nick


loic at enovance

May 4, 2012, 3:17 PM

Post #29 of 38 (185 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/04/2012 11:50 AM, Thierry Carrez wrote:
> Robert Collins wrote:
>> On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services)
>> <Whit.Turner [at] hp> wrote:
>>> Hi - I think a flexible aggregation scheme is needed; the levels of
>>> aggregation available should be definable in the meter independent of the
>>> sources of usage data themselves. If invoices need to be very granular down
>>> to the lowest possible level, then this drives higher data requirements all
>>> through the processing chain, including the rating engine. Traditional
>>> systems tend to pass less granular (more highly aggregated) data into the
>>> rating engine so that bill runs and invoices can be generated efficiently.
>>> At cloud-scale, this can be problematic. Given some “big data” approaches,
>>> though, this could be handled in a more granular and real-time fashion.
>> Has anyone looked at what statsd does? It has very similar
>> requirements (simple to use, no hard a-priori definition of things to
>> count, a few base types to track), and needs to be horizontally
>> scalable.
> Also Swift has plans to use statsd for instrumentation/monitoring, so
> it's definitely worth a look to see if it could be used here as well.
>
> http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3
> http://etherpad.openstack.org/FolsomSwiftStatsd
>
Thanks :-) Just saved the etherpad as
http://etherpad.openstack.org/ep/pad/view/FolsomSwiftStatsd/9cy8Uxtp2U
in case it is vandalized.

Cheers


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




ss7pro at gmail

May 6, 2012, 1:36 PM

Post #30 of 38 (176 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Hi,

I'd like to share my thoughts on metering OpenStack resource usage.
Besides the data available from SystemUsageData and those mentioned
in the blueprints, some cloud providers charge for I/O on disk
drives (to discourage users from dd if=/dev/zero nonsense and to teach
them to implement caching strategies properly). This data, along with
network card usage, can be gathered from libvirt using domblkstat and
domifstat. My idea is to gather the data using an agent (or a modified
nova-compute) and send it to a message queue using the following
JSON-encoded data schema:

{'instance': 'instance-0000003b',
 'host': 'tytan-1',
 'zone': 'r4cz1',
 'counters': {
     'interface': {'vnet0': (80796L, 1212L, 0L, 0L, 53403L, 621L, 0L, 0L)},
     'disk': {'vda': (629L, 11699200L, 58L, 219136L, 0L)}}}

interface is a result of: interfaceStats(), disk is a result of: blockStats()
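
The tuples above are positional, so a consumer has to know what each slot
means. The Python sketch below attaches names to them. The field order is my
understanding of what the libvirt Python bindings return from interfaceStats()
and blockStats() and should be checked against the libvirt documentation; the
name_counters() helper itself is hypothetical.

```python
# Assumed field order of libvirt's interfaceStats() result.
INTERFACE_FIELDS = ("rx_bytes", "rx_packets", "rx_errs", "rx_drop",
                    "tx_bytes", "tx_packets", "tx_errs", "tx_drop")
# Assumed field order of libvirt's blockStats() result.
BLOCK_FIELDS = ("rd_req", "rd_bytes", "wr_req", "wr_bytes", "errs")


def name_counters(message):
    """Turn the positional tuples of a message into field-name dicts."""
    counters = message["counters"]
    named = {"interface": {}, "disk": {}}
    for device, stats in counters.get("interface", {}).items():
        named["interface"][device] = dict(zip(INTERFACE_FIELDS, stats))
    for device, stats in counters.get("disk", {}).items():
        named["disk"][device] = dict(zip(BLOCK_FIELDS, stats))
    return named
```

Each named field ( for example rx_bytes on vnet0 ) would then map naturally to
one counter type per child resource.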


These messages are consumed from the queue and stored in MySQL tables. I
assume that the instance is the parent resource of each disk (ephemeral or
volume) and of each network interface.

So for the message above we have three resources:

1) instance resource: instance-0000003b
2) child resource: vnet0 (network)
3) child resource: vda (ephemeral disk)


The MySQL table for resources looks like this:

CREATE TABLE resources (
id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
parent BIGINT UNSIGNED,
type VARCHAR(255) NOT NULL,
value VARCHAR(255) NOT NULL,
zone VARCHAR(255) NOT NULL,
added TIMESTAMP,
PRIMARY KEY (id)
) ENGINE=INNODB;

Counters are also stored in MySQL, using the following table:

CREATE TABLE counters (
resource BIGINT UNSIGNED NOT NULL,
type VARCHAR(255) NOT NULL,
value BIGINT UNSIGNED,
delta BIGINT UNSIGNED,
added TIMESTAMP NOT NULL,
prev TIMESTAMP NOT NULL
) ENGINE=INNODB;

Where prev is a reference to the previous counter value. The process that
reads data from the queue puts the raw counter value into the table and,
if possible (i.e. a previous entry is present), computes the delta
value.
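
The delta step described above could look roughly like the Python sketch
below. The function name is hypothetical, and the handling of a counter that
goes backwards ( treated here as a reset, for example after an instance
restart ) is an assumption that is not part of the scheme as described.

```python
def compute_delta(previous_value, current_value):
    """Return the usage since the previous sample.

    Returns None for the first sample, when there is nothing to compare to.
    """
    if previous_value is None:
        return None
    if current_value < previous_value:
        # Assumption: the raw counter was reset, so everything counted
        # since the reset is treated as new usage.
        return current_value
    return current_value - previous_value
```

Storing None ( NULL ) for the first sample keeps it out of later sums without
needing a special case in the billing query.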


With this model of storing usage counters, it is very easy for the
billing system to evaluate charges: we just run SUM(delta) on each
counter for a given time range.
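
A small, self-contained illustration of the SUM(delta) query follows. It uses
Python with an in-memory SQLite database instead of MySQL so that it runs
anywhere; the sample rows, the simplified column types and the usage() helper
are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters ("
    " resource INTEGER NOT NULL, type TEXT NOT NULL,"
    " value INTEGER, delta INTEGER, added TEXT NOT NULL)"
)
# Made-up samples: the first row has no previous entry, hence no delta.
conn.executemany(
    "INSERT INTO counters VALUES (?, ?, ?, ?, ?)",
    [
        (1, "rx_bytes", 1000, None, "2012-05-06 10:00:00"),
        (1, "rx_bytes", 1500, 500, "2012-05-06 11:00:00"),
        (1, "rx_bytes", 1800, 300, "2012-05-06 12:00:00"),
        (1, "rx_bytes", 2000, 200, "2012-05-07 09:00:00"),
    ],
)


def usage(conn, resource, counter_type, start, end):
    """Sum the deltas of one counter over the half-open range [start, end)."""
    cursor = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM counters"
        " WHERE resource = ? AND type = ? AND added >= ? AND added < ?",
        (resource, counter_type, start, end),
    )
    return cursor.fetchone()[0]
```

Calling usage(conn, 1, "rx_bytes", "2012-05-06 00:00:00", "2012-05-07 00:00:00")
adds up only the deltas recorded on May 6; the NULL delta of the first sample
is ignored by SUM.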

This model could easily be adapted to other counters (IP traffic
external/internal counters from iptables).





On Mon, Apr 30, 2012 at 12:15 PM, Loic Dachary <loic [at] enovance> wrote:
> Hi,
>
> To prepare for the next meeting ( thursday 3rd, may 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).
>
> We could start a discussion from the content of the following sections:
>
> http://wiki.openstack.org/EfficientMetering#Counters
> http://wiki.openstack.org/EfficientMetering#Storage
>
> and come up with a list of the counters that should exist by default and how they should be stored.
>
> This morning we had a discussion with Zhongyue Luo on irc.freenode.net#openstack-metering about how Dough could use the metering service. Since it already knows about instance creations, counter c1 that records how long a given instance was up is of no interest. However, other counters such as the external bandwidth used would be useful. I advocated that one of the advantages for Dough to rely on metering to collect counters is that it does not need to know about each OpenStack component and can rely on metering to figure out how to extract such counters from nova-compute, nova-network soon to be quantum, nova-volume soon to be cinder, swift, glance and free it from the burden of tracking structural changes.
>
> Cheers
>
> --
> Loïc Dachary         Chief Research Officer
> // eNovance labs   http://labs.enovance.com
> // ✉ loic [at] enovance  ☎ +33 1 49 70 99 82
>
>



--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299



loic at enovance

May 7, 2012, 8:21 AM

Post #31 of 38 (171 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/06/2012 10:36 PM, Tomasz Paszkowski wrote:
> Hi,
>
> I'd like to share my thoughts on metering openstack resources usage.
> Except those data available from SystemUsageData and those mentioned
> in blueprints some of the cloud providers charge for I/O on disk
> drives (to prevent users from dd if=/dev/zero nosense and to teach
> them properly implementing cache strategies).
Hi Tomasz,

I could not agree more, and this is why I/O appears in the list of meters in http://wiki.openstack.org/EfficientMetering (c5): "disk IO in megabyte per second has a high impact on the service availability and could be billed separately".


> Those data along with
> network card usage can be gathered from libvirt using domblkstat and
> domifstat.
Thanks for the hint
http://wiki.openstack.org/EfficientMetering?action=diff&rev2=60&rev1=59
> My idea is to gather data using agent (or modified
> nova-compute) and send them to messaging queue using following jsoned
> data schema:
>
> {'instance': 'instance-0000003b',
> 'host':'tytan-1','zone':'r4cz1','counters': {'interface': {'vnet0':
> (80796L, 1212L, 0L, 0L, 53403L, 621L, 0L, 0L)}, 'disk': {'vda': (629L,
> 11699200L, 58L, 219136L, 0L)}}}
>
> interface is a result of: interfaceStats(), disk is a result of: blockStats()
>
>
> Those messages are consumed from queue and stored in mysql tables. I
> assume that instance is a parent resource for each disk (ephemeral or
> volume) and for each network interface.
>
> So for the message shown above we have three resources:
>
> 1) instance resource: instance-0000003b
> 2) child resource: vnet0 (network)
> 3) child resource: vda (ephemeral)
>
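The parent/child split can be sketched as follows. This is only an illustration: resources_from_message() is a hypothetical helper, the type labels reuse the message keys ('interface', 'disk') rather than the labels used in the list, and the parent is referenced by name here where the table uses a numeric id.

```python
def resources_from_message(msg):
    """Return (type, value, parent, zone) rows: the instance first, then its children."""
    instance = msg["instance"]
    zone = msg["zone"]
    rows = [("instance", instance, None, zone)]
    for kind, devices in sorted(msg["counters"].items()):
        for device in sorted(devices):
            # Every disk and interface hangs off the instance that reported it.
            rows.append((kind, device, instance, zone))
    return rows


sample = {
    "instance": "instance-0000003b",
    "host": "tytan-1",
    "zone": "r4cz1",
    "counters": {
        "interface": {"vnet0": (80796, 1212, 0, 0, 53403, 621, 0, 0)},
        "disk": {"vda": (629, 11699200, 58, 219136, 0)},
    },
}

for row in resources_from_message(sample):
    print(row)
```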
>
> The MySQL table for resources looks like this:
>
> CREATE TABLE resources (
> id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
> parent BIGINT UNSIGNED,
> type VARCHAR(255) NOT NULL,
> value VARCHAR(255) NOT NULL,
> zone VARCHAR(255) NOT NULL,
> added TIMESTAMP,
> PRIMARY KEY (id)
> ) ENGINE=INNODB;
>
> Counters are also stored in MySQL, using the following table:
>
>
> CREATE TABLE counters (
> resource BIGINT UNSIGNED NOT NULL,
> type VARCHAR(255) NOT NULL,
> value BIGINT UNSIGNED,
> delta BIGINT UNSIGNED,
> added TIMESTAMP NOT NULL,
> prev TIMESTAMP NOT NULL
> ) ENGINE=INNODB;
>
> Where prev is a reference to the previous counter value. The process that
> reads data from the queue puts the raw counter value into the table and,
> if possible (a reference to the previous entry is present), evaluates the
> delta value.
>
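A minimal sketch of the delta evaluation described above, kept in memory instead of MySQL. DeltaTracker is a hypothetical name, timestamps are plain integers for brevity, and the handling of a counter that goes backwards (treated as a restart from zero) is an assumption the post does not cover.

```python
class DeltaTracker:
    """Keeps the last raw sample per (resource, counter type) and derives deltas."""

    def __init__(self):
        self._last = {}  # (resource, counter_type) -> (timestamp, raw value)

    def record(self, resource, counter_type, timestamp, value):
        """Return a dict shaped like a row of the counters table."""
        key = (resource, counter_type)
        previous = self._last.get(key)
        self._last[key] = (timestamp, value)
        if previous is None:
            # No earlier entry to reference, so no delta can be evaluated.
            return {"value": value, "delta": None, "added": timestamp, "prev": None}
        prev_ts, prev_value = previous
        if value >= prev_value:
            delta = value - prev_value
        else:
            # Assumption: a smaller value means the counter restarted from zero.
            delta = value
        return {"value": value, "delta": delta, "added": timestamp, "prev": prev_ts}


tracker = DeltaTracker()
first = tracker.record("vda", "rd_bytes", 100, 11699200)
second = tracker.record("vda", "rd_bytes", 160, 11900000)
third = tracker.record("vda", "rd_bytes", 220, 4096)  # counter restarted
print(first["delta"], second["delta"], third["delta"])  # None 200800 4096
```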
>
> With this model of storing usage counters it is very easy for the
> billing system to evaluate charges. We just run SUM(delta) on each
> counter for a given time range.
>
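The SUM(delta) query can be tried out with a small sketch. It uses an in-memory SQLite database as a stand-in for MySQL, integer seconds instead of TIMESTAMP values, and a nullable prev column; the sample rows and the usage() helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server
conn.execute(
    """CREATE TABLE counters (
        resource INTEGER NOT NULL,
        type TEXT NOT NULL,
        value INTEGER,
        delta INTEGER,
        added INTEGER NOT NULL,
        prev INTEGER)"""
)
# (resource, type, raw value, delta, added, prev); made-up samples.
rows = [
    (1, "rd_bytes", 1000, None, 0, None),
    (1, "rd_bytes", 1500, 500, 60, 0),
    (1, "rd_bytes", 1800, 300, 120, 60),
    (1, "rd_bytes", 2600, 800, 180, 120),
]
conn.executemany("INSERT INTO counters VALUES (?, ?, ?, ?, ?, ?)", rows)


def usage(conn, resource, counter_type, start, end):
    """Sum the deltas recorded in the half-open time range (start, end]."""
    row = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM counters "
        "WHERE resource = ? AND type = ? AND added > ? AND added <= ?",
        (resource, counter_type, start, end),
    ).fetchone()
    return row[0]


print(usage(conn, 1, "rd_bytes", 0, 120))  # 800
```

Using a half-open range means consecutive billing periods neither overlap nor count a sample twice.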
> This model could easily be adapted to other counters (IP traffic
> external/internal counters from iptables).
>

It looks like you already have a codebase that could be useful for the metering implementation. Would you be willing to share it?

Cheers


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82




ss7pro at gmail

May 7, 2012, 12:25 PM

Post #32 of 38 (171 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Mon, May 7, 2012 at 5:21 PM, Loic Dachary <loic [at] enovance> wrote:
> Hi Tomasz,
Hi

>
> I could not agree more and this is the reason why I/O shows in the list of meters shown in http://wiki.openstack.org/EfficientMetering (c5) "disk IO in megabyte per second has a high impact on the service availability and could be billed separately ".

Yes, but for disk drives the number of I/O operations (reads/writes)
is the key resource usage metric. It's very hard to set up a billing
model for disk usage based on bandwidth, as low-bandwidth disk
operations (small random reads/writes) can load a disk drive more than
huge sequential reads/writes. I should also mention that AWS charges
for I/O in its volume service.
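
A small arithmetic sketch of why bandwidth and operation count can diverge. The two workloads and their figures are made up for illustration; ops_per_second() is a hypothetical helper.

```python
KIB = 1024
MIB = 1024 * KIB


def ops_per_second(bytes_per_second, bytes_per_op):
    """Number of I/O operations needed each second to move the given bandwidth."""
    return bytes_per_second // bytes_per_op


# Hypothetical workloads: small random I/O versus large sequential I/O.
random_small = ops_per_second(4 * MIB, 4 * KIB)   # 4 MiB/s in 4 KiB operations
sequential = ops_per_second(100 * MIB, 1 * MIB)   # 100 MiB/s in 1 MiB operations

print(random_small)  # 1024
print(sequential)    # 100
```

In this made-up case the workload moving 25 times less data issues about 10 times more operations, so a bandwidth-only counter would undercharge it.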

>
>
> It looks like you already have a codebase that could be useful for the metering implementation. Would you be willing to share it ?

Yes. Just give me a few days.

--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299



ss7pro at gmail

May 9, 2012, 9:42 AM

Post #33 of 38 (162 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

Here is the simplified version of my code (without AMQP support;
counters are stored directly in a MySQL db).

https://github.com/ss7pro/rescnt

The code starts from main.py, which continuously collects counters
from libvirt and stores them in a MySQL database.




--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299



doug.hellmann at dreamhost

May 9, 2012, 11:02 AM

Post #34 of 38 (162 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Wed, May 9, 2012 at 12:42 PM, Tomasz Paszkowski <ss7pro [at] gmail> wrote:

> Here is the simplified version of my code (without AMQP support;
> counters are stored directly in a MySQL db).
>
> https://github.com/ss7pro/rescnt
>
> Code is started from main.py which is constantly collecting counters
> from libvirt and storing them in a mysql database.
>

Nice!

For production code I think we are going to want to separate collection
from storage, aren't we? We don't want each compute node to require access
to the database server (that's an issue with nova that they are trying to
fix during the folsom release, IIRC).
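
One way to picture that separation, as a sketch only: the collector publishes serialised samples and never touches the database, while a separate consumer drains the queue and writes to storage. Here queue.Queue stands in for an AMQP queue and a plain list stands in for the database; collect() and store_pending() are hypothetical names.

```python
import json
import queue


def collect(bus, sample):
    """Runs on the compute node: serialise a sample and publish it. No database access."""
    bus.put(json.dumps(sample))


def store_pending(bus, storage):
    """Runs next to the database: drain the queue and store the decoded samples."""
    stored = 0
    while True:
        try:
            body = bus.get_nowait()
        except queue.Empty:
            return stored
        storage.append(json.loads(body))
        stored += 1


bus = queue.Queue()  # stand-in for an AMQP queue
storage = []         # stand-in for the database tables

collect(bus, {
    "instance": "instance-0000003b",
    "host": "tytan-1",
    "counters": {"disk": {"vda": [629, 11699200, 58, 219136, 0]}},
})
print(store_pending(bus, storage))  # 1
print(storage[0]["instance"])       # instance-0000003b
```

With this split, only the consumer needs database credentials.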




ss7pro at gmail

May 9, 2012, 12:07 PM

Post #35 of 38 (164 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann
<doug.hellmann [at] dreamhost> wrote:
>
> Nice!
>
> For production code I think we are going to want to separate collection from
> storage, aren't we? We don't want each compute node to require access to the
> database server (that's an issue with nova that they are trying to fix
> during the folsom release, IIRC).

Yes. The part of the code responsible for AMQP support is not functional yet :(



--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299



doug.hellmann at dreamhost

May 9, 2012, 2:11 PM

Post #36 of 38 (159 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On Wed, May 9, 2012 at 3:07 PM, Tomasz Paszkowski <ss7pro [at] gmail> wrote:

> On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann
> <doug.hellmann [at] dreamhost> wrote:
> >
> > Nice!
> >
> > For production code I think we are going to want to separate collection
> > from storage, aren't we? We don't want each compute node to require access
> > to the database server (that's an issue with nova that they are trying to
> > fix during the folsom release, IIRC).
>
> Yes. Part of the code responsible for amqp support is not functional yet :(
>

OK, that's what I thought.

We all seem to be reinventing different parts of the services that we will
eventually need, which is good for education but may be wasting a bit of
energy. Is it premature to start talking a little more about architecture
so we can start splitting up the implementation work and focusing that
energy differently? There is a lot of work we can do independently of the
remaining decisions outlined in
http://wiki.openstack.org/Meetings/MeteringAgenda.




ss7pro at gmail

May 9, 2012, 3:00 PM

Post #37 of 38 (157 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

I agree. Do you have any plans for how to coordinate our efforts?





--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299



loic at enovance

May 10, 2012, 1:05 AM

Post #38 of 38 (159 views)
Permalink
Re: [Metering] schema and counter definitions [In reply to]

On 05/09/2012 11:11 PM, Doug Hellmann wrote:
>
>
> On Wed, May 9, 2012 at 3:07 PM, Tomasz Paszkowski <ss7pro [at] gmail> wrote:
>
> On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann
> <doug.hellmann [at] dreamhost> wrote:
> >
> > Nice!
> >
> > For production code I think we are going to want to separate collection from
> > storage, aren't we? We don't want each compute node to require access to the
> > database server (that's an issue with nova that they are trying to fix
> > during the folsom release, IIRC).
>
> Yes. Part of the code responsible for amqp support is not functional yet :(
>
>
> OK, that's what I thought.
>
> We all seem to be reinventing different parts of the services that we will eventually need, which is good for education but may be wasting a bit of energy. Is it premature to start talking a little more about architecture so we can start splitting up the implementation work and focusing that energy differently? There is a lot of work we can do independently of the remaining decisions outlined in http://wiki.openstack.org/Meetings/MeteringAgenda.
Hi,

It looks like the architecture of metering is indeed always implemented in similar ways. I had discussions with a company yesterday about their own metering implementation (which will be used in production soon) and it also has an architecture matching what has been proposed so far in ceilometer. I added a few points to the architecture chapter in the wiki:

http://wiki.openstack.org/EfficientMetering#Architecture

including a note summarizing the conclusions of the discussion regarding the need for an independent ceilometer agent in addition to the existing meters provided by the OpenStack components.

What do you think?


--
Loïc Dachary Chief Research Officer
// eNovance labs http://labs.enovance.com
// ✉ loic [at] enovance ☎ +33 1 49 70 99 82
