Metric Types
There are two metric types in Abacus:
- discrete, also known as stateless, "historical" and "log-like"
- time-based, also known as stateful
Below you can find the characteristics of the two metric types in Abacus:
Discrete metrics are stateless. Usage records are submitted to Abacus in a log-like manner. When you request an aggregated result, Abacus goes through the history of events and performs the calculations defined by the formulas. Discrete metrics are usually quite simple and deal with plain numbers in both measures and metrics.
{
name: 'storage',
unit: 'GIGABYTE',
type: 'discrete',
meter: ((m) => new BigNumber(m.storage).div(1073741824).toNumber()),
aggregate: ((a, prev, curr, aggTwCell, accTwCell) => new BigNumber(a || 0).add(curr).toNumber())
}
In this example, the meter function converts the storage value from bytes to gigabytes by dividing it by 1073741824. Then, when the end time of the current measure is between the “from” and “to” timestamps, the accumulate function returns the maximum amount of memory that has been used so far. In this case, “to” is the time until which the resource usage must be accumulated. Currently Abacus predefines the “from” and “to” times to refer to the start and end of a month. Although the dimension of the measures is predefined to contain “name” and “unit” only, they are sufficient to meter any kind of resource usage. In addition, the accumulate and aggregate functions can be used to perform different kinds of calculations.
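As a rough illustration of the log-like behavior, the sketch below replays a few made-up usage records through simplified versions of the meter and aggregate functions from the plan above. It uses plain numbers instead of BigNumber so that it has no dependencies, and the driver loop is an assumption for illustration only; it is not how Abacus itself invokes these functions.

```javascript
// Simplified stand-ins for the plan functions above (plain numbers, no BigNumber).
const BYTES_PER_GIGABYTE = 1073741824;
const meter = (m) => m.storage / BYTES_PER_GIGABYTE;
const aggregate = (a, prev, curr) => (a || 0) + curr;

// Three made-up usage records, submitted one after another.
const records = [
  { storage: 1073741824 },
  { storage: 2147483648 },
  { storage: 536870912 }
];

// Hypothetical driver: meter each record, then fold it into the running total.
let total;
let previous;
for (const record of records) {
  const current = meter(record);
  total = aggregate(total, previous, current);
  previous = current;
}

console.log(total); // 3.5 (gigabytes)
```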
Time-based metrics are stateful. Abacus stores the state of the resource instance and uses it to calculate the result on request.
The linux-container plan, for example, meters memory in gigabytes per hour (GB/h). The usage is ongoing and grows over time.
It is important to note that time-based metrics often use compound data structures to keep track of the usage. For example, you submit the previous and the current measures of the container resource to calculate the GB/h usage.
The example basic linux container metering plan is a time-based plan. It measures memory consumption over time.
To start application A with an instance of 1 GB, a Resource Provider submits these measures:
current_running_instances: 1,
current_instance_memory: 1073741824,
previous_running_instances: 0,
previous_instance_memory: 0
To update application A with 1 instance of 1 GB to 2 instances of 2 GB, a Resource Provider submits measures:
current_running_instances: 2,
current_instance_memory: 2147483648,
previous_running_instances: 1,
previous_instance_memory: 1073741824
To stop application A, a Resource Provider submits the following measures:
current_running_instances: 0,
current_instance_memory: 0,
previous_running_instances: 2,
previous_instance_memory: 2147483648
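Because every submission carries both the previous and the current state, it can help to build the measures from a pair of states. The helper below is a hypothetical convenience, not part of Abacus; only the four measure names are taken from the examples above.

```javascript
const GB = 1073741824;

// Hypothetical helper: builds the four measures for a transition between two states.
const transition = (previous, current) => ({
  current_running_instances: current.instances,
  current_instance_memory: current.memoryBytes,
  previous_running_instances: previous.instances,
  previous_instance_memory: previous.memoryBytes
});

// The three states application A goes through in the examples above.
const stopped = { instances: 0, memoryBytes: 0 };
const oneSmall = { instances: 1, memoryBytes: 1 * GB };
const twoLarge = { instances: 2, memoryBytes: 2 * GB };

const startMeasures = transition(stopped, oneSmall);
const updateMeasures = transition(oneSmall, twoLarge);
const stopMeasures = transition(twoLarge, stopped);

console.log(updateMeasures);
```

Chaining the states this way makes it harder to submit a current state without its matching previous one.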
The algorithm works like this:
- When the application consumed memory in the past before it was stopped (or will consume memory in the future after it is started), it adds negative consumption
- When the application did not consume memory in the past before it was started (or will not consume memory in the future after it is stopped), it adds positive consumption
The plan works with out-of-order data submission and guarantees correctness, given there is no missing usage submission. This basically means that the previous usage has to be submitted together with the current one.
Furthermore, it works only within the time window, meaning that the calculated numbers would be wrong if:
- The usage is for a period outside of from (start of the month) and to (end of the month)
- The earliest usage event submitted for that time period ('from' -> 'to') is not a start (with previous values set to 0)
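The two conditions above can be checked on the submitting side before trusting a result. The function below is a minimal sketch under assumptions: the event shape and the `time` field name are hypothetical, and treating `to` as inclusive is a guess; only the measure names come from the examples above.

```javascript
// Hypothetical event shape: { time, previous_running_instances, previous_instance_memory }.
const findWindowProblems = (events, from, to) => {
  const problems = [];
  if (events.length === 0) return problems;

  // Condition 1: every event must fall inside the from/to window.
  if (events.some((e) => e.time < from || e.time > to))
    problems.push('usage outside the time window');

  // Condition 2: the earliest event must be a start (previous values set to 0).
  const earliest = events.reduce((a, b) => (b.time < a.time ? b : a));
  if (earliest.previous_running_instances !== 0 || earliest.previous_instance_memory !== 0)
    problems.push('earliest event is not a start');

  return problems;
};

// Day-based times for readability: a start on day 20 and a stop on day 25.
const startEvent = { time: 20, previous_running_instances: 0, previous_instance_memory: 0 };
const stopEvent = { time: 25, previous_running_instances: 1, previous_instance_memory: 1073741824 };

console.log(findWindowProblems([startEvent, stopEvent], 0, 30)); // []
console.log(findWindowProblems([stopEvent], 0, 30)); // [ 'earliest event is not a start' ]
```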
Internally, the metrics use a compound data structure consisting of:
- consuming: the latest GB (event time)
- consumed: the "memory balance" that the app has consumed. The number is relative to the time boundary as described above.
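To see why such a structure tolerates out-of-order submission, consider the sketch below. It is not the Abacus implementation: the function name and the event shape are assumptions, and it follows the sign convention of the timestamp formula in Example 3 of this document, consuming * ((from - t) + (to - t)), so intermediate signs differ from the day-based Examples 1 and 2. It also keeps a since field for the event time of the latest consuming value.

```javascript
// Sketch only. event = { time, previous, current }, with previous/current in GB.
const accumulate = (state, event, from, to) => {
  // Distance of the event from both window boundaries.
  const td = (from - event.time) + (to - event.time);
  const isLatest = !state || event.time >= state.since;
  return {
    // consuming follows the event with the latest event time
    consuming: isLatest ? event.current : state.consuming,
    // consumed is a running sum, so the order of the additions does not matter
    consumed: (state ? state.consumed : 0) + (event.current - event.previous) * td,
    since: isLatest ? event.time : state.since
  };
};

// Day-based window for readability: from day 0 to day 30.
const from = 0;
const to = 30;
const start = { time: 20, previous: 0, current: 1 };
const stop = { time: 25, previous: 1, current: 0 };

const inOrder = [start, stop].reduce((s, e) => accumulate(s, e, from, to), null);
const outOfOrder = [stop, start].reduce((s, e) => accumulate(s, e, from, to), null);

console.log(inOrder);    // { consuming: 0, consumed: 10, since: 25 }
console.log(outOfOrder); // { consuming: 0, consumed: 10, since: 25 }
```

Both orders end with the same balance because each event contributes a fixed term to the sum. Dropping either event would leave the balance wrong, which is why no submission may be missing.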
Example 1:
Let's go through the formula with a simple example:
- If the time period is from the 1st to the 30th of a given month, we have start=1 and end=30
- An application starts consuming 1 GB on the 20th of the given month.
- consumed will be the amount that the app is not consuming (from the start of the month till the 20th).
- From the 20th till the end of the month the app will consume = 20 - 10 * direction(+1) = 10
If a Report Consumer requests a report on the 30th, then consumed will be the amount that the app has been consuming (start of the month till the 30th) + the amount that the app would be idle * direction(-1) / 2 = (10 - 30 + 0) * -1 / 2 = 10.
Example 2:
If there is a stop event on the 25th: consuming = 0, then consumed will be: the previous consumed - the amount that the app has been consuming (start -> 25th) + the amount that the app would be idle (25th -> end) = (10 - 25 + 5) * direction(-1) = -10 * -1 = 10
If a Report Consumer requests a report on the 30th, since consuming is 0, we will calculate consumed as (10) / 2 = 5.
Example 3:
Let's use a real example of a submission:
- An hour window from: 1467280800000 (Thu Jun 30 2016 03:00:00 GMT-0700 (PDT)) and to: 1467284400000 (Thu Jun 30 2016 04:00:00 GMT-0700 (PDT))
event time: 1467283200000 (Thu Jun 30 2016 03:40:00 GMT-0700 (PDT))
consuming = 1 GB
- A Report Consumer requests a report at the end of the time window (to)
- The application has been consuming 1 GB for 20 minutes: 1 GB * 20 minutes / 1 hour = 0.33333 GB/h
The result of this submission in the pipeline would be:
consuming = 1
consumed = 1 * ((1467280800000 - 1467283200000) + (1467284400000 - 1467283200000)) = -1200000
since: 1467283200000 (used to keep track of the most up-to-date consuming)
The consumed value is negative because it is relative to the from and to window. If the event time is past the midpoint of the window, the result is a negative number. This is fine, because on report generation the summarize function makes sense of the number.
If a Report Consumer requests a summary at the end of the window to: 1467284400000 (Thu Jun 30 2016 04:00:00 GMT-0700 (PDT)), we will get:
consumed = current consuming * -1 * ((1467280800000 - 1467284400000) + (1467284400000 - 1467284400000)) = 3600000
summary = (current consumed + consumed) / 2 / 3600000 = (-1200000 + 3600000) / 2 / 3600000 = 0.33 GB/h
That's exactly the amount the instance has consumed in the window: 20 / 60 = 0.33333 GB/h
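The summary step of Example 3 can be reproduced with a few lines of code. This is a sketch derived from the formulas above, not the plan's actual summarize function. The function signature is an assumption, and so is clamping the report time to the end of the window.

```javascript
const HOUR_MS = 3600000;

// Accumulated state taken from Example 3.
const state = { consuming: 1, consumed: -1200000, since: 1467283200000 };

// Sketch of the report-time calculation; the signature is hypothetical.
const summarize = (s, reportTime, from, to) => {
  // Assumption: a report never looks past the end of the window.
  const t = Math.min(reportTime, to);
  const td = (from - t) + (to - t);
  // Contribution of the usage that is still ongoing at report time.
  const ongoing = s.consuming * -1 * td;
  return (s.consumed + ongoing) / 2 / HOUR_MS;
};

const from = 1467280800000; // 03:00 PDT
const to = 1467284400000;   // 04:00 PDT
const gbHours = summarize(state, to, from, to);

console.log(gbHours.toFixed(5)); // 0.33333
```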
The time-based usage metrics are carried over into each new monthly database partition by the cf-renewer application. It transfers the active resource consumption from the previous month into the current one.
Warning:
The cf-renewer application supports only plans with "pure" time-based metrics. This means that any usage documents with a metering plan that has both discrete and time-based metrics will be ignored!
*Abacus icon made by Freepik from www.flaticon.com