Metric reduction ignores context changes #297

Open
arov00 opened this issue Jan 12, 2025 · 0 comments
Labels: bug (Something isn't working)

There seems to be an edge case in which the metric reduction yields incorrect results: a context change inside a tumbling window is ignored.

Consider the initial part of the stream topology:

telegrafDataStream
                .peek(
                        (key, value) -> metricRepository.addMetricName(value.name(), value.timestamp()),
                        Named.as("populate-metric-names-list"))
                .selectKey(
                        (key, value) -> value.name(), Named.as("key-by-metricName")) // rekey by raw metric name
                .mapValues(this::addContextToRawData, Named.as("add-context")) // map to raw telegraf data
                .filter(
                        (key, value) -> (value != null && value.getEntityType() != null && value.getName() != null),
                        Named.as("filter-null"))
                .selectKey(
                        (key, value) -> value.getEntityType() + "_" + value.getName() + "_" + value.getInitialMetricName(),
                        Named.as("unique-key-for-windowing"))
                .groupByKey(Grouped.as("group-by-key"))
                .windowedBy(tumblingWindow)
                .reduce(metricReducer, Named.as("sum-aggregated-value-by-window"))

Notice that in the second selectKey, we key by entity type, name, and initial (raw) metric name. Subsequently, we group by that key. Within each tumbling window, we then apply the metricReducer, which updates the value of the aggregate metric either by summing all metrics in the group or by taking the maximum measurement in the group.

What happens if the context for the metric changes? For example, let's say we use the context to track topic ownership. If we reassign topic ownership to a different team halfway through the tumbling window, all metrics generated in that window would still be reduced into a single aggregate metric that carries the old context, because the context is not part of the grouping key. Thus, the old team would still be billed for costs incurred after the ownership transfer, until the window closes. The situation would resolve itself when the next window opens.
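The effect can be reproduced without Kafka. The sketch below is a plain-Java stand-in for a single tumbling window: it groups records by the same composite key as the topology and reduces them per key. The `Metric` class, its fields, the `WindowSimulation` helper, and the rule that the reducer keeps the context of the existing aggregate are all assumptions for illustration, not the project's actual code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the aggregated metric; field names are assumptions.
final class Metric {
    final String entityType;
    final String name;
    final String initialMetricName;
    final String owner; // the "context": here, the team that owns the topic
    final double value;

    Metric(String entityType, String name, String initialMetricName, String owner, double value) {
        this.entityType = entityType;
        this.name = name;
        this.initialMetricName = initialMetricName;
        this.owner = owner;
        this.value = value;
    }

    // Same shape as the key built in the second selectKey of the topology.
    String windowKey() {
        return entityType + "_" + name + "_" + initialMetricName;
    }
}

final class WindowSimulation {
    // Assumed reducer behaviour: sum the values, keep the context of the existing aggregate.
    static Metric reduce(Metric aggregate, Metric next) {
        return new Metric(aggregate.entityType, aggregate.name, aggregate.initialMetricName,
                aggregate.owner, aggregate.value + next.value);
    }

    // Reduces all records of one window per key, mimicking groupByKey().reduce().
    static Map<String, Metric> reduceWindow(List<Metric> records) {
        Map<String, Metric> result = new LinkedHashMap<>();
        for (Metric m : records) {
            // First record for a key becomes the aggregate; later ones are reduced into it.
            result.merge(m.windowKey(), m, WindowSimulation::reduce);
        }
        return result;
    }
}
```

Feeding `reduceWindow` two records for the same topic, one with value 100 owned by team-a followed by one with value 50 owned by team-b, produces a single aggregate whose owner is still team-a, which is the misattribution described above.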

A possible solution would be to add the context (or its hash) to the key built in the second selectKey call. Metrics recorded after the context change would then be reduced into a separate aggregate value.
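A minimal sketch of such a key, written in plain Java so that it runs without Kafka. `ContextKeys`, `windowKey`, and the representation of the context as a `Map<String, String>` are assumptions for illustration; the project's actual context type may differ.

```java
import java.util.Map;

final class ContextKeys {
    // Builds the grouping key from the three existing parts plus a hash of the context.
    // Map.hashCode() is defined as the sum of the entry hash codes, so it does not
    // depend on iteration order, and String.hashCode() is stable across JVM runs.
    static String windowKey(String entityType, String name, String initialMetricName,
                            Map<String, String> context) {
        String contextHash = Integer.toHexString(context.hashCode());
        return entityType + "_" + name + "_" + initialMetricName + "_" + contextHash;
    }
}
```

With this key, records carrying `{owner=team-a}` and `{owner=team-b}` fall into different groups within the same window, while records with an identical context still share one group. A 32-bit hash can collide, and a collision would merge two contexts again, so a longer digest or the serialized context itself may be the safer choice where billing accuracy matters.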

arov00 added the bug label Jan 12, 2025