Mapping Rules
Mapping rules configure the storage policy for metrics. The storage policy determines how long metrics are stored and at what resolution. For example, a storage policy of 1m:48h tells M3 to keep the metrics for 48 hours at a 1-minute resolution. Mapping rules are configured in the m3coordinator configuration file under the downsample > rules > mappingRules stanza. We will use the following as an example.
downsample:
  rules:
    mappingRules:
      - name: "mysql metrics"
        filter: "app:mysql*"
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 1m
            retention: 48h
      - name: "nginx metrics"
        filter: "app:nginx*"
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 30s
            retention: 24h
          - resolution: 1m
            retention: 48h
Here, we have two mapping rules configured: one for mysql metrics and one for nginx metrics. The filter determines which metrics each rule applies to. The mysql metrics rule applies to any metric whose app tag value matches mysql* (* being a wildcard). Similarly, the nginx metrics rule applies to all metrics whose app tag value matches nginx*.
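As an illustration of the filter behaviour described above, here is a minimal Python sketch. It assumes a filter is a space-separated list of tag:glob clauses that must all match, and it uses Python's fnmatch as a stand-in for M3's own glob matcher, whose syntax may differ in details. The helper names parse_filter and matches and the sample metrics are hypothetical and not part of M3.

```python
from fnmatch import fnmatchcase


def parse_filter(filter_str):
    """Split 'tag:glob tag:glob' into a dict of tag -> glob pattern."""
    patterns = {}
    for clause in filter_str.split():
        tag, _, pattern = clause.partition(":")
        patterns[tag] = pattern
    return patterns


def matches(filter_str, tags):
    """True when every tag named in the filter is present and matches its glob."""
    return all(
        tag in tags and fnmatchcase(tags[tag], pattern)
        for tag, pattern in parse_filter(filter_str).items()
    )


# Hypothetical metrics, represented as tag dictionaries.
mysql_metric = {"__name__": "queries_total", "app": "mysql-primary"}
nginx_metric = {"__name__": "requests_total", "app": "nginx"}

print(matches("app:mysql*", mysql_metric))  # True
print(matches("app:mysql*", nginx_metric))  # False
print(matches("app:nginx*", nginx_metric))  # True
```

In this sketch a metric with no app tag at all never matches, because the tag must be present before its value is compared.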
The aggregations field determines which functions to apply to the datapoints within a resolution tile. For example, if an application emits a metric every 10 seconds and the resolution for that metric’s storage policy is 1 minute, M3 will need to combine 6 datapoints. If the aggregations policy is Last, M3 will take the last value in that 1-minute bucket. aggregations can be one of the following:
Last
Min
Max
Mean
Median
Count
Sum
SumSq
Stdev
P10
P20
P30
P40
P50
P60
P70
P80
P90
P95
P99
P999
P9999
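To make the resolution tile idea concrete, the following Python sketch groups datapoints emitted every 10 seconds into 1-minute tiles and applies a few of the aggregations listed above. It is an illustration only, not M3's implementation: the function names are hypothetical, only a subset of the aggregation types is covered, and keying each tile by its start time is an assumption made for readability.

```python
import statistics

# A subset of the aggregation types, mapped to plain Python functions.
AGGREGATIONS = {
    "Last": lambda values: values[-1],
    "Min": min,
    "Max": max,
    "Mean": statistics.mean,
    "Count": len,
    "Sum": sum,
}


def tile_start(timestamp, resolution):
    """Start (in seconds) of the resolution tile that contains timestamp."""
    return timestamp - (timestamp % resolution)


def downsample(datapoints, resolution, aggregation):
    """Aggregate time-sorted (unix_seconds, value) pairs per resolution tile."""
    tiles = {}
    for ts, value in datapoints:
        tiles.setdefault(tile_start(ts, resolution), []).append(value)
    fn = AGGREGATIONS[aggregation]
    return {start: fn(values) for start, values in sorted(tiles.items())}


# A metric emitted every 10 seconds for two minutes: values 0.0 .. 11.0.
datapoints = [(t, float(i)) for i, t in enumerate(range(0, 120, 10))]

print(downsample(datapoints, 60, "Last"))  # {0: 5.0, 60: 11.0}
print(downsample(datapoints, 60, "Sum"))   # {0: 15.0, 60: 51.0}
print(downsample(datapoints, 60, "Mean"))  # {0: 2.5, 60: 8.5}
```

Each 1-minute tile holds 6 datapoints, so Last keeps one value out of every six while Sum and Mean combine all six.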
Lastly, the storagePolicies field determines which namespaces to store the metrics in. For example, the mysql metrics will be sent to the 1m:48h namespace, while the nginx metrics will be sent to both the 1m:48h and 30s:24h namespaces.
Note: the namespaces listed under the storagePolicies stanza must exist in M3DB.
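The resolution:retention notation used for storage policies can be unpacked with a little arithmetic. The Python sketch below parses a policy string and estimates how many datapoints per series a namespace holds over its full retention window. The helper names are hypothetical, and the parser only handles a single number followed by one of the units s, m, h or d.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration such as '30s' or '48h' to seconds (single unit only)."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def parse_storage_policy(policy):
    """Split 'resolution:retention' into a (resolution, retention) pair in seconds."""
    resolution, retention = policy.split(":")
    return parse_duration(resolution), parse_duration(retention)


def datapoints_retained(policy):
    """Datapoints per series kept over the full retention window."""
    resolution, retention = parse_storage_policy(policy)
    return retention // resolution


for policy in ("1m:48h", "30s:24h"):
    print(policy, parse_storage_policy(policy), datapoints_retained(policy))
# 1m:48h (60, 172800) 2880
# 30s:24h (30, 86400) 2880
```

Both example policies retain the same number of datapoints per series: halving the resolution interval while halving the retention leaves the count unchanged.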
Rollup Rules
Rollup rules are used to roll up metrics and aggregate them in different ways by arbitrary dimensions before they are stored.
Aggregating counters example
Here’s an example of creating a new monotonic counter called http_request_rollup_no_pod_bucket from a set of histogram metrics originally called http_request_bucket:
downsample:
  rules:
    rollupRules:
      - name: "http_request latency by route and git_sha without pod"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        transforms:
          - transform:
              type: "Increase"
          - rollup:
              metricName: "http_request_rollup_no_pod_bucket"
              groupBy: ["le", "git_sha", "route", "status_code", "region"]
              aggregations: ["Sum"]
          - transform:
              type: "Add"
        storagePolicies:
          - resolution: 30s
            retention: 720h
Note: only metrics that contain all of the groupBy tags will be rolled up. For example, in the above config, only http_request_bucket metrics that have all of the groupBy labels present will be rolled up into the new metric http_request_rollup_no_pod_bucket.
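The transform chain in the rule above (Increase, then a Sum rollup, then Add) can be sketched in Python as follows. This is a simplified model under stated assumptions, not M3's implementation: each series is a list of cumulative counter values aligned on the same time steps, the first step contributes a delta of 0, and a decrease is treated as a counter reset. The function names and sample series are hypothetical.

```python
from collections import defaultdict
from itertools import accumulate

GROUP_BY = ["le", "git_sha", "route", "status_code", "region"]


def increase(values):
    """Per-step deltas of a cumulative counter (assumed reset handling)."""
    deltas = []
    prev = None
    for value in values:
        if prev is None:
            deltas.append(0)              # no previous value to diff against
        elif value < prev:
            deltas.append(value)          # assumption: a drop means a reset
        else:
            deltas.append(value - prev)
        prev = value
    return deltas


def rollup(series, group_by):
    """Increase -> Sum across series sharing group_by tags -> Add (running total)."""
    groups = defaultdict(list)
    for tags, values in series:
        if not all(tag in tags for tag in group_by):
            continue  # skipped: at least one groupBy tag is missing
        key = tuple(tags[tag] for tag in group_by)
        groups[key].append(increase(values))
    return {
        key: list(accumulate(sum(step) for step in zip(*deltas)))
        for key, deltas in groups.items()
    }


base = {"le": "0.5", "git_sha": "abc123", "route": "/home",
        "status_code": "200", "region": "us-east"}
series = [
    ({**base, "k8s_pod": "pod-a"}, [10, 12, 15]),
    ({**base, "k8s_pod": "pod-b"}, [3, 4, 9]),
    ({"le": "0.5", "k8s_pod": "pod-c"}, [1, 2, 3]),  # lacks most groupBy tags
]

print(rollup(series, GROUP_BY))
# {('0.5', 'abc123', '/home', '200', 'us-east'): [0, 3, 11]}
```

The pod-c series is ignored because it lacks several groupBy tags, and the k8s_pod tag disappears from the result because it is not part of the grouping key.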
While the above example can be used to create a new rolled-up metric, often the goal of rollup rules is to eliminate the underlying raw metrics. To do this, add a mappingRule with drop set to true, as in the following example (which reuses the metric above). Additionally, if all of the underlying metrics are being dropped, there is no need to change the metric name (e.g. in the rollupRule, the metricName field can be equal to the existing metric); see below for an example.
downsample:
  rules:
    mappingRules:
      - name: "http_request latency by route and git_sha drop raw"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        drop: true
    rollupRules:
      - name: "http_request latency by route and git_sha without pod"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        transforms:
          - transform:
              type: "Increase"
          - rollup:
              metricName: "http_request_bucket" # metric name doesn't change
              groupBy: ["le", "git_sha", "route", "status_code", "region"]
              aggregations: ["Sum"]
          - transform:
              type: "Add"
        storagePolicies:
          - resolution: 30s
            retention: 720h
Storage policies and rollup rules
Note: In order to store rolled up metrics in an unaggregated namespace, the namespace’s aggregationOptions must have a matching aggregation. For example, if in the above rule the 720h namespace under storagePolicies is unaggregated, the aggregationOptions for that namespace should resemble the following:
"aggregationOptions": {
  "aggregations": [
    {
      "aggregated": false
    },
    {
      "aggregated": true,
      "attributes": {
        "resolutionDuration": "30s",
        "downsampleOptions": { "all": false }
      }
    }
  ]
}
Aggregating gauges example
The following is an example of a sensible set of aggregations across an example metric which represents a job queue length. The aggregations provide the sum, average, max and min across all instances for the job queue length with different aggregate metric names.
downsample:
  rules:
    rollupRules:
      - name: "job queue length sum across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:sum"
              excludeBy: ["k8s_pod"]
              aggregations: ["Sum"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length average across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:avg"
              excludeBy: ["k8s_pod"]
              aggregations: ["Mean"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length max across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:max"
              excludeBy: ["k8s_pod"]
              aggregations: ["Max"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length min across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:min"
              excludeBy: ["k8s_pod"]
              aggregations: ["Min"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
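The effect of the four rules above can be approximated with a short Python sketch. It assumes the aggregate step has already reduced each pod's series to its last value within the resolution window, and it models excludeBy as grouping on every tag except the excluded ones. The function name, the queue tag and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rollup_excluding(samples, exclude_by, aggregate):
    """Group samples on all tags except exclude_by, then aggregate each group."""
    groups = defaultdict(list)
    for tags, value in samples:
        key = tuple(sorted((k, v) for k, v in tags.items() if k not in exclude_by))
        groups[key].append(value)
    return {key: aggregate(values) for key, values in groups.items()}


# Last observed queue length per pod within one resolution window.
samples = [
    ({"queue": "emails", "k8s_pod": "worker-0"}, 4),
    ({"queue": "emails", "k8s_pod": "worker-1"}, 10),
    ({"queue": "emails", "k8s_pod": "worker-2"}, 1),
]

for suffix, fn in [("sum", sum), ("avg", mean), ("max", max), ("min", min)]:
    print(f"job_queue_length:{suffix}", rollup_excluding(samples, {"k8s_pod"}, fn))
```

With these sample values the three pods collapse into a single group keyed by the queue tag, giving a sum of 15, an average of 5, a maximum of 10 and a minimum of 1.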