Mapping Rules
Mapping rules configure the storage policy for metrics. A storage policy
determines how long metrics are retained and at what resolution.
For example, a storage policy of 1m:48h tells M3 to keep metrics for 48 hours at a
1-minute resolution. Mapping rules are configured in the m3coordinator configuration file
under the downsample > rules > mappingRules stanza. The following configuration serves as an
example.
downsample:
  rules:
    mappingRules:
      - name: "mysql metrics"
        filter: "app:mysql*"
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 1m
            retention: 48h
      - name: "nginx metrics"
        filter: "app:nginx*"
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 30s
            retention: 24h
          - resolution: 1m
            retention: 48h
Here, two mapping rules are configured: one for mysql metrics and one for nginx
metrics. The filter determines which metrics each rule applies to. The mysql metrics rule
applies to any metric whose app tag value matches mysql* (* being a wildcard), that is,
any value that starts with mysql. Similarly, the nginx metrics rule applies to all metrics
whose app tag value matches nginx*.
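To illustrate how a filter such as app:mysql* selects metrics, here is a minimal Python sketch. It is not M3's matcher: the function name matches_filter is hypothetical, and Python's fnmatch stands in for M3's own pattern syntax, so only the * wildcard behaviour should be read as representative.

```python
from fnmatch import fnmatchcase

def matches_filter(filter_expr: str, tags: dict) -> bool:
    """Return True when every tag:pattern clause in the filter matches the tags.

    Assumption: clauses are separated by spaces and each has the form tag:pattern,
    as in the examples in this document.
    """
    for clause in filter_expr.split():
        name, _, pattern = clause.partition(":")
        value = tags.get(name)
        # A clause fails if the tag is absent or its value does not match the pattern.
        if value is None or not fnmatchcase(value, pattern):
            return False
    return True

print(matches_filter("app:mysql*", {"app": "mysql-primary"}))  # True
print(matches_filter("app:mysql*", {"app": "nginx-edge"}))     # False
```

Note that a clause such as k8s_pod:* in this sketch requires the tag to be present, with any value.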
The aggregations field determines which functions to apply to the datapoints within a
resolution tile. For example, if an application emits a metric every 10 seconds and the resolution
for that metric’s storage policy is 1 minute, M3 needs to combine 6 datapoints into one. If the
aggregation is Last, M3 takes the last value in that 1-minute bucket. Each entry in aggregations
can be one of the following:
Last
Min
Max
Mean
Median
Count
Sum
SumSq
Stdev
P10
P20
P30
P40
P50
P60
P70
P80
P90
P95
P99
P999
P9999
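The 10-second example above can be sketched in Python as follows. This is an illustration of the idea only, not M3's aggregator: the downsample function is hypothetical, only a handful of the aggregation types are implemented, and labelling each tile by its start timestamp is an assumption of the sketch.

```python
import statistics

# A small subset of the aggregation types listed above.
AGGREGATIONS = {
    "Last": lambda values: values[-1],
    "Min": min,
    "Max": max,
    "Mean": statistics.mean,
    "Count": len,
    "Sum": sum,
}

def downsample(points, resolution_secs, aggregation):
    """Group (timestamp_secs, value) points into resolution tiles and aggregate each tile."""
    tiles = {}
    for ts, value in sorted(points):
        tile_start = ts - ts % resolution_secs  # assumption: tiles are labelled by start time
        tiles.setdefault(tile_start, []).append(value)
    fn = AGGREGATIONS[aggregation]
    return [(start, fn(values)) for start, values in sorted(tiles.items())]

# Six datapoints emitted every 10 seconds fall into one 1-minute tile.
points = [(0, 3.0), (10, 5.0), (20, 4.0), (30, 9.0), (40, 2.0), (50, 7.0)]
print(downsample(points, 60, "Last"))  # [(0, 7.0)]
print(downsample(points, 60, "Max"))   # [(0, 9.0)]
print(downsample(points, 60, "Sum"))   # [(0, 30.0)]
```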
Lastly, the storagePolicies field determines which namespaces to store the metrics in. For example,
the mysql metrics will be sent to the 1m:48h namespace, while the nginx metrics will be sent to
both the 1m:48h and 30s:24h namespaces.
Note: the namespaces listed under the storagePolicies stanza must exist in M3DB.
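To make the resolution:retention notation concrete, the following Python sketch parses a policy string such as 1m:48h and computes how many datapoints per series that policy retains. The helper names are hypothetical, and the parser only handles single-unit durations like 30s, 1m or 48h; it is not M3's duration parser.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text):
    """Convert a single-unit duration such as '30s' or '48h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]

def parse_storage_policy(policy):
    """Split 'resolution:retention' into (resolution_secs, retention_secs)."""
    resolution, retention = policy.split(":")
    return parse_duration(resolution), parse_duration(retention)

def datapoints_retained(policy):
    """Number of datapoints one series holds over the full retention period."""
    resolution, retention = parse_storage_policy(policy)
    return retention // resolution

print(datapoints_retained("1m:48h"))   # 2880
print(datapoints_retained("30s:24h"))  # 2880
```

In this example both policies hold the same number of datapoints per series: the 30s:24h policy trades half the retention for twice the resolution.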
Rollup Rules
Rollup rules are used to roll up metrics, aggregating them in different ways by arbitrary dimensions before they are stored.
Aggregating counters example
Here’s an example of creating a new monotonic counter called
http_request_rollup_no_pod_bucket from a set of histogram metrics originally
called http_request_bucket:
downsample:
  rules:
    rollupRules:
      - name: "http_request latency by route and git_sha without pod"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        transforms:
          - transform:
              type: "Increase"
          - rollup:
              metricName: "http_request_rollup_no_pod_bucket"
              groupBy: ["le", "git_sha", "route", "status_code", "region"]
              aggregations: ["Sum"]
          - transform:
              type: "Add"
        storagePolicies:
          - resolution: 30s
            retention: 720h
Note: only metrics that contain all of the groupBy tags will be rolled up.
For example, in the above config, only http_request_bucket metrics that
have all of the groupBy labels present (le, git_sha, route, status_code and region)
will be rolled up into the new metric http_request_rollup_no_pod_bucket.
While the above example creates a new rolled-up metric,
often the goal of rollup rules is to eliminate the underlying
raw metrics. To do this, add a mappingRule with drop set to true,
as in the following example (which reuses the metric above).
Additionally, if all of the underlying metrics are
being dropped, there is no need to change the metric name: in the
rollupRule, the metricName field can be equal to the existing metric name.
See below for an example.
downsample:
  rules:
    mappingRules:
      - name: "http_request latency by route and git_sha drop raw"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        drop: true
    rollupRules:
      - name: "http_request latency by route and git_sha without pod"
        filter: "__name__:http_request_bucket k8s_pod:* le:* git_sha:* route:*"
        transforms:
          - transform:
              type: "Increase"
          - rollup:
              metricName: "http_request_bucket" # metric name doesn't change
              groupBy: ["le", "git_sha", "route", "status_code", "region"]
              aggregations: ["Sum"]
          - transform:
              type: "Add"
        storagePolicies:
          - resolution: 30s
            retention: 720h
Storage policies and rollup rules
Note: To store rolled-up metrics in an unaggregated namespace, the namespace’s aggregationOptions must have a matching aggregation. For example, if the 720h namespace under storagePolicies in the above rule
is unaggregated, the aggregationOptions for that namespace should resemble the following:
"aggregationOptions": {
  "aggregations": [
    {
      "aggregated": false
    },
    {
      "aggregated": true,
      "attributes": {
        "resolutionDuration": "30s",
        "downsampleOptions": { "all": false }
      }
    }
  ]
}
Aggregating gauges example
The following is an example of a sensible set of aggregations for a metric that represents a job queue length. The aggregations provide the sum, average, max and min of the job queue length across all instances, each under a different aggregate metric name.
downsample:
  rules:
    rollupRules:
      - name: "job queue length sum across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:sum"
              excludeBy: ["k8s_pod"]
              aggregations: ["Sum"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length average across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:avg"
              excludeBy: ["k8s_pod"]
              aggregations: ["Mean"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length max across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:max"
              excludeBy: ["k8s_pod"]
              aggregations: ["Max"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
      - name: "job queue length min across pods pod"
        filter: "__name__:job_queue_length k8s_pod:*"
        transforms:
          - aggregate:
              type: "Last"
          - rollup:
              metricName: "job_queue_length:min"
              excludeBy: ["k8s_pod"]
              aggregations: ["Min"]
        storagePolicies:
          - resolution: 30s
            retention: 720h
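The effect of these four rules on a single resolution tile can be sketched in Python as follows. This is an illustrative model only: rollup_gauges is a hypothetical helper, the input is assumed to already hold the Last value of each series for the tile, and the queue tag is invented for the example.

```python
from statistics import mean

AGGREGATIONS = {"Sum": sum, "Mean": mean, "Max": max, "Min": min}

def rollup_gauges(latest_by_series, exclude_by, aggregation):
    """Aggregate each series' last value after dropping the exclude_by tags."""
    groups = {}
    for tags, last_value in latest_by_series:
        # Series that differ only in excluded tags end up under the same key.
        key = tuple(sorted((k, v) for k, v in tags.items() if k not in exclude_by))
        groups.setdefault(key, []).append(last_value)
    fn = AGGREGATIONS[aggregation]
    return {key: fn(values) for key, values in groups.items()}

# Last observed queue length per pod within one tile ("queue" is a made-up tag).
queue_lengths = [
    ({"k8s_pod": "worker-0", "queue": "email"}, 4),
    ({"k8s_pod": "worker-1", "queue": "email"}, 10),
    ({"k8s_pod": "worker-2", "queue": "email"}, 1),
]
for name in ("Sum", "Mean", "Max", "Min"):
    print(name, rollup_gauges(queue_lengths, {"k8s_pod"}, name))
```

Unlike groupBy, which lists the tags to keep, excludeBy lists the tags to drop, so any other tag on the series (here, queue) is preserved on the rolled-up metric.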