Percentiles Bucket Aggregation | Elasticsearch Guide [7.7]

Original URL: https://www.elastic.co/guide/en/elasticsearch/reference/7.7/search-aggregations-pipeline-percentiles-bucket-aggregation.html. The original documentation is copyright www.elastic.co.

IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.



Percentiles Bucket Aggregation

A sibling pipeline aggregation which calculates percentiles across all buckets of a specified metric in a sibling aggregation. The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation.

Syntax

A percentiles_bucket aggregation looks like this in isolation:

{
    "percentiles_bucket": {
        "buckets_path": "the_sum"
    }
}

Table 17. percentiles_bucket Parameters

Parameter Name	Description	Required	Default Value
`buckets_path`	The path to the buckets we wish to find the percentiles for (see `buckets_path` Syntax for more details)	Required
`gap_policy`	The policy to apply when gaps are found in the data (see Dealing with gaps in the data for more details)	Optional	`skip`
`format`	Format to apply to the output value of this aggregation	Optional	`null`
`percents`	The list of percentiles to calculate	Optional	`[ 1, 5, 25, 50, 75, 95, 99 ]`
`keyed`	Flag which returns the range as a hash instead of an array of key-value pairs	Optional	`true`
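
As an illustration of how the parameters in the table combine, the following Python sketch builds the aggregation body as a plain dictionary. The helper name `percentiles_bucket_agg` and its argument names are hypothetical and are not part of any Elasticsearch client. Only `buckets_path` is always emitted; optional parameters are left out unless set, so the defaults listed in the table would apply.

```python
from typing import Any, Dict, List, Optional


def percentiles_bucket_agg(
    buckets_path: str,
    percents: Optional[List[float]] = None,
    gap_policy: Optional[str] = None,
    fmt: Optional[str] = None,
    keyed: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a percentiles_bucket aggregation body (hypothetical helper)."""
    # buckets_path is the only required parameter.
    body: Dict[str, Any] = {"buckets_path": buckets_path}
    # Optional parameters are added only when explicitly set, so the
    # defaults from the parameter table apply otherwise.
    if percents is not None:
        body["percents"] = percents
    if gap_policy is not None:
        body["gap_policy"] = gap_policy
    if fmt is not None:
        body["format"] = fmt
    if keyed is not None:
        body["keyed"] = keyed
    return {"percentiles_bucket": body}


# Minimal form, equivalent to the isolated snippet in the Syntax section.
minimal = percentiles_bucket_agg("the_sum")

# Custom percentiles, returned as an array instead of a hash.
custom = percentiles_bucket_agg(
    "sales_per_month>sales", percents=[25.0, 50.0, 75.0], keyed=False
)
```

The resulting dictionaries can be serialized with `json.dumps` and placed under an aggregation name in the `aggs` section of a search request.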

The following snippet calculates the percentiles for the total monthly sales buckets:

POST /sales/_search
{
    "size": 0,
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "calendar_interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                }
            }
        },
        "percentiles_monthly_sales": {
            "percentiles_bucket": {
                "buckets_path": "sales_per_month>sales", 
                "percents": [ 25.0, 50.0, 75.0 ] 
            }
        }
    }
}

	`buckets_path` instructs this percentiles_bucket aggregation that we want to calculate percentiles for the `sales` aggregation in the `sales_per_month` date histogram.
	`percents` specifies which percentiles we wish to calculate, in this case, the 25th, 50th and 75th percentiles.

And the following may be the response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "percentiles_monthly_sales": {
         "values": {
            "25.0": 375.0,
            "50.0": 375.0,
            "75.0": 550.0
         }
      }
   }
}
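
A client typically needs to turn the `values` object into numbers it can work with. The Python sketch below shows one way to do that for the response above. The helper `read_percentiles` is hypothetical. The array shape it handles for `keyed: false` (a list of objects with `key` and `value` fields) and the `_as_string` entries it skips when `format` is set are assumptions about the response layout, so verify them against a real response.

```python
from typing import Any, Dict


def read_percentiles(agg_result: Dict[str, Any]) -> Dict[float, float]:
    """Map percentile -> value for one percentiles_bucket result (hypothetical helper)."""
    values = agg_result["values"]
    if isinstance(values, dict):
        # keyed=true (the default): {"25.0": 375.0, ...}
        # Assumption: formatted copies use keys ending in "_as_string"; skip them.
        return {
            float(key): value
            for key, value in values.items()
            if not key.endswith("_as_string")
        }
    # keyed=false: assumed to be a list of {"key": ..., "value": ...} objects.
    return {float(item["key"]): item["value"] for item in values}


# The "aggregations" part of the sample response, reduced to the pipeline result.
response_aggs = {
    "percentiles_monthly_sales": {
        "values": {"25.0": 375.0, "50.0": 375.0, "75.0": 550.0}
    }
}
percentiles = read_percentiles(response_aggs["percentiles_monthly_sales"])
```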

Percentiles_bucket implementation

The percentiles_bucket aggregation returns the nearest input data point that is not greater than the requested percentile; it does not interpolate between data points.

The percentiles are calculated exactly and are not approximations (unlike the Percentiles Metric aggregation). This means the implementation maintains an in-memory, sorted list of your data to compute the percentiles, before discarding the data. You may run into memory pressure issues if you attempt to calculate percentiles over many millions of data points in a single percentiles_bucket.
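
To make the "sorted list, no interpolation" idea concrete, here is a minimal Python sketch. It is not the Elasticsearch source code. The index rule (scale the percentile onto the positions `0..n-1` and round half up) and the `NaN` result for empty input are assumptions, chosen because the rule reproduces the sample response shown earlier on this page. The function name `exact_percentiles` is hypothetical.

```python
import math
from typing import List, Sequence


def exact_percentiles(bucket_values: Sequence[float], percents: Sequence[float]) -> List[float]:
    """Pick percentiles from a sorted in-memory copy, without interpolation."""
    # The full input is copied and sorted, which is why memory use grows
    # with the number of buckets.
    data = sorted(bucket_values)
    if not data:
        # Assumption: no data yields NaN for every requested percentile.
        return [math.nan for _ in percents]
    results = []
    for p in percents:
        # Assumed index rule: map p onto [0, n-1] and round half up.
        # The result is always an existing data point, never a blend of two.
        index = int(math.floor((p / 100.0) * (len(data) - 1) + 0.5))
        results.append(data[index])
    return results


# Monthly sums from the sample response: January, February, March.
monthly_sales = [550.0, 60.0, 375.0]
print(exact_percentiles(monthly_sales, [25.0, 50.0, 75.0]))  # [375.0, 375.0, 550.0]
```

With only three buckets, the 25th and 50th percentiles land on the same data point (375.0), which is the kind of result that selecting an existing value rather than interpolating produces.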
