Analyze results with aggregations
Elasticsearch aggregations enable you to get meta-information about your search results and answer questions like, "How many account holders are in Texas?" or "What’s the average balance of accounts in Tennessee?" You can search documents, filter hits, and use aggregations to analyze the results all in one request.
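As a rough illustration of doing the search, the filter, and the aggregation in one request, the sketch below builds a request body that narrows the hits to one state and averages the balance of the matching accounts. The field names state.keyword and balance are the ones used by the requests in this section; the helper name average_balance_request and the choice of a bool filter with a term query are assumptions made for illustration, not the only way to express the question.

```python
import json


def average_balance_request(state_code):
    """Build a search body that filters by state and averages the balance.

    Hypothetical helper: the name and structure are illustrative only.
    """
    return {
        # size 0 suppresses the hits so only the aggregation comes back
        "size": 0,
        # filter context: match accounts in one state without scoring them
        "query": {"bool": {"filter": [{"term": {"state.keyword": state_code}}]}},
        # metric aggregation computed over the filtered accounts
        "aggs": {"average_balance": {"avg": {"field": "balance"}}},
    }


body = average_balance_request("TN")
print("GET /bank/_search")
print(json.dumps(body, indent=2))
```

The printed body can be pasted after a GET /bank/_search line, assuming the bank index is mapped the way the other examples here expect.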
For example, the following request uses a terms aggregation to group all of the accounts in the bank index by state, and returns the ten states with the most accounts in descending order:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
The buckets in the response are the values of the state field. The doc_count shows the number of accounts in each state. For example, you can see that there are 27 accounts in ID (Idaho). Because the request set size=0, the response contains only the aggregation results.
{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1000,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 20,
      "sum_other_doc_count": 770,
      "buckets": [
        { "key": "ID", "doc_count": 27 },
        { "key": "TX", "doc_count": 27 },
        { "key": "AL", "doc_count": 25 },
        { "key": "MD", "doc_count": 25 },
        { "key": "TN", "doc_count": 23 },
        { "key": "MA", "doc_count": 21 },
        { "key": "NC", "doc_count": 21 },
        { "key": "ND", "doc_count": 21 },
        { "key": "ME", "doc_count": 20 },
        { "key": "MO", "doc_count": 20 }
      ]
    }
  }
}
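To make the shape of the response concrete, here is a minimal sketch that reads the buckets from a Python dict holding the relevant parts of the response above. It also checks how the numbers fit together: the ten returned buckets plus sum_other_doc_count add up to the total hit count, which is what you would expect if every account carries exactly one state value (an assumption about the data, not something the response states).

```python
# Relevant parts of the response shown above, as a Python dict.
response = {
    "hits": {"total": {"value": 1000, "relation": "eq"}},
    "aggregations": {
        "group_by_state": {
            "doc_count_error_upper_bound": 20,
            "sum_other_doc_count": 770,
            "buckets": [
                {"key": "ID", "doc_count": 27},
                {"key": "TX", "doc_count": 27},
                {"key": "AL", "doc_count": 25},
                {"key": "MD", "doc_count": 25},
                {"key": "TN", "doc_count": 23},
                {"key": "MA", "doc_count": 21},
                {"key": "NC", "doc_count": 21},
                {"key": "ND", "doc_count": 21},
                {"key": "ME", "doc_count": 20},
                {"key": "MO", "doc_count": 20},
            ],
        }
    },
}

agg = response["aggregations"]["group_by_state"]

# Map each bucket key (state) to its document count.
counts = {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}

# Accounts covered by the returned buckets versus everything else.
in_buckets = sum(counts.values())
elsewhere = agg["sum_other_doc_count"]

for state, count in counts.items():
    print(f"{state}: {count}")
print(f"in returned buckets: {in_buckets}, in other states: {elsewhere}")
```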
You can combine aggregations to build more complex summaries of your data. For example, the following request nests an avg aggregation within the previous group_by_state aggregation to calculate the average account balances for each state.
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
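To show what nesting a metric inside a bucketing aggregation computes, the following sketch reproduces the idea locally in plain Python. The sample accounts are hypothetical and are not taken from the bank index, and the function name is made up for illustration. Each bucket it produces carries the key, the doc_count, and the nested result under average_balance with a value entry, which should mirror how the nested avg result appears inside each bucket of a real response.

```python
from collections import defaultdict

# Hypothetical sample accounts; NOT data from the bank index.
accounts = [
    {"state": "TX", "balance": 1000},
    {"state": "TX", "balance": 3000},
    {"state": "ID", "balance": 500},
    {"state": "TN", "balance": 2000},
    {"state": "TN", "balance": 4000},
    {"state": "TN", "balance": 6000},
]


def group_by_state_with_average(docs):
    """Local stand-in for a terms aggregation with a nested avg."""
    balances = defaultdict(list)
    for doc in docs:
        balances[doc["state"]].append(doc["balance"])

    buckets = [
        {
            "key": state,
            "doc_count": len(values),
            "average_balance": {"value": sum(values) / len(values)},
        }
        for state, values in balances.items()
    ]
    # Order by document count, largest first; breaking ties by key is a
    # choice made for this sketch.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


buckets = group_by_state_with_average(accounts)
for bucket in buckets:
    print(bucket["key"], bucket["doc_count"], bucket["average_balance"]["value"])
```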
Instead of sorting the results by count, you could sort using the result of the nested aggregation by specifying the order within the terms aggregation:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
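If you build requests like this programmatically, the ordering is a natural thing to parameterize. The sketch below is one possible way to do that; the function name group_by_state_request and its parameters are hypothetical, and with its defaults it produces the same body as the request above.

```python
import json


def group_by_state_request(order_by="average_balance", direction="desc"):
    """Build the group_by_state request with a configurable sort order.

    Hypothetical helper for illustration. order_by names the nested
    aggregation to sort on; passing "_count" would sort by document count.
    """
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        "size": 0,
        "aggs": {
            "group_by_state": {
                "terms": {
                    "field": "state.keyword",
                    "order": {order_by: direction},
                },
                "aggs": {"average_balance": {"avg": {"field": "balance"}}},
            }
        },
    }


body = group_by_state_request()
print("GET /bank/_search")
print(json.dumps(body, indent=2))
```

Switching to the lowest averages first is then a matter of calling group_by_state_request(direction="asc").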
In addition to basic bucketing and metrics aggregations like these, Elasticsearch provides specialized aggregations for operating on multiple fields and analyzing particular types of data such as dates, IP addresses, and geo data. You can also feed the results of individual aggregations into pipeline aggregations for further analysis.
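As one hedged example of feeding aggregation results into a pipeline aggregation, the sketch below builds a request body that adds a max_bucket aggregation next to group_by_state and points it at the per-state average through buckets_path. The aggregation name highest_average_balance and the helper name are made up for illustration, and the choice of max_bucket is just one of the available pipeline aggregations.

```python
import json


def highest_average_request():
    """Build a body that finds the state bucket with the highest average.

    Hypothetical helper; the aggregation names are illustrative.
    """
    return {
        "size": 0,
        "aggs": {
            # Bucket accounts by state and average the balance per bucket.
            "group_by_state": {
                "terms": {"field": "state.keyword"},
                "aggs": {"average_balance": {"avg": {"field": "balance"}}},
            },
            # Sibling pipeline aggregation: reads the per-bucket averages
            # produced above via the "parent>metric" path.
            "highest_average_balance": {
                "max_bucket": {"buckets_path": "group_by_state>average_balance"}
            },
        },
    }


body = highest_average_request()
print("GET /bank/_search")
print(json.dumps(body, indent=2))
```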
The core analysis capabilities provided by aggregations enable advanced features such as using machine learning to detect anomalies.