Geo Centroid Aggregationedit

A metric aggregation that computes the weighted centroid from all coordinate values for a Geo-point field.

Example:

PUT /museums
{
    "mappings": {
        "properties": {
            "location": {
                "type": "geo_point"
            }
        }
    }
}

POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "city": "Amsterdam", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "city": "Amsterdam", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "city": "Amsterdam", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "city": "Antwerp", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "city": "Paris", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "city": "Paris", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
    "aggs" : {
        "centroid" : {
            "geo_centroid" : {
                "field" : "location" 
            }
        }
    }
}

The geo_centroid aggregation specifies the field to use for computing the centroid. (NOTE: field must be a Geo-point type)

The above aggregation demonstrates how one would compute the centroid of the location field for all documents with a crime type of burglary

The response for the above aggregation:

{
    ...
    "aggregations": {
        "centroid": {
            "location": {
                "lat": 51.00982965203002,
                "lon": 3.9662131341174245
            },
            "count": 6
        }
    }
}

The geo_centroid aggregation is more interesting when combined as a sub-aggregation to other bucket aggregations.

Example:

POST /museums/_search?size=0
{
    "aggs" : {
        "cities" : {
            "terms" : { "field" : "city.keyword" },
            "aggs" : {
                "centroid" : {
                    "geo_centroid" : { "field" : "location" }
                }
            }
        }
    }
}

The above example uses geo_centroid as a sub-aggregation to a terms bucket aggregation for finding the central location for museums in each city.

The response for the above aggregation:

{
    ...
    "aggregations": {
        "cities": {
            "sum_other_doc_count": 0,
            "doc_count_error_upper_bound": 0,
            "buckets": [
               {
                   "key": "Amsterdam",
                   "doc_count": 3,
                   "centroid": {
                      "location": {
                         "lat": 52.371655656024814,
                         "lon": 4.909563297405839
                      },
                      "count": 3
                   }
               },
               {
                   "key": "Paris",
                   "doc_count": 2,
                   "centroid": {
                      "location": {
                         "lat": 48.86055548675358,
                         "lon": 2.3316944623366
                      },
                      "count": 2
                   }
                },
                {
                    "key": "Antwerp",
                    "doc_count": 1,
                    "centroid": {
                       "location": {
                          "lat": 51.22289997059852,
                          "lon": 4.40519998781383
                       },
                       "count": 1
                    }
                 }
            ]
        }
    }
}

Using geo_centroid as a sub-aggregation of geohash_grid

The geohash_grid aggregation places documents, not individual geo-points, into buckets. If a document’s geo_point field contains multiple values, the document could be assigned to multiple buckets, even if one or more of its geo-points are outside the bucket boundaries.

If a geocentroid sub-aggregation is also used, each centroid is calculated using all geo-points in a bucket, including those outside the bucket boundaries. This can result in centroids outside of bucket boundaries.