WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Geohash Grid Aggregation
The number of results returned by a query may be far too many to display each geo-point individually on a map. The geohash_grid aggregation buckets nearby geo-points together by calculating the geohash for each point, at the level of precision that you define.
The result is a grid of cells—one cell per geohash—that can be displayed on a map. By changing the precision of the geohash, you can summarize information across the whole world, by country, or by city block.
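The bucketing idea can be sketched outside Elasticsearch. The following is a minimal illustration of the standard geohash encoding, not Elasticsearch's own implementation; the function names `geohash_encode` and `bucket_points` and the sample coordinates are assumptions made for this example.

```python
from collections import Counter

# Standard geohash base-32 alphabet (no "a", "i", "l", "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bits <<= 1
        if value >= mid:
            bits |= 1
            rng[0] = mid  # keep the upper half of the range
        else:
            rng[1] = mid  # keep the lower half of the range
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def bucket_points(points, precision):
    """Count points per geohash cell; only cells that contain points appear."""
    return Counter(geohash_encode(lat, lon, precision) for lat, lon in points)


# Hypothetical sample points (lat, lon) in the New York area.
points = [(40.758, -73.9855), (40.7585, -73.985), (40.6892, -74.0445)]
buckets = bucket_points(points, precision=5)
```

Lowering `precision` merges neighboring cells into larger ones, which is the same effect as zooming out on the map.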
The aggregation is sparse—it returns only cells that contain documents. If your geohashes are too precise and too many buckets are generated, it will return, by default, the 10,000 most populous cells—those containing the most documents. However, it still needs to generate all the buckets in order to figure out which are the most populous 10,000. You need to control the number of buckets generated by doing the following:
- Limit the result with a geo_bounding_box query.
- Choose an appropriate precision for the size of your bounding box.
GET /attractions/restaurant/_search
{
  "size" : 0,
  "query": {
    "constant_score": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": {
              "lat":  40.8,
              "lon": -74.1
            },
            "bottom_right": {
              "lat":  40.4,
              "lon": -73.7
            }
          }
        }
      }
    }
  },
  "aggs": {
    "new_york": {
      "geohash_grid": {
        "field":     "location",
        "precision": 5
      }
    }
  }
}
The bounding box limits the scope of the search to the greater New York area.
Geohashes with precision 5 measure about 25 km² each, so 10,000 cells at this precision would cover 250,000 km². The bounding box that we specified measures approximately 44 km x 33 km, or about 1,452 km², so we are well within safe limits; we definitely won't create too many buckets in memory.
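The arithmetic above can be reproduced with a short sketch. It assumes a flat-earth approximation of about 111.32 km per degree of latitude, and the helper names are hypothetical. The 25 km² cell area is the figure quoted in the text. Geohash cells cover less ground away from the equator, so treat the result as an order-of-magnitude estimate only.

```python
import math

# Approximate length of one degree of latitude (an assumption for this sketch).
KM_PER_DEGREE_LAT = 111.32


def bounding_box_size_km(top_left, bottom_right):
    """Return the approximate (height_km, width_km) of a lat/lon bounding box."""
    height_km = (top_left["lat"] - bottom_right["lat"]) * KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    mid_lat = math.radians((top_left["lat"] + bottom_right["lat"]) / 2)
    width_km = (
        (bottom_right["lon"] - top_left["lon"])
        * KM_PER_DEGREE_LAT
        * math.cos(mid_lat)
    )
    return height_km, width_km


def estimate_bucket_count(top_left, bottom_right, cell_area_km2):
    """Rough upper estimate of the grid cells needed to cover the box."""
    height_km, width_km = bounding_box_size_km(top_left, bottom_right)
    return math.ceil(height_km * width_km / cell_area_km2)


# Bounding box from the request above; 25 km² per precision-5 cell as stated.
top_left = {"lat": 40.8, "lon": -74.1}
bottom_right = {"lat": 40.4, "lon": -73.7}

height_km, width_km = bounding_box_size_km(top_left, bottom_right)
estimated = estimate_bucket_count(top_left, bottom_right, cell_area_km2=25)
print(f"box is about {height_km:.0f} km x {width_km:.0f} km, "
      f"roughly {estimated} cells at precision 5")
```

A few dozen cells is far below the default limit of 10,000. Rerunning the estimate with a smaller cell area shows how quickly a higher precision inflates the bucket count for the same box.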
The response from the preceding request looks like this:
...
  "aggregations": {
    "new_york": {
      "buckets": [
        {
          "key": "dr5rs",
          "doc_count": 2
        },
        {
          "key": "dr5re",
          "doc_count": 1
        }
      ]
    }
  }
...
Again, we didn’t specify any sub-aggregations, so all we got back was the document count. We could have asked for popular restaurant types, average price, or other details.
To plot these buckets on a map, you need a library that understands how to convert a geohash into the equivalent bounding box or central point. Libraries exist in JavaScript and other languages that will perform this conversion for you, but you can also use information from Geo Bounds Aggregation to perform a similar job.
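As an illustration of what such a library does, here is a minimal hand-written sketch of the standard geohash decoding. It turns each bucket key from the response above into a bounding box and a central point. The function names and the output layout are assumptions for this example, not the API of any particular library.

```python
# Standard geohash base-32 alphabet (no "a", "i", "l", "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bounds(geohash):
    """Return the bounding box of a geohash cell as top_left/bottom_right."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # point lies in the upper half
            else:
                rng[1] = mid  # point lies in the lower half
            use_lon = not use_lon
    return {
        "top_left": {"lat": lat_range[1], "lon": lon_range[0]},
        "bottom_right": {"lat": lat_range[0], "lon": lon_range[1]},
    }


def geohash_center(geohash):
    """Return the central point of a geohash cell."""
    box = geohash_bounds(geohash)
    return {
        "lat": (box["top_left"]["lat"] + box["bottom_right"]["lat"]) / 2,
        "lon": (box["top_left"]["lon"] + box["bottom_right"]["lon"]) / 2,
    }


# Buckets copied from the response shown earlier.
buckets = [
    {"key": "dr5rs", "doc_count": 2},
    {"key": "dr5re", "doc_count": 1},
]

# One plottable record per bucket: cell outline, marker position, and count.
cells = [
    {
        "bounds": geohash_bounds(bucket["key"]),
        "center": geohash_center(bucket["key"]),
        "doc_count": bucket["doc_count"],
    }
    for bucket in buckets
]
```

Each entry in `cells` carries enough information to draw either a rectangle sized to the cell or a single marker at its center, weighted by `doc_count`.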