WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Nested Objects
Because creating, deleting, and updating a single document in
Elasticsearch is atomic, it makes sense to store closely related entities
within the same document. For instance, we could store an order and all of
its order lines in one document, or we could store a blog post and all of its
comments together, by passing an array of comments:
PUT /my_index/blogpost/1
{
  "title": "Nest eggs",
  "body":  "Making your money work...",
  "tags":  [ "cash", "shares" ],
  "comments": [
    {
      "name":    "John Smith",
      "comment": "Great article",
      "age":     28,
      "stars":   4,
      "date":    "2014-09-01"
    },
    {
      "name":    "Alice White",
      "comment": "More like this please",
      "age":     31,
      "stars":   5,
      "date":    "2014-10-22"
    }
  ]
}
If we rely on dynamic mapping, the comments field will be created automatically as an object field.
Because all of the content is in the same document, there is no need to join blog posts and comments at query time, so searches perform well.
The problem is that the preceding document would match a query like this:
GET /_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "comments.name": "Alice" }},
        { "match": { "comments.age":  28      }}
      ]
    }
  }
}
The reason for this cross-object matching, as discussed in Arrays of Inner Objects, is that our beautifully structured JSON document is flattened into a simple key-value format in the index that looks like this:
{
  "title":            [ eggs, nest ],
  "body":             [ making, money, work, your ],
  "tags":             [ cash, shares ],
  "comments.name":    [ alice, john, smith, white ],
  "comments.comment": [ article, great, like, more, please, this ],
  "comments.age":     [ 28, 31 ],
  "comments.stars":   [ 4, 5 ],
  "comments.date":    [ 2014-09-01, 2014-10-22 ]
}
The correlation between Alice and 31, or between John and 2014-09-01, has
been irretrievably lost. While fields of type object (see Multilevel Objects)
are useful for storing a single object, they are useless, from a search point
of view, for storing an array of objects.
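The loss of correlation described above can be reproduced with a toy model. The following is a minimal sketch in plain Python, not Elasticsearch's actual indexing code: the helpers `terms`, `flatten`, and `matches_all` are hypothetical names introduced only for illustration, analysis is reduced to lowercasing and splitting on whitespace, and the document is trimmed to the fields that matter here.

```python
def terms(value):
    """Crude stand-in for analysis: lowercase, split on whitespace."""
    return set(str(value).lower().split())


def flatten(doc):
    """Flatten arrays of inner objects into 'parent.child' -> set of terms."""
    flat = {}
    for key, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Every inner object contributes to the same shared field.
                for sub_key, sub_value in item.items():
                    flat.setdefault(f"{key}.{sub_key}", set()).update(terms(sub_value))
            else:
                flat.setdefault(key, set()).update(terms(item))
    return flat


def matches_all(flat, clauses):
    """Toy version of bool/must: every (field, term) clause must be present."""
    return all(str(term).lower() in flat.get(field, set()) for field, term in clauses)


doc = {
    "title": "Nest eggs",
    "comments": [
        {"name": "John Smith", "age": 28},
        {"name": "Alice White", "age": 31},
    ],
}

flat = flatten(doc)
query = [("comments.name", "Alice"), ("comments.age", 28)]

print(sorted(flat["comments.name"]))  # ['alice', 'john', 'smith', 'white']
print(matches_all(flat, query))       # True, although Alice is 31
```

The query matches because `alice` and `28` are both present somewhere in the flattened fields, even though they come from different comments.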
This is the problem that nested objects are designed to solve. By mapping
the comments field as type nested instead of type object, each nested
object is indexed as a hidden separate document, something like this:
{
  "comments.name":    [ john, smith ],
  "comments.comment": [ article, great ],
  "comments.age":     [ 28 ],
  "comments.stars":   [ 4 ],
  "comments.date":    [ 2014-09-01 ]
}

{
  "comments.name":    [ alice, white ],
  "comments.comment": [ like, more, please, this ],
  "comments.age":     [ 31 ],
  "comments.stars":   [ 5 ],
  "comments.date":    [ 2014-10-22 ]
}

{
  "title":            [ eggs, nest ],
  "body":             [ making, money, work, your ],
  "tags":             [ cash, shares ]
}
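A mapping that declares the comments field as nested might look like the sketch below. It builds the request body as a Python dictionary and prints it as JSON; the body would be sent with a request such as PUT /my_index when the index is created. The layout follows the Elasticsearch 2.x convention of keying mappings by document type (blogpost here). The sub-field types are assumptions chosen for illustration, not taken from the text above.

```python
import json

# Index-creation body in the Elasticsearch 2.x style.
# Only "type": "nested" on the comments field is essential; the sub-field
# types below are illustrative assumptions.
mapping_body = {
    "mappings": {
        "blogpost": {
            "properties": {
                "comments": {
                    "type": "nested",
                    "properties": {
                        "name": {"type": "string"},
                        "comment": {"type": "string"},
                        "age": {"type": "short"},
                        "stars": {"type": "short"},
                        "date": {"type": "date"},
                    },
                }
            }
        }
    }
}

# Print the JSON that would form the request body.
print(json.dumps(mapping_body, indent=2))
```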
By indexing each nested object separately, the fields within the object maintain their relationships. We can run queries that will match only if the match occurs within the same nested object.
Not only that, because of the way that nested objects are indexed, joining the nested documents to the root document at query time is fast—almost as fast as if they were a single document.
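The effect of indexing each nested object separately can also be shown with a toy model. This is a rough sketch in plain Python under simplifying assumptions, not a description of how Lucene stores or joins these documents: `terms`, `index_nested`, and `nested_match` are hypothetical helpers, and analysis is reduced to lowercasing and splitting on whitespace.

```python
def terms(value):
    """Crude stand-in for analysis: lowercase, split on whitespace."""
    return set(str(value).lower().split())


def index_nested(doc, path):
    """Split a document into hidden per-object documents plus a root document."""
    hidden = [
        {f"{path}.{key}": terms(value) for key, value in obj.items()}
        for obj in doc.get(path, [])
    ]
    root = {key: terms(value) for key, value in doc.items() if key != path}
    return hidden, root


def nested_match(hidden_docs, clauses):
    """True only if one single hidden document satisfies every clause."""
    return any(
        all(str(term).lower() in h.get(field, set()) for field, term in clauses)
        for h in hidden_docs
    )


doc = {
    "title": "Nest eggs",
    "comments": [
        {"name": "John Smith", "age": 28},
        {"name": "Alice White", "age": 31},
    ],
}

hidden, root = index_nested(doc, "comments")

print(nested_match(hidden, [("comments.name", "Alice"), ("comments.age", 28)]))  # False
print(nested_match(hidden, [("comments.name", "Alice"), ("comments.age", 31)]))  # True
```

In Elasticsearch itself, this per-object matching is requested with the nested query, which takes a path (here comments) and an inner query to run against each nested object.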
These extra nested documents are hidden; we can’t access them directly. To update, add, or remove a nested object, we have to reindex the whole document. Note also that the result returned by a search request is not the nested object alone; it is the whole document.
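Because a nested object cannot be addressed on its own, adding a comment amounts to a read-modify-write of the whole document on the client side. The sketch below shows only the "modify" step in plain Python, with no Elasticsearch client involved; `with_comment_added` is a hypothetical helper, and the new comment's values are made up for illustration. The resulting body would then be reindexed in full, for example with the same PUT /my_index/blogpost/1 request used to create the document.

```python
import copy


def with_comment_added(doc, comment):
    """Return a new, complete document body with one more comment appended."""
    updated = copy.deepcopy(doc)  # leave the caller's copy untouched
    updated.setdefault("comments", []).append(comment)
    return updated


# The document as previously retrieved (trimmed to a few fields).
doc = {
    "title": "Nest eggs",
    "comments": [
        {"name": "John Smith", "comment": "Great article"},
        {"name": "Alice White", "comment": "More like this please"},
    ],
}

# Hypothetical new comment; the whole body below is what gets reindexed.
new_body = with_comment_added(doc, {"name": "Mary Brown", "comment": "Thanks"})

print(len(doc["comments"]))       # 2
print(len(new_body["comments"]))  # 3
```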