WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Finding Multiple Exact Valuesedit
The term
query is useful for finding a single value, but often you’ll want
to search for multiple values.
What if you want to
find documents that have a price of $20 or $30?
Rather than using multiple term
queries, you can instead use a single terms
query (note the s at the end). The terms
query is simply the plural
version of the singular term
query cousin.
It looks nearly identical to a vanilla term
too. Instead of
specifying a single price, we are now specifying an array of values:
{ "terms" : { "price" : [20, 30] } }
And like the term
query, we will place it inside the filter
clause of a
constant score query to use it:
GET /my_store/products/_search { "query" : { "constant_score" : { "filter" : { "terms" : { "price" : [20, 30] } } } } }
The query will return the second, third, and fourth documents:
"hits" : [ { "_id" : "2", "_score" : 1.0, "_source" : { "price" : 20, "productID" : "KDKE-B-9947-#kL5" } }, { "_id" : "3", "_score" : 1.0, "_source" : { "price" : 30, "productID" : "JODL-X-1937-#pV7" } }, { "_id": "4", "_score": 1.0, "_source": { "price": 30, "productID": "QQPX-R-3956-#aD8" } } ]
Contains, but Does Not Equaledit
It is important to understand that term
and terms
are contains operations,
not equals.
What does that mean?
If you have a term query for { "term" : { "tags" : "search" } }
, it will match
both of the following documents:
Recall how the term
query works: it checks the inverted index for all
documents that contain a term, and then constructs a bitset. In our simple
example, we have the following inverted index:
Token |
DocIDs |
|
|
|
|
When a term
query is executed for the token search
, it goes straight to the
corresponding entry in the inverted index and extracts the associated doc IDs.
As you can see, both document 1 and document 2 contain the token in the inverted index.
Therefore, they are both returned as a result.
The nature of an inverted index also means that entire field equality is rather difficult to calculate. How would you determine whether a particular document contains only your request term? You would have to find the term in the inverted index, extract the document IDs, and then scan every row in the inverted index, looking for those IDs to see whether a doc has any other terms.
As you might imagine, that would be tremendously inefficient and expensive.
For that reason, term
and terms
are must contain operations, not
must equal exactly.
Equals Exactlyedit
If you do want that behavior—entire field equality—the best way to accomplish it involves indexing a secondary field. In this field, you index the number of values that your field contains. Using our two previous documents, we now include a field that maintains the number of tags:
{ "tags" : ["search"], "tag_count" : 1 } { "tags" : ["search", "open_source"], "tag_count" : 2 }
Once you have the count information indexed, you can construct a constant_score
that enforces the appropriate number of terms:
GET /my_index/my_type/_search { "query": { "constant_score" : { "filter" : { "bool" : { "must" : [ { "term" : { "tags" : "search" } }, { "term" : { "tag_count" : 1 } } ] } } } } }
This query will now match only the document that has a single tag that is
search
, rather than any document that contains search
.
- Elasticsearch - The Definitive Guide:
- Foreword
- Preface
- Getting Started
- You Know, for Search…
- Installing and Running Elasticsearch
- Talking to Elasticsearch
- Document Oriented
- Finding Your Feet
- Indexing Employee Documents
- Retrieving a Document
- Search Lite
- Search with Query DSL
- More-Complicated Searches
- Full-Text Search
- Phrase Search
- Highlighting Our Searches
- Analytics
- Tutorial Conclusion
- Distributed Nature
- Next Steps
- Life Inside a Cluster
- Data In, Data Out
- What Is a Document?
- Document Metadata
- Indexing a Document
- Retrieving a Document
- Checking Whether a Document Exists
- Updating a Whole Document
- Creating a New Document
- Deleting a Document
- Dealing with Conflicts
- Optimistic Concurrency Control
- Partial Updates to Documents
- Retrieving Multiple Documents
- Cheaper in Bulk
- Distributed Document Store
- Searching—The Basic Tools
- Mapping and Analysis
- Full-Body Search
- Sorting and Relevance
- Distributed Search Execution
- Index Management
- Inside a Shard
- You Know, for Search…
- Search in Depth
- Structured Search
- Full-Text Search
- Multifield Search
- Proximity Matching
- Partial Matching
- Controlling Relevance
- Theory Behind Relevance Scoring
- Lucene’s Practical Scoring Function
- Query-Time Boosting
- Manipulating Relevance with Query Structure
- Not Quite Not
- Ignoring TF/IDF
- function_score Query
- Boosting by Popularity
- Boosting Filtered Subsets
- Random Scoring
- The Closer, The Better
- Understanding the price Clause
- Scoring with Scripts
- Pluggable Similarity Algorithms
- Changing Similarities
- Relevance Tuning Is the Last 10%
- Dealing with Human Language
- Aggregations
- Geolocation
- Modeling Your Data
- Administration, Monitoring, and Deployment