WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Combining queries together
Real-world search requests are rarely simple: they search multiple fields with various input text and filter on an array of criteria. To build sophisticated searches, you need a way to combine multiple queries into a single search request.
To do that, you can use the bool query. This query combines multiple queries together in user-defined boolean combinations. It accepts the following parameters:
- must: Clauses that must match for the document to be included.
- must_not: Clauses that must not match for the document to be included.
- should: If these clauses match, they increase the _score; otherwise, they have no effect. They are simply used to refine the relevance score for each document.
- filter: Clauses that must match, but are run in non-scoring, filtering mode. These clauses do not contribute to the score; they simply include or exclude documents based on their criteria.
Because this is the first query we’ve seen that contains other queries, we need to talk about how scores are combined. Each sub-query clause calculates its own relevance score for the document. Once these scores are calculated, the bool query merges them and returns a single score representing the total score of the boolean operation.
The following query finds documents whose title field matches the query string "how to make millions" and that are not marked as spam. If any documents are starred or are from 2014 onward, they will rank higher than they would have otherwise. Documents that match both conditions will rank even higher:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }},
                { "range": { "date": { "gte": "2014-01-01" }}}
            ]
        }
    }
If there are no must clauses, at least one should clause has to match. However, if there is at least one must clause, no should clauses are required to match.
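To illustrate the rule above, here is a minimal sketch that reuses the fields from the previous example. The first request has no must clauses, so a document has to match at least one of the two should clauses to be returned:

```json
{
    "bool": {
        "should": [
            { "match": { "tag": "starred" }},
            { "range": { "date": { "gte": "2014-01-01" }}}
        ]
    }
}
```

If you want the same requirement while a must clause is present, the bool query's minimum_should_match parameter can state it explicitly. This parameter is not covered in this section, so treat the request below as an assumption to check against the reference documentation for your version:

```json
{
    "bool": {
        "must": { "match": { "title": "how to make millions" }},
        "should": [
            { "match": { "tag": "starred" }},
            { "range": { "date": { "gte": "2014-01-01" }}}
        ],
        "minimum_should_match": 1
    }
}
```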
Adding a filtering query
If we don’t want the date of the document to affect scoring at all, we can rearrange the previous example to use a filter clause:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "range": { "date": { "gte": "2014-01-01" }}
            }
        }
    }
By moving the range query into the filter clause, we have converted it into a non-scoring query. It no longer contributes a score to the document’s relevance ranking. And because it is now a non-scoring query, it can use the variety of optimizations available to filters, which should improve performance.
Any query can be used in this manner. Simply move a query into the filter clause of a bool query and it automatically converts to a non-scoring filter.
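As a small sketch of that point, the full-text match on the tag field from the earlier examples can be moved into the filter clause. Placed there, it only decides whether a document is included; the choice of the starred tag as the filter is purely illustrative:

```json
{
    "bool": {
        "must":   { "match": { "title": "how to make millions" }},
        "filter": { "match": { "tag": "starred" }}
    }
}
```

Compared with the should version shown earlier, starred documents are no longer ranked higher; instead, documents that are not starred are excluded altogether.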
If you need to filter on many different criteria, the bool query itself can be used as a non-scoring query. Simply place it inside the filter clause and continue building your boolean logic:

    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "bool": {
                    "must": [
                        { "range": { "date":  { "gte": "2014-01-01" }}},
                        { "range": { "price": { "lte": 29.99 }}}
                    ],
                    "must_not": [
                        { "term": { "category": "ebooks" }}
                    ]
                }
            }
        }
    }
By mixing and matching where Boolean queries are placed, we can flexibly encode both scoring and filtering logic in our search request.
constant_score Query
Although not used nearly as often as the bool query, the constant_score query is still useful to have in your toolbox. The query applies a static, constant score to all matching documents. It is predominantly used when you want to execute a filter and nothing else (that is, no scoring queries).
You can use this instead of a bool query that has only filter clauses. Performance will be identical, but it may make the query simpler and clearer.
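A minimal sketch of the two forms, assuming the category field and the ebooks value used in the earlier filter example. First, a constant_score query wrapping a single term filter:

```json
{
    "constant_score": {
        "filter": {
            "term": { "category": "ebooks" }
        }
    }
}
```

And the bool query with only a filter clause that it can stand in for:

```json
{
    "bool": {
        "filter": {
            "term": { "category": "ebooks" }
        }
    }
}
```

Both requests select the same documents; the constant_score form simply makes it explicit that no scoring is intended.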