WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Proximity for Relevanceedit
Although proximity queries are useful, the fact that they require all terms to be
present can make them overly strict. It’s the same issue that we discussed in
Controlling Precision in Full-Text Search: if six out of seven terms match,
a document is probably relevant enough to be worth showing to the user, but
the match_phrase
query would exclude it.
Instead of using proximity matching as an absolute requirement, we can use it as a signal—as one of potentially many queries, each of which contributes to the overall score for each document (see Most Fields).
The fact that we want to add together the scores from multiple queries implies
that we should combine them by using the bool
query.
We can use a simple match
query as a must
clause. This is the query that
will determine which documents are included in our result set. We can trim
the long tail with the minimum_should_match
parameter. Then we can add other,
more specific queries as should
clauses. Every one that matches will
increase the relevance of the matching docs.
GET /my_index/my_type/_search { "query": { "bool": { "must": { "match": { "title": { "query": "quick brown fox", "minimum_should_match": "30%" } } }, "should": { "match_phrase": { "title": { "query": "quick brown fox", "slop": 50 } } } } } }
The |
|
The |
We could, of course, include other queries in the should
clause, where each
query targets a specific aspect of relevance.
- Elasticsearch - The Definitive Guide:
- Foreword
- Preface
- Getting Started
- You Know, for Search…
- Installing and Running Elasticsearch
- Talking to Elasticsearch
- Document Oriented
- Finding Your Feet
- Indexing Employee Documents
- Retrieving a Document
- Search Lite
- Search with Query DSL
- More-Complicated Searches
- Full-Text Search
- Phrase Search
- Highlighting Our Searches
- Analytics
- Tutorial Conclusion
- Distributed Nature
- Next Steps
- Life Inside a Cluster
- Data In, Data Out
- What Is a Document?
- Document Metadata
- Indexing a Document
- Retrieving a Document
- Checking Whether a Document Exists
- Updating a Whole Document
- Creating a New Document
- Deleting a Document
- Dealing with Conflicts
- Optimistic Concurrency Control
- Partial Updates to Documents
- Retrieving Multiple Documents
- Cheaper in Bulk
- Distributed Document Store
- Searching—The Basic Tools
- Mapping and Analysis
- Full-Body Search
- Sorting and Relevance
- Distributed Search Execution
- Index Management
- Inside a Shard
- You Know, for Search…
- Search in Depth
- Structured Search
- Full-Text Search
- Multifield Search
- Proximity Matching
- Partial Matching
- Controlling Relevance
- Theory Behind Relevance Scoring
- Lucene’s Practical Scoring Function
- Query-Time Boosting
- Manipulating Relevance with Query Structure
- Not Quite Not
- Ignoring TF/IDF
- function_score Query
- Boosting by Popularity
- Boosting Filtered Subsets
- Random Scoring
- The Closer, The Better
- Understanding the price Clause
- Scoring with Scripts
- Pluggable Similarity Algorithms
- Changing Similarities
- Relevance Tuning Is the Last 10%
- Dealing with Human Language
- Aggregations
- Geolocation
- Modeling Your Data
- Administration, Monitoring, and Deployment