WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
The match Query
The match query is the go-to query, the first query that you should reach for whenever you need to query any field. It is a high-level full-text query, meaning that it knows how to deal with both full-text fields and exact-value fields.
That said, the main use case for the match query is full-text search. So let's take a look at how full-text search works with a simple example.
Index Some Data
First, we'll create a new index and index some documents using the bulk API:
DELETE /my_index

PUT /my_index
{ "settings": { "number_of_shards": 1 }}

POST /my_index/my_type/_bulk
{ "index": { "_id": 1 }}
{ "title": "The quick brown fox" }
{ "index": { "_id": 2 }}
{ "title": "The quick brown fox jumps over the lazy dog" }
{ "index": { "_id": 3 }}
{ "title": "The quick brown fox jumps over the quick dog" }
{ "index": { "_id": 4 }}
{ "title": "Brown fox brown dog" }
The DELETE request removes the index in case it already exists.
Later, in Relevance Is Broken!, we explain why we created this index with only one primary shard.
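The bulk request body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The following is a minimal Python sketch of how such a body could be assembled with the standard library only. The helper name `build_bulk_body` is hypothetical, and nothing is sent to a cluster.

```python
import json

# The four example titles from the bulk request, keyed by document ID.
titles = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def build_bulk_body(docs):
    """Return a newline-delimited bulk body (hypothetical helper).

    Each document contributes an action line and a source line.
    """
    lines = []
    for doc_id, title in docs.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps({"title": title}))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(titles)
print(bulk_body)
```

The resulting string could be used as the body of the POST /my_index/my_type/_bulk request shown above.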
A Single-Word Query
Our first example explains what happens when we use the match query to search within a full-text field for a single word:
GET /my_index/my_type/_search
{
    "query": {
        "match": {
            "title": "QUICK!"
        }
    }
}
Elasticsearch executes the preceding match query as follows:
- Check the field type. The title field is a full-text (analyzed) string field, which means that the query string should be analyzed too.
- Analyze the query string. The query string QUICK! is passed through the standard analyzer, which results in the single term quick. Because we have just a single term, the match query can be executed as a single low-level term query.
- Find matching docs. The term query looks up quick in the inverted index and retrieves the list of documents that contain that term: in this case, documents 1, 2, and 3.
- Score each doc. The term query calculates the relevance _score for each matching document by combining the term frequency (how often quick appears in the title field of each document), the inverse document frequency (how often quick appears in the title field across all documents in the index), and the length of each field (shorter fields are considered more relevant). See What Is Relevance?.
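The analysis and lookup steps above can be imitated in a few lines of Python. This is only a sketch: the `analyze` function is a rough stand-in for the standard analyzer (it splits on non-alphanumeric characters and lowercases), and the dictionary-based inverted index is an illustration, not how Lucene stores its data. All function and variable names are hypothetical.

```python
import re
from collections import defaultdict

# The four example titles, keyed by document ID.
titles = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_inverted_index(docs):
    """Map each term to the set of document IDs whose title contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


inverted_index = build_inverted_index(titles)

# Analyze the query string, then look up the single resulting term.
query_terms = analyze("QUICK!")
matching_ids = sorted(inverted_index.get(query_terms[0], set()))

print(query_terms)   # ['quick']
print(matching_ids)  # [1, 2, 3]
```

Because the query string and the indexed text go through the same analysis, QUICK! and quick end up as the same term, which is why the lookup succeeds.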
This process gives us the following (abbreviated) results:
"hits": [ { "_id": "1", "_score": 0.5, "_source": { "title": "The quick brown fox" } }, { "_id": "3", "_score": 0.44194174, "_source": { "title": "The quick brown fox jumps over the quick dog" } }, { "_id": "2", "_score": 0.3125, "_source": { "title": "The quick brown fox jumps over the lazy dog" } } ]