WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Using Language Analyzersedit
The built-in language analyzers are available globally and don’t need to be configured before being used. They can be specified directly in the field mapping:
PUT /my_index { "mappings": { "blog": { "properties": { "title": { "type": "string", "analyzer": "english" } } } } }
Of course, by passing text through the english
analyzer, we lose
information:
We can’t tell if the document mentions one fox
or many foxes
; the word
not
is a stopword and is removed, so we can’t tell whether the document is
happy about foxes or not. By using the english
analyzer, we have increased
recall as we can match more loosely, but we have reduced our ability to rank
documents accurately.
To get the best of both worlds, we can use multifields to
index the title
field twice: once with the english
analyzer and once with
the standard
analyzer:
PUT /my_index { "mappings": { "blog": { "properties": { "title": { "type": "string", "fields": { "english": { "type": "string", "analyzer": "english" } } } } } } }
The main |
|
The |
With this mapping in place, we can index some test documents to demonstrate how to use both fields at query time:
PUT /my_index/blog/1 { "title": "I'm happy for this fox" } PUT /my_index/blog/2 { "title": "I'm not happy about my fox problem" } GET /_search { "query": { "multi_match": { "type": "most_fields", "query": "not happy foxes", "fields": [ "title", "title.english" ] } } }
Use the |
Even though neither of our documents contain the word foxes
, both documents
are returned as results thanks to the word stemming on the title.english
field. The second document is ranked as more relevant, because the word not
matches on the title
field.
- Elasticsearch - The Definitive Guide:
- Foreword
- Preface
- Getting Started
- You Know, for Search…
- Installing and Running Elasticsearch
- Talking to Elasticsearch
- Document Oriented
- Finding Your Feet
- Indexing Employee Documents
- Retrieving a Document
- Search Lite
- Search with Query DSL
- More-Complicated Searches
- Full-Text Search
- Phrase Search
- Highlighting Our Searches
- Analytics
- Tutorial Conclusion
- Distributed Nature
- Next Steps
- Life Inside a Cluster
- Data In, Data Out
- What Is a Document?
- Document Metadata
- Indexing a Document
- Retrieving a Document
- Checking Whether a Document Exists
- Updating a Whole Document
- Creating a New Document
- Deleting a Document
- Dealing with Conflicts
- Optimistic Concurrency Control
- Partial Updates to Documents
- Retrieving Multiple Documents
- Cheaper in Bulk
- Distributed Document Store
- Searching—The Basic Tools
- Mapping and Analysis
- Full-Body Search
- Sorting and Relevance
- Distributed Search Execution
- Index Management
- Inside a Shard
- You Know, for Search…
- Search in Depth
- Structured Search
- Full-Text Search
- Multifield Search
- Proximity Matching
- Partial Matching
- Controlling Relevance
- Theory Behind Relevance Scoring
- Lucene’s Practical Scoring Function
- Query-Time Boosting
- Manipulating Relevance with Query Structure
- Not Quite Not
- Ignoring TF/IDF
- function_score Query
- Boosting by Popularity
- Boosting Filtered Subsets
- Random Scoring
- The Closer, The Better
- Understanding the price Clause
- Scoring with Scripts
- Pluggable Similarity Algorithms
- Changing Similarities
- Relevance Tuning Is the Last 10%
- Dealing with Human Language
- Aggregations
- Geolocation
- Modeling Your Data
- Administration, Monitoring, and Deployment