analyzer | Elasticsearch 7.7 Definitive Guide (Chinese Edition)

Original English version: https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analyzer.html. Copyright of the original documentation belongs to www.elastic.co.
Local English version: ../en/analyzer.html

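Important: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.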

`analyzer`

Only text fields support the analyzer mapping parameter.

The analyzer parameter specifies the analyzer used for text analysis when indexing or searching a text field.

Unless overridden with the search_analyzer mapping parameter, this analyzer is used for both index and search analysis. See Specify an analyzer.
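
As an illustration of the rule above, here is a minimal sketch of a mapping that sets both parameters. The index name `my_text_index` and the field name `body` are hypothetical; `whitespace` and `simple` are built-in analyzers chosen only so that the index-time and search-time analyzers differ.

```console
PUT my_text_index
{
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "analyzer": "whitespace",
        "search_analyzer": "simple"
      }
    }
  }
}
```

If `search_analyzer` were omitted here, `whitespace` would be used for both indexing and searching the `body` field.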

We recommend testing analyzers before using them in production. See Test an analyzer.
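
One way to try an analyzer before relying on it is the analyze API. The request below is a small sketch that runs the built-in `standard` analyzer over a sample string; the sample text is only an example.

```console
POST _analyze
{
  "analyzer": "standard",
  "text": "The Quick Brown Fox"
}
```

The response lists the tokens the analyzer produces, along with their positions and offsets, so you can check that terms are split and normalized the way you expect.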

`search_quote_analyzer`

The search_quote_analyzer setting lets you specify an analyzer for phrases. This is particularly useful when you want to keep stop words in phrase queries while still removing them from other queries.

To disable stop word removal for phrases, a field needs three analyzer settings:

An analyzer setting for indexing all terms including stop words
A search_analyzer setting for non-phrase queries that will remove stop words
A search_quote_analyzer setting for phrase queries that will not remove stop words

PUT my_index
{
   "settings":{
      "analysis":{
         "analyzer":{
            "my_analyzer":{ 
               "type":"custom",
               "tokenizer":"standard",
               "filter":[
                  "lowercase"
               ]
            },
            "my_stop_analyzer":{ 
               "type":"custom",
               "tokenizer":"standard",
               "filter":[
                  "lowercase",
                  "english_stop"
               ]
            }
         },
         "filter":{
            "english_stop":{
               "type":"stop",
               "stopwords":"_english_"
            }
         }
      }
   },
   "mappings":{
      "properties":{
         "title":{
            "type":"text",
            "analyzer":"my_analyzer",
            "search_analyzer":"my_stop_analyzer",
            "search_quote_analyzer":"my_analyzer"
         }
      }
   }
}

PUT my_index/_doc/1
{
   "title":"The Quick Brown Fox"
}

PUT my_index/_doc/2
{
   "title":"A Quick Brown Fox"
}

GET my_index/_search
{
   "query":{
      "query_string":{
         "query":"\"the quick brown fox\"" 
      }
   }
}

	`my_analyzer`: an analyzer that tokenizes all terms, including stop words.
	`my_stop_analyzer`: an analyzer that removes stop words.
	`analyzer`: points to `my_analyzer`, which is used at index time.
	`search_analyzer`: points to `my_stop_analyzer`, which removes stop words from non-phrase queries.
	`search_quote_analyzer`: points to `my_analyzer`, which ensures that stop words are not removed from phrase queries.
	Because the query is wrapped in quotes, it is detected as a phrase query, so the `search_quote_analyzer` applies and the stop words are kept. The `my_analyzer` analyzer returns the tokens [`the`, `quick`, `brown`, `fox`], which match only one of the documents (document 1).

Non-phrase queries, by contrast, are analyzed with `my_stop_analyzer`, which filters out stop words. A search for either `The quick brown fox` or `A quick brown fox` therefore returns both documents, because both contain the tokens [`quick`, `brown`, `fox`]. Without the `search_quote_analyzer`, exact matching for phrase queries would not be possible: the stop words would be removed from the phrase, and both documents would match.
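
To see the difference in practice, the following sketch reuses the `my_index` index and the analyzers defined in the example. The first request asks the analyze API what `my_stop_analyzer` produces for a title; the second runs the same search without the surrounding quotes, so it is treated as a non-phrase query.

```console
GET my_index/_analyze
{
  "analyzer": "my_stop_analyzer",
  "text": "The Quick Brown Fox"
}

GET my_index/_search
{
  "query": {
    "query_string": {
      "query": "the quick brown fox"
    }
  }
}
```

The first request should list only the non-stop-word tokens, and the unquoted search should return both documents, in contrast to the quoted phrase query.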
