Keep words token filter
Keeps only tokens contained in a specified word list.
This filter uses Lucene’s KeepWordFilter.
To remove a list of words from a token stream, use the
stop filter.
Example
The following analyze API request uses the keep filter to
keep only the "fox" and "dog" tokens from
"the quick fox jumps over the lazy dog".
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": [
    {
      "type": "keep",
      "keep_words": [ "dog", "elephant", "fox" ]
    }
  ],
  "text": "the quick fox jumps over the lazy dog"
}
The filter produces the following tokens:
[ fox, dog ]
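The behaviour shown above can be approximated outside Elasticsearch with a few lines of Python. This is only a minimal sketch of the filtering logic, not Lucene's KeepWordFilter: the function name `keep_words_filter` is hypothetical, and the whitespace tokenizer is stood in for by `str.split()`.

```python
def keep_words_filter(tokens, keep_words):
    """Return only the tokens that appear in keep_words, preserving order."""
    keep = set(keep_words)  # set membership gives constant-time lookups
    return [token for token in tokens if token in keep]


# Assumption: str.split() stands in for the whitespace tokenizer.
text = "the quick fox jumps over the lazy dog"
tokens = text.split()

kept = keep_words_filter(tokens, ["dog", "elephant", "fox"])
print(kept)  # ['fox', 'dog']
```

Note that "elephant" is in the keep list but never appears in the text, so it has no effect on the output; the list only restricts which existing tokens survive.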
Configurable parameters
keep_words
    (Required*, array of strings) List of words to keep. Only tokens that match words in this list are included in the output.
    Either this parameter or keep_words_path must be specified.
keep_words_path
    (Required*, string) Path to a file that contains a list of words to keep. Only tokens that match words in this list are included in the output.
    This path must be absolute or relative to the config location, and the file must be UTF-8 encoded. Each word in the file must be separated by a line break.
    Either this parameter or keep_words must be specified.
keep_words_case
    (Optional, boolean) If true, lowercase all keep words. Defaults to false.
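To make the keep_words_path file format and the keep_words_case option more concrete, the sketch below reads a word list the way the parameters describe it: UTF-8 encoded, one word per line, optionally lowercased. It is an illustration only, not how Elasticsearch loads the file. The helper `load_keep_words` and the file contents are hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions of this sketch.

```python
import os
import tempfile


def load_keep_words(path, lowercase=False):
    """Read one keep word per line from a UTF-8 file."""
    with open(path, encoding="utf-8") as handle:
        # Assumption: blank lines are skipped and whitespace is trimmed.
        words = [line.strip() for line in handle if line.strip()]
    if lowercase:
        # Mirrors the documented effect of keep_words_case: true.
        words = [word.lower() for word in words]
    return set(words)


# Write a small, made-up word list to a temporary file for demonstration.
with tempfile.TemporaryDirectory() as tmp_dir:
    word_file = os.path.join(tmp_dir, "example_word_list.txt")
    with open(word_file, "w", encoding="utf-8") as handle:
        handle.write("Fox\nDog\n\nElephant\n")

    as_written = load_keep_words(word_file)
    lowercased = load_keep_words(word_file, lowercase=True)

print(sorted(as_written))   # ['Dog', 'Elephant', 'Fox']
print(sorted(lowercased))   # ['dog', 'elephant', 'fox']
print("fox" in as_written)  # False
print("fox" in lowercased)  # True
```

In this sketch a lowercase token such as "fox" only matches the capitalized list entry once the keep words have been lowercased, which is the situation keep_words_case is meant to handle.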
Customize and add to an analyzer
To customize the keep filter, duplicate it to create the basis for a new
custom token filter. You can modify the filter using its configurable
parameters.
For example, the following create index API request
uses custom keep filters to configure two new
custom analyzers:
- standard_keep_word_array, which uses a custom keep filter with an inline array of keep words
- standard_keep_word_file, which uses a custom keep filter with a keep words file
PUT keep_words_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_keep_word_array": {
          "tokenizer": "standard",
          "filter": [ "keep_word_array" ]
        },
        "standard_keep_word_file": {
          "tokenizer": "standard",
          "filter": [ "keep_word_file" ]
        }
      },
      "filter": {
        "keep_word_array": {
          "type": "keep",
          "keep_words": [ "one", "two", "three" ]
        },
        "keep_word_file": {
          "type": "keep",
          "keep_words_path": "analysis/example_word_list.txt"
        }
      }
    }
  }
}
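When request bodies like the one above are generated from code, it can help to validate the filter definition before sending it. The following is a minimal sketch that rebuilds the same settings as a Python dictionary. The helpers `keep_filter` and `standard_analyzer_with` are hypothetical names, and the "exactly one word source" check is this sketch's reading of the rule that either keep_words or keep_words_path must be specified; it assumes that supplying both is also a mistake.

```python
import json


def keep_filter(keep_words=None, keep_words_path=None):
    """Build a keep filter definition from exactly one word source."""
    # Assumption: providing neither source, or both, is treated as an error.
    if (keep_words is None) == (keep_words_path is None):
        raise ValueError("specify exactly one of keep_words or keep_words_path")
    if keep_words is not None:
        return {"type": "keep", "keep_words": list(keep_words)}
    return {"type": "keep", "keep_words_path": keep_words_path}


def standard_analyzer_with(filter_name):
    """Build a custom analyzer: standard tokenizer plus one named filter."""
    return {"tokenizer": "standard", "filter": [filter_name]}


body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "standard_keep_word_array": standard_analyzer_with("keep_word_array"),
                "standard_keep_word_file": standard_analyzer_with("keep_word_file"),
            },
            "filter": {
                "keep_word_array": keep_filter(keep_words=["one", "two", "three"]),
                "keep_word_file": keep_filter(
                    keep_words_path="analysis/example_word_list.txt"
                ),
            },
        }
    }
}

# Serialize to JSON, ready to use as the body of the create index request.
print(json.dumps(body, indent=2))
```

The sketch only builds and prints the body; sending it to a cluster is left to whichever HTTP client or Elasticsearch client is already in use.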