Uppercase token filter | ElasticSearch 7.7 Definitive Guide (Chinese Edition)

Original English version: https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analysis-uppercase-tokenfilter.html. Copyright of the original documentation belongs to www.elastic.co.

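Important: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the documentation for the current release.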



Uppercase token filter

Changes token text to uppercase. For example, you can use the uppercase filter to change "the Lazy DoG" to "THE LAZY DOG".

This filter uses Lucene’s UpperCaseFilter.

Depending on the language, an uppercase character can map to multiple lowercase characters. Using the uppercase filter could result in the loss of lowercase character information.

To avoid this loss but still have a consistent letter case, use the lowercase filter instead.
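
The information loss described above can be illustrated with a small sketch. It uses Python's built-in str.upper() and str.lower() as a stand-in for the filter, which is an assumption: Lucene's UpperCaseFilter is a Java component and its mapping may differ for some characters. The sample characters (dotless i and long s) are chosen for illustration only.

```python
# Illustration only: Python's Unicode case mapping stands in for the filter.
# Two distinct lowercase letters can share one uppercase form, so the
# original letter cannot be recovered after uppercasing.
samples = ["i", "\u0131", "s", "\u017f"]  # i, dotless i, s, long s


def roundtrip(ch: str) -> str:
    """Uppercase a character, then lowercase the result."""
    return ch.upper().lower()


for ch in samples:
    print(f"U+{ord(ch):04X} -> {ch.upper()!r} -> {roundtrip(ch)!r}")
```

Both "i" and the dotless "ı" (U+0131) uppercase to "I", and lowercasing "I" always yields "i", so the distinction between the two is gone after the round trip.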

Example

The following analyze API request uses the default uppercase filter to change "the Quick FoX JUMPs" to uppercase:

GET _analyze
{
  "tokenizer" : "standard",
  "filter" : ["uppercase"],
  "text" : "the Quick FoX JUMPs"
}

The filter produces the following tokens:

[ THE, QUICK, FOX, JUMPS ]
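
To make the token stream more concrete, here is a rough local approximation in Python. It is a sketch under stated assumptions, not how Elasticsearch implements the request: the regular expression \w+ only loosely imitates the standard tokenizer, analyze_uppercase is a hypothetical helper name, and the dictionaries carry only a subset of the fields a real analyze API response contains.

```python
import re


def analyze_uppercase(text: str) -> list:
    """Approximate a 'standard' tokenizer plus 'uppercase' filter pass.

    Assumption: r"\\w+" is a simplification of the standard tokenizer.
    """
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        tokens.append({
            "token": match.group().upper(),  # the uppercase step
            "start_offset": match.start(),   # offsets refer to the input text
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


for tok in analyze_uppercase("the Quick FoX JUMPs"):
    print(tok)
```

Only the token text changes. The offsets and positions still point at the original, mixed-case input.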

Add to an analyzer

The following create index API request uses the uppercase filter to configure a new custom analyzer.

PUT uppercase_example
{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "whitespace_uppercase" : {
                    "tokenizer" : "whitespace",
                    "filter" : ["uppercase"]
                }
            }
        }
    }
}
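
As a rough idea of what the whitespace_uppercase analyzer would emit, the following Python sketch splits on whitespace only and then uppercases each token. It is an approximation for illustration: str.split() stands in for the whitespace tokenizer, the function name simply mirrors the analyzer name, and the sample text is an assumption rather than part of the original example.

```python
def whitespace_uppercase(text: str) -> list:
    """Split on whitespace only, then uppercase each token.

    Assumption: str.split() approximates the whitespace tokenizer.
    """
    return [token.upper() for token in text.split()]


# Punctuation and hyphens stay attached to their tokens, because the
# splitting step looks at whitespace and nothing else.
print(whitespace_uppercase("the Quick-FoX JUMPs!"))
# -> ['THE', 'QUICK-FOX', 'JUMPS!']
```

Unlike the standard tokenizer used in the earlier example, a whitespace-based split keeps "Quick-FoX" as one token, so the choice of tokenizer affects the output even though the uppercase step is the same.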
