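Dynamic templates | Elasticsearch 7.7: The Definitive Guide, Chinese Edition

Original English version: https://www.elastic.co/guide/en/elasticsearch/reference/7.7/dynamic-templates.html. Copyright of the original document belongs to www.elastic.co.
Local English version: ../en/dynamic-templates.html

Important: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.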


Dynamic templates

Dynamic templates allow you to define custom mappings that can be applied to dynamically added fields based on:

the datatype detected by Elasticsearch, with match_mapping_type.
the name of the field, with match and unmatch or match_pattern.
the full dotted path to the field, with path_match and path_unmatch.

The original field name {name} and the detected datatype {dynamic_type} template variables can be used in the mapping specification as placeholders.

Dynamic field mappings are only added when a field contains a concrete value — not null or an empty array. This means that if the null_value option is used in a dynamic_template, it will only be applied after the first document with a concrete value for the field has been indexed.

Dynamic templates are specified as an array of named objects:

  "dynamic_templates": [
    {
      "my_template_name": { 
        ...  match conditions ... 
        "mapping": { ... } 
      }
    },
    ...
  ]

	The template name can be any string value.
	The match conditions can include any of: `match_mapping_type`, `match`, `match_pattern`, `unmatch`, `path_match`, `path_unmatch`.
	The mapping that the matched field should use.

If a provided mapping contains an invalid mapping snippet, a validation error is returned. Validation occurs when applying the dynamic template at index time, and, in most cases, when the dynamic template is updated. Providing an invalid mapping snippet may cause the update or validation of a dynamic template to fail under certain conditions:

If no match_mapping_type has been specified but the template is valid for at least one predefined mapping type, the mapping snippet is considered valid. However, a validation error is returned at index time if a field matching the template is indexed as a different type. For example, a dynamic template with no match_mapping_type whose mapping is valid for the string type is accepted, but if a field matching the template is indexed as a long, a validation error is returned at index time.
If the {name} placeholder is used in the mapping snippet, validation is skipped when updating the dynamic template, because the field name is unknown at that time. Instead, validation occurs when the template is applied at index time.

Templates are processed in order — the first matching template wins. When putting new dynamic templates through the put mapping API, all existing templates are overwritten. This allows for dynamic templates to be reordered or deleted after they were initially added.
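The ordering rule can be sketched in a few lines of Python. This is only an illustration of "first matching template wins", not Elasticsearch's implementation: the template names, the `TEMPLATES` list, and the `first_matching_template` helper are hypothetical, and the sketch checks nothing but `match_mapping_type`.

```python
from typing import Optional

# Hypothetical templates, reduced to the one condition this sketch evaluates.
TEMPLATES = [
    {"strings_first": {"match_mapping_type": "string"}},
    {"everything_else": {"match_mapping_type": "*"}},
]


def first_matching_template(templates: list, detected_type: str) -> Optional[str]:
    """Return the name of the first template whose type condition matches."""
    for entry in templates:
        # Each array element is an object with a single key: the template name.
        for name, template in entry.items():
            wanted = template["match_mapping_type"]
            if wanted == "*" or wanted == detected_type:
                return name
    return None  # no template applies


print(first_matching_template(TEMPLATES, "string"))
print(first_matching_template(TEMPLATES, "long"))
```

Reversing the list makes the catch-all template win for string fields as well, which is why the position of a template in the array matters.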

`match_mapping_type`

The match_mapping_type is the datatype detected by the JSON parser. Since JSON doesn’t distinguish a long from an integer or a double from a float, it will always choose the wider datatype, i.e. long for integers and double for floating-point numbers.

The following datatypes may be automatically detected:

boolean when true or false are encountered.
date when date detection is enabled and a string matching any of the configured date formats is found.
double for numbers with a decimal part.
long for numbers without a decimal part.
object for objects, also called hashes.
string for character strings.

* may also be used in order to match all datatypes.
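As a rough illustration of the list above, the following Python sketch classifies parsed JSON values into `match_mapping_type` names. It is a simplified model, not the Elasticsearch parser: the `detected_type` function is hypothetical, date detection is left out, and the handling of arrays (taking the type of the first non-null element) is an assumption of this sketch. Values without a concrete value (null, empty arrays) yield `None`, meaning no dynamic mapping would be added.

```python
import json
from typing import Any, Optional


def detected_type(value: Any) -> Optional[str]:
    """Return the match_mapping_type name for a parsed JSON value, or None."""
    if value is None:
        return None
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        return "string"  # date detection is deliberately not modelled here
    if isinstance(value, list):
        # Assumption: an array takes the type of its first non-null element.
        for item in value:
            item_type = detected_type(item)
            if item_type is not None:
                return item_type
    return None


doc = json.loads('{"my_integer": 5, "ratio": 0.5, "tags": [], "note": null}')
types = {field: detected_type(value) for field, value in doc.items()}
print(types)
```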

For example, if we wanted to map all integer fields as integer instead of long, and all string fields as both text and keyword, we could use the following template:

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "integers": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "strings": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "fields": {
              "raw": {
                "type":  "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    ]
  }
}

PUT my_index/_doc/1
{
  "my_integer": 5, 
  "my_string": "Some string" 
}

	The `my_integer` field is mapped as an `integer`.
	The `my_string` field is mapped as a `text`, with a `keyword` multi field.

`match` and `unmatch`

The match parameter uses a pattern to match on the field name, while unmatch uses a pattern to exclude fields matched by match.

The following example matches all string fields whose name starts with long_ (except for those which end with _text) and maps them as long fields:

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "longs_as_strings": {
          "match_mapping_type": "string",
          "match":   "long_*",
          "unmatch": "*_text",
          "mapping": {
            "type": "long"
          }
        }
      }
    ]
  }
}

PUT my_index/_doc/1
{
  "long_num": "5", 
  "long_text": "foo" 
}

	The `long_num` field is mapped as a `long`.
	The `long_text` field uses the default mapping for strings.
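The interplay of `match` and `unmatch` in this example can be modelled with a small Python sketch. The helpers `simple_match` and `template_applies` are hypothetical and only approximate the behaviour described above, treating `*` as "any run of characters"; they are not the functions Elasticsearch uses internally.

```python
import re


def simple_match(pattern: str, name: str) -> bool:
    """Match name against a pattern in which * stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


def template_applies(name: str, match: str, unmatch: str) -> bool:
    """A field qualifies when it matches `match` and does not match `unmatch`."""
    return simple_match(match, name) and not simple_match(unmatch, name)


for field in ("long_num", "long_text"):
    print(field, template_applies(field, match="long_*", unmatch="*_text"))
```

Here `long_num` qualifies for the template, while `long_text` is excluded by `unmatch` even though it satisfies `match`.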

`match_pattern`

The match_pattern parameter adjusts the behavior of the match parameter so that it supports full Java regular expression matching on the field name instead of simple wildcards. Because the pattern is written inside a JSON string, the backslash must be escaped. For instance:

  "match_pattern": "regex",
  "match": "^profit_\\d+$"

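The following Python sketch shows how such a pattern behaves once the JSON has been decoded. It is an approximation under stated assumptions: Python's `re` module stands in for Java regular expressions (the two agree for this simple pattern), and the field names in `names` are made up for illustration.

```python
import json
import re

# Hypothetical request fragment; the raw string keeps both backslashes intact.
fragment = r'{"match_pattern": "regex", "match": "^profit_\\d+$"}'

conditions = json.loads(fragment)          # JSON decoding turns \\d into \d
pattern = re.compile(conditions["match"])  # Python's re stands in for Java regex

names = ["profit_2019", "profit_q1", "gross_profit_7"]
matching = [name for name in names if pattern.fullmatch(name)]
print(matching)
```

Only `profit_2019` matches: `profit_q1` has a non-digit suffix, and `gross_profit_7` does not start with `profit_`.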
`path_match` and `path_unmatch`

The path_match and path_unmatch parameters work in the same way as match and unmatch, but operate on the full dotted path to the field, not just the final name, e.g. some_object.*.some_field.

This example copies the values of any fields in the name object to the top-level full_name field, except for the middle field:

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match":   "name.*",
          "path_unmatch": "*.middle",
          "mapping": {
            "type":       "text",
            "copy_to":    "full_name"
          }
        }
      }
    ]
  }
}

PUT my_index/_doc/1
{
  "name": {
    "first":  "John",
    "middle": "Winston",
    "last":   "Lennon"
  }
}

Note that the path_match and path_unmatch parameters match on object paths in addition to leaf fields. As an example, indexing the following document will result in an error because the path_match setting also matches the object field name.title, which can’t be mapped as text:

PUT my_index/_doc/2
{
  "name": {
    "first":  "Paul",
    "last":   "McCartney",
    "title": {
      "value": "Sir",
      "category": "order of chivalry"
    }
  }
}
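To see why the object path is caught, the sketch below enumerates the dotted paths of this document and applies the two patterns from the template. The helpers `simple_match` and `dotted_paths` are hypothetical and only model the matching rule; in Elasticsearch itself, indexing would already fail when `name.title` is reached.

```python
import re


def simple_match(pattern: str, path: str) -> bool:
    """Match path against a pattern in which * stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path, flags=re.DOTALL) is not None


def dotted_paths(obj: dict, prefix: str = ""):
    """Yield the dotted path of every field, including intermediate objects."""
    for key, value in obj.items():
        path = prefix + key
        yield path
        if isinstance(value, dict):
            yield from dotted_paths(value, path + ".")


doc = {
    "name": {
        "first": "Paul",
        "last": "McCartney",
        "title": {"value": "Sir", "category": "order of chivalry"},
    }
}

matched = [
    path
    for path in dotted_paths(doc)
    if simple_match("name.*", path) and not simple_match("*.middle", path)
]
print(matched)
```

The result contains `name.title`, which is an object rather than a leaf field, so a `text` mapping cannot be applied to it.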

`{name}` and `{dynamic_type}`

The {name} and {dynamic_type} placeholders are replaced in the mapping with the field name and detected dynamic type. The following example sets all string fields to use an analyzer with the same name as the field, and disables doc_values for all non-string fields:

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "named_analyzers": {
          "match_mapping_type": "string",
          "match": "*",
          "mapping": {
            "type": "text",
            "analyzer": "{name}"
          }
        }
      },
      {
        "no_doc_values": {
          "match_mapping_type":"*",
          "mapping": {
            "type": "{dynamic_type}",
            "doc_values": false
          }
        }
      }
    ]
  }
}

PUT my_index/_doc/1
{
  "english": "Some English text", 
  "count":   5 
}

	The `english` field is mapped as a `text` field with the `english` analyzer.
	The `count` field is mapped as a `long` field with `doc_values` disabled.
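The substitution itself can be pictured as a recursive string replacement over the mapping snippet. The Python sketch below is a simplified model with a hypothetical `render` function; it replaces placeholders in string values only and makes no claim about how Elasticsearch implements the step.

```python
from typing import Any


def render(mapping: Any, name: str, dynamic_type: str) -> Any:
    """Return a copy of mapping with {name} and {dynamic_type} filled in."""
    if isinstance(mapping, dict):
        return {key: render(value, name, dynamic_type) for key, value in mapping.items()}
    if isinstance(mapping, list):
        return [render(value, name, dynamic_type) for value in mapping]
    if isinstance(mapping, str):
        return mapping.replace("{name}", name).replace("{dynamic_type}", dynamic_type)
    return mapping  # booleans, numbers and None are left untouched


english = render({"type": "text", "analyzer": "{name}"}, "english", "string")
count = render({"type": "{dynamic_type}", "doc_values": False}, "count", "long")
print(english)
print(count)
```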

Template examples

Here are some examples of potentially useful dynamic templates:

Structured search

By default, Elasticsearch maps string fields as a text field with a keyword sub-field. However, if you are indexing only structured content and are not interested in full-text search, you can make Elasticsearch map your fields only as `keyword`s. Note that this means that to search those fields, you must search on the exact same value that was indexed.

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

`text`-only mappings for strings

In contrast to the previous example, if the only thing you care about on your string fields is full-text search, and you don’t plan on running aggregations, sorting, or exact searches on them, you can tell Elasticsearch to map them only as text fields (which was the default behaviour before 5.0):

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_text": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}

Disabled norms

Norms are index-time scoring factors. If you do not care about scoring, which would be the case for instance if you never sort documents by score, you could disable the storage of these scoring factors in the index and save some space.

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    ]
  }
}

The keyword sub-field appears in this template to be consistent with the default rules of dynamic mappings. If you do not need it, because you don’t need to perform exact searches or aggregations on this field, you can remove it as described in the previous section.

Time-series

When doing time series analysis with Elasticsearch, it is common to have many numeric fields that you will often aggregate on but never filter on. In such a case, you could disable indexing on those fields to save disk space and possibly gain some indexing speed:

PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "unindexed_longs": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "long",
            "index": false
          }
        }
      },
      {
        "unindexed_doubles": {
          "match_mapping_type": "double",
          "mapping": {
            "type": "float", 
            "index": false
          }
        }
      }
    ]
  }
}

Like the default dynamic mapping rules, doubles are mapped as floats, which are usually accurate enough, yet require half the disk space.
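The trade-off between the two types can be illustrated with Python's `struct` module, which packs a number as a 4-byte single-precision or an 8-byte double-precision value. This only compares the raw value widths and the rounding error for one sample number; the actual on-disk footprint in an index also depends on encoding and compression, which this sketch does not model.

```python
import struct

value = 0.1  # arbitrary sample value

as_float = struct.pack("<f", value)   # single precision, like the float type
as_double = struct.pack("<d", value)  # double precision, like the double type

float_size, double_size = len(as_float), len(as_double)
round_tripped = struct.unpack("<f", as_float)[0]
error = abs(round_tripped - value)

print(float_size, double_size)
print(error)
```

The single-precision value needs half the bytes, at the cost of a small rounding error when it is read back.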
