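Cross-cluster search lets you run a single search request against one or more remote clusters. For example, you can use a cross-cluster search to filter and analyze log data stored on clusters in different data centers.
Cross-cluster search requires at least one configured remote cluster.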
Supported APIs
The following APIs support cross-cluster search:
Cross-cluster search examples
Remote cluster setup
To perform a cross-cluster search, you must configure at least one remote cluster.
The following cluster update settings API request adds three remote clusters: cluster_one, cluster_two, and cluster_three.
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [ "127.0.0.1:9300" ]
        },
        "cluster_two": {
          "seeds": [ "127.0.0.1:9301" ]
        },
        "cluster_three": {
          "seeds": [ "127.0.0.1:9302" ]
        }
      }
    }
  }
}
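When many remote clusters have to be registered, the settings body can be generated rather than written by hand. The following is a minimal Python sketch that only builds the JSON body shown in the request; the helper name `remote_cluster_settings` is hypothetical, and sending the body to the cluster is left out.

```python
import json


def remote_cluster_settings(seeds_by_alias):
    """Build a body for PUT _cluster/settings that registers remote clusters.

    seeds_by_alias maps a cluster alias to a list of seed addresses.
    (Hypothetical helper, not part of any Elasticsearch client.)
    """
    remote = {
        alias: {"seeds": list(seeds)}
        for alias, seeds in seeds_by_alias.items()
    }
    return {"persistent": {"cluster": {"remote": remote}}}


body = remote_cluster_settings({
    "cluster_one": ["127.0.0.1:9300"],
    "cluster_two": ["127.0.0.1:9301"],
    "cluster_three": ["127.0.0.1:9302"],
})
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the request body in the example request.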
Search a single remote cluster
The following search API request searches the twitter index on the remote cluster cluster_one.
GET /cluster_one:twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
The API returns the following response:
{
  "took": 150,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 1,
    "successful": 1,
    "skipped": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
Search multiple remote clusters
The following search API request searches the twitter index on three clusters:
- Two remote clusters: cluster_one and cluster_two
- The local cluster
GET /twitter,cluster_one:twitter,cluster_two:twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
The API returns the following response:
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 4,
  "_shards": {
    "total": 3,
    "successful": 3,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 3,
    "successful": 3,
    "skipped": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 2,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "cluster_two:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
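The request path and the `_index` values in the response both follow the `cluster_alias:index` naming pattern. As an illustration, the following Python sketch builds the path for the request and splits a hit's `_index` back into its cluster alias and index name. The function names are hypothetical, and the sketch assumes that a value without a colon refers to the local cluster, as in the response above.

```python
def ccs_search_path(index, remote_aliases, include_local=True):
    """Build a search path that targets one index name on several clusters."""
    targets = [index] if include_local else []
    targets += [f"{alias}:{index}" for alias in remote_aliases]
    return "/" + ",".join(targets) + "/_search"


def split_hit_index(hit_index):
    """Split a hit's _index value into (cluster_alias, index_name).

    Returns None as the alias when there is no "alias:" prefix, which this
    sketch treats as a hit from the local cluster.
    """
    alias, sep, name = hit_index.partition(":")
    return (alias, name) if sep else (None, hit_index)


path = ccs_search_path("twitter", ["cluster_one", "cluster_two"])
print(path)  # /twitter,cluster_one:twitter,cluster_two:twitter/_search

print(split_hit_index("cluster_one:twitter"))  # ('cluster_one', 'twitter')
print(split_hit_index("twitter"))              # (None, 'twitter')
```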
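Skip unavailable clusters
By default, a cross-cluster search returns an error if any cluster in the request is unavailable.
To skip an unavailable cluster during a cross-cluster search, set the skip_unavailable cluster setting to true.
The following cluster update settings API request changes the skip_unavailable setting of cluster_two to true.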
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_two.skip_unavailable": true
  }
}
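If cluster_two is disconnected or unavailable during a cross-cluster search, Elasticsearch does not include matching documents from that cluster in the final results.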
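Because a skipped cluster no longer causes an error, a client may want to check whether the results are complete. The following Python sketch reads the `_clusters` section that appears in the example responses. The `sample` fragment is hypothetical, and the sketch assumes that a skipped cluster is reported in the `skipped` counter.

```python
def summarize_clusters(response):
    """Summarize the _clusters section of a cross-cluster search response."""
    clusters = response.get("_clusters", {})
    total = clusters.get("total", 0)
    successful = clusters.get("successful", 0)
    skipped = clusters.get("skipped", 0)
    return {
        "total": total,
        "successful": successful,
        "skipped": skipped,
        # Complete only if no cluster was skipped and all of them succeeded.
        "complete": skipped == 0 and successful == total,
    }


# Hypothetical response fragment: one of three clusters was skipped.
sample = {"_clusters": {"total": 3, "successful": 2, "skipped": 1}}
summary = summarize_clusters(sample)
if not summary["complete"]:
    print(f"{summary['skipped']} of {summary['total']} clusters were skipped")
```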
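Selecting gateway and seed nodes in sniff mode
For remote clusters that use the sniff connection mode, the gateway and seed nodes must be reachable from the local cluster over your network.
By default, any node that is not master-eligible can act as a gateway node. If needed, you can define a cluster's gateway nodes by setting cluster.remote.node.attr.gateway to true.
For cross-cluster search, we recommend using gateway nodes that can serve as coordinating nodes for search requests. If needed, a cluster's seed nodes can be a subset of these gateway nodes.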
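Cross-cluster search in proxy mode
Remote cluster connections in proxy mode support cross-cluster search.
All remote connections go to the configured proxy_address.
Any desired routing of connections to gateway or coordinating nodes must be implemented by the intermediate proxy at this configured address.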
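As a rough illustration, the following Python sketch builds a cluster settings body for a remote cluster that connects through a proxy. Only `proxy_address` is named on this page; the `mode` key with the value `proxy` is taken from the remote cluster settings and should be checked against the documentation for your version. The helper name and the proxy address are hypothetical.

```python
import json


def proxy_mode_settings(alias, proxy_address):
    """Build a PUT _cluster/settings body for a proxy-mode remote cluster.

    Hypothetical helper. The "mode" key is an assumption based on the
    remote cluster settings and is not described on this page.
    """
    prefix = f"cluster.remote.{alias}"
    return {
        "persistent": {
            f"{prefix}.mode": "proxy",
            f"{prefix}.proxy_address": proxy_address,
        }
    }


# The address below is a placeholder, not a real endpoint.
body = proxy_mode_settings("cluster_one", "proxy.example.com:9400")
print(json.dumps(body, indent=2))
```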
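How cross-cluster search handles network delays
Because a cross-cluster search sends requests to remote clusters, any network delay can affect search speed. To avoid slow searches, cross-cluster search offers two options for handling network delays: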
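- Minimize network roundtrips
  By default, Elasticsearch reduces the number of network roundtrips between remote clusters. This lessens the effect of network delays on search speed. However, Elasticsearch cannot reduce network roundtrips for large search requests, such as those that include a scroll or inner hits.
  See "Minimize network roundtrips" to learn how this option works.
- Don't minimize network roundtrips
  For search requests that include a scroll or inner hits, Elasticsearch sends multiple outgoing and incoming requests to each remote cluster. You can also choose this option by setting the ccs_minimize_roundtrips parameter to false. Although this approach is usually slower, it may work well on low-latency networks. See "Don't minimize network roundtrips" to learn how this option works.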
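To make the second option concrete, the following Python sketch builds a search URL that sets `ccs_minimize_roundtrips` as a query string parameter. The base URL and the function name are assumptions for illustration; the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode


def ccs_search_url(base_url, index_expression, minimize_roundtrips=True):
    """Build a search URL with an explicit ccs_minimize_roundtrips value."""
    query = urlencode(
        {"ccs_minimize_roundtrips": str(minimize_roundtrips).lower()}
    )
    return f"{base_url.rstrip('/')}/{index_expression}/_search?{query}"


# http://localhost:9200 is a placeholder for the local cluster's address.
url = ccs_search_url(
    "http://localhost:9200",
    "twitter,cluster_one:twitter",
    minimize_roundtrips=False,
)
print(url)
```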
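Minimize network roundtrips
Here is how cross-cluster search works when network roundtrips are minimized.
- You send a cross-cluster search request to your local cluster. A coordinating node in that cluster receives and parses the request.
- The coordinating node sends a single search request to each cluster, including the local cluster. Each cluster performs the search request independently, applying its own cluster-level settings to the request.
- Each remote cluster sends its search results back to the coordinating node.
- After collecting results from each cluster, the coordinating node returns the final results in the cross-cluster search response.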
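Don't minimize network roundtrips
Here is how cross-cluster search works when network roundtrips are not minimized.
- You send a cross-cluster search request to your local cluster. A coordinating node in that cluster receives and parses the request.
- The coordinating node sends a search shards API request to each remote cluster.
- Each remote cluster sends its response back to the coordinating node. This response contains information about the indices and shards on which the cross-cluster search request will run.
- The coordinating node sends a search request to each shard, including those in its own cluster. Each shard performs the search request independently.
  When network roundtrips are not minimized, the search runs as if all data were in the coordinating node's cluster. We recommend updating cluster-level settings that limit searches, such as action.search.shard_count.limit, pre_filter_shard_size, and max_concurrent_shard_requests, to account for this. If these limits are too low, the search may be rejected.
- Each shard sends its search results back to the coordinating node.
- After collecting results from each cluster, the coordinating node returns the final results in the cross-cluster search response.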
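The difference between the two modes can be illustrated with a toy count of the requests the coordinating node sends to remote clusters. This Python sketch counts only the requests named in the steps above (one search request per cluster, or one search shards request per cluster plus one search request per shard). It is a simplified model, not a description of Elasticsearch internals, and the shard counts are hypothetical.

```python
def requests_to_remote_clusters(shards_per_remote, minimize_roundtrips):
    """Count requests sent from the coordinating node to remote clusters.

    shards_per_remote maps a remote cluster alias to its number of
    searched shards. Toy model based only on the steps listed above.
    """
    if minimize_roundtrips:
        # One search request per remote cluster.
        return len(shards_per_remote)
    # One search shards request per cluster, then one request per shard.
    return sum(1 + shards for shards in shards_per_remote.values())


# Hypothetical shard counts for two remote clusters.
shards = {"cluster_one": 5, "cluster_two": 3}
print(requests_to_remote_clusters(shards, minimize_roundtrips=True))   # 2
print(requests_to_remote_clusters(shards, minimize_roundtrips=False))  # 10
```

In this model the request count grows with the number of remote shards when roundtrips are not minimized, which is why network latency matters more in that mode.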