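Preface
In day-to-day operations, keeping an Elasticsearch cluster running stably means doing a number of things on a schedule: purging old data, merging segments, taking and restoring backups, and so on. If you can program, all of these tasks can generally be handled in any language by calling the Elasticsearch API.
The Elasticsearch ecosystem is already mature. Curator, a tool from elastic.co written in Python, covers most of these operational scenarios, and in most cases Curator alone is enough for routine needs.
The rest of this post shows how to back up Elasticsearch indices with Curator.
Preparation
This walkthrough uses NFS as the shared storage. The procedure is similar with object storage from a cloud provider (for example, AWS S3).
Create the Elasticsearch backup directory
Create the directory on every machine in the cluster: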
$ mkdir -p /data/backup/elasticsearch_backup
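Mount the shared storage
Mount the share at /data/backup on every machine in the cluster. Mounting hides anything previously created under the mount point, so make sure the elasticsearch_backup directory exists on the share itself (create it again after mounting if necessary).
NFSv4: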
$ mount -t nfs 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup
NFSv3:
$ mount -t nfs -o vers=3,nolock,proto=tcp,noresvport 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup
Update the Elasticsearch cluster configuration
Add the path.repo setting to elasticsearch.yml on every machine in the Elasticsearch cluster:
path.repo: ["/data/backup/elasticsearch_backup"]
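After changing the configuration, restart the Elasticsearch cluster one node at a time (a rolling restart).
Install elasticsearch-curator
Curator does not need to be installed on every machine. Besides pip, it can also be installed with yum or apt.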
$ pip install elasticsearch-curator
Register the backup repository
From a web console (for example, elasticsearch-head or cerebro):
PUT _snapshot/elasticsearch_backup
{
  "type": "fs",
  "settings": {
    "location": "/data/backup/elasticsearch_backup",
    "compress": true
  }
}
From a shell:
$ curl -X PUT "10.9.3.16:9200/_snapshot/elasticsearch_backup" -H 'Content-Type: application/json' -d '
{
  "type": "fs",
  "settings": {
    "location": "/data/backup/elasticsearch_backup",
    "compress": true
  }
}'
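The same registration request can also be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names build_repo_request and send are made up for illustration, and the host, repository name, and location are simply the values used in this post.

```python
import json
import urllib.request


def build_repo_request(host, repo_name, location, compress=True):
    """Build (without sending) the PUT request that registers an fs repository.

    The function name and parameters are illustrative, not part of any library.
    """
    body = {"type": "fs", "settings": {"location": location, "compress": compress}}
    return urllib.request.Request(
        url=f"http://{host}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request, timeout=60):
    """Send the request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Calling send(build_repo_request("10.9.3.16:9200", "elasticsearch_backup", "/data/backup/elasticsearch_backup")) would perform the same operation as the curl command. Building and sending are kept separate so the request can be inspected before it touches the cluster.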
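Take the snapshot
Edit curator.yml
Pay attention to the master_only: False setting. If Curator is installed on every node of the Elasticsearch cluster, change it to master_only: True so that only the instance connected to the elected master actually runs the actions. Note that Curator expects a single entry under hosts when master_only is True.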
$ vim ./curator/curator.yml
---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
client:
  hosts:
    - 10.9.3.16
    - 10.9.3.19
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth:
  timeout: 60
  master_only: False

logging:
  loglevel: DEBUG
  logfile: /data/elasticsearch/elasticsearch_plugins/curator/curator.log
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
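To make the purpose of master_only more concrete, the sketch below shows the kind of comparison involved: the ID of the node the client is connected to against the ID of the elected master. It assumes the two inputs are the parsed JSON responses of GET _nodes/_local and GET _cluster/state/master_node. The function name is invented for this illustration and is not Curator's own code.

```python
def is_elected_master(local_nodes_info, cluster_state):
    """Return True if the locally connected node is the elected master.

    local_nodes_info: parsed response of GET _nodes/_local (assumed shape:
                      {"nodes": {"<node id>": {...}}})
    cluster_state:    parsed response of GET _cluster/state/master_node
                      (assumed shape: {"master_node": "<node id>"})
    """
    # _nodes/_local describes exactly one node, keyed by its node ID.
    local_id = next(iter(local_nodes_info["nodes"]))
    return local_id == cluster_state["master_node"]
```

With Curator installed on every node and master_only: True, a check of this kind is what lets the instances on non-master nodes exit without running the snapshot a second time.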
Edit action.yml
This file is the key part.
The example below backs up indices whose names start with sdk_ or game_ and that are more than 31 days old, naming each snapshot with the pattern 'es-%Y%m%d%H%M%S'. It waits for the snapshot to complete and skips the repository filesystem access check.
---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot sdk_|game_ prefixed indices older than 31 days (based on index
      creation_date) with the snapshot name pattern of 'es-%Y%m%d%H%M%S'.
      Wait for the snapshot to complete. Skip the repository filesystem
      access check. Use the other options to create the snapshot.
    options:
      repository: elasticsearch_backup
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name: es-%Y%m%d%H%M%S
      ignore_unavailable: False
      include_global_state: True
      partial: True
      wait_for_completion: True
      skip_repo_fs_check: True
      ignore_empty_list: True
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^(sdk_|game_).*$'
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 31
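To see what the two filters and the name pattern amount to, here is a small standalone sketch in Python. It only approximates the selection logic and is not Curator's implementation. The function names, the sample index names, and the idea of passing creation dates in as a dict are all assumptions made for this illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Same regular expression as the 'pattern' filter in action.yml.
PATTERN = re.compile(r"^(sdk_|game_).*$")


def select_indices(indices, now, unit_count=31):
    """Return the names that pass both filters, sorted alphabetically.

    indices: dict mapping index name -> creation datetime (hypothetical input;
             Curator reads creation_date from the cluster itself).
    """
    cutoff = now - timedelta(days=unit_count)
    return sorted(
        name
        for name, created in indices.items()
        if PATTERN.match(name) and created < cutoff  # prefix match AND older than cutoff
    )


def snapshot_name(now, pattern="es-%Y%m%d%H%M%S"):
    """Render the snapshot name from a strftime pattern."""
    return now.strftime(pattern)


if __name__ == "__main__":
    now = datetime(2020, 2, 24, 10, 15, 30, tzinfo=timezone.utc)
    sample = {
        "sdk_login": datetime(2020, 1, 1, tzinfo=timezone.utc),
        "game_pay": datetime(2020, 2, 20, tzinfo=timezone.utc),
        "nginx_access": datetime(2019, 12, 1, tzinfo=timezone.utc),
    }
    print(select_indices(sample, now), snapshot_name(now))
```

An index must satisfy both filters to be selected: in the sample data, nginx_access is old enough but has the wrong prefix, and game_pay has the right prefix but is too recent.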
Run the backup
Dry run:
$ curator --dry-run --config ./curator/curator.yml ./curator/action.yml
Actual run:
$ curator --config ./curator/curator.yml ./curator/action.yml
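Follow-up
Once the backup finishes, query the snapshot through the API, replacing the snapshot name in the URL with the one that was actually created. If the response contains "state": "SUCCESS", the snapshot succeeded.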
$ curl -XGET 'http://10.9.3.16:9200/_snapshot/elasticsearch_backup/squirrel-es-202002241009'
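For scripted checks, the response can be inspected programmatically instead of by eye. The sketch below assumes the response body has the usual shape of the snapshot API, a top-level "snapshots" array whose entries carry "snapshot" and "state" fields. The function name is invented for this example.

```python
import json


def snapshot_succeeded(response_text, snapshot_name):
    """Return True if the named snapshot is present and reports state SUCCESS."""
    payload = json.loads(response_text)
    for snap in payload.get("snapshots", []):
        if snap.get("snapshot") == snapshot_name:
            return snap.get("state") == "SUCCESS"
    # Snapshot not found in the response (or the response is an error body).
    return False
```

The function takes the response body as text, so it can be fed the output of the curl command above or the body returned by any HTTP client.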