Backing Up Elasticsearch Indices with elasticsearch-curator

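Introduction

Keeping an Elasticsearch cluster running reliably takes a fair amount of planned, routine work: purging old data on a schedule, merging segments, taking and restoring backups, and so on. Anyone comfortable with programming can handle most of these tasks by calling the Elasticsearch API from the language of their choice.

The Elasticsearch ecosystem is mature, though. Curator, a Python tool provided by elastic.co, already covers most of these operational scenarios, and in most cases Curator alone is enough for day-to-day needs.

The rest of this post shows how to back up Elasticsearch indices with Curator.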

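Preparation

This walkthrough uses NFS as the example. The process is similar with a cloud provider's object storage (for example, AWS S3).

Add a backup storage directory for Elasticsearch

Create the directory on every machine in the cluster: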
```
 $ mkdir -p /data/backup/elasticsearch_backup
```

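Mount the shared file storage directory

Mount the share on every machine in the cluster. Mounting over /data/backup hides anything created under it beforehand, so after mounting, confirm that the elasticsearch_backup directory exists on the share and create it there if it does not.

NFSv4: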

 $ mount -t nfs 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup

NFSv3:

 $ mount -t nfs -o vers=3,nolock,proto=tcp,noresvport 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup
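Before pointing Elasticsearch at the share, it can help to confirm that /data/backup really is an NFS mount on each node. The sketch below is one way to do that by reading /proc/mounts on Linux. The function name check_nfs_mount and its optional second argument are assumptions made for illustration, not part of any tool used in this post.

```shell
# Sketch: succeed only if the given path is listed as an nfs/nfs4 mount point.
# check_nfs_mount is a hypothetical helper name. The second argument defaults
# to /proc/mounts and exists so the check can be tried against a sample file.
check_nfs_mount() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # /proc/mounts columns: device, mount point, filesystem type, options, ...
    awk -v mp="$mount_point" '
        $2 == mp && ($3 == "nfs" || $3 == "nfs4") { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$mounts_file"
}

# Usage on a cluster node:
#   check_nfs_mount /data/backup && echo "NFS share is mounted"
```

Checking the filesystem type as well as the path catches the case where the mount failed and /data/backup is just a local directory.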

Modify the Elasticsearch cluster configuration

Add the path.repo setting to elasticsearch.yml on every machine in the cluster:
```
 path.repo: ["/data/backup/elasticsearch_backup"]
```
After changing the configuration, restart the Elasticsearch cluster, one node at a time.

Install elasticsearch-curator

Curator does not need to be installed on every machine. pip is not the only option; it can also be installed with yum or apt.
```
 $ pip install elasticsearch-curator
```

Create the backup repository

From a web console (for example, elasticsearch-head or cerebro):

 PUT _snapshot/elasticsearch_backup
 {
     "type": "fs", 
     "settings": {
         "location": "/data/backup/elasticsearch_backup",
         "compress": true
     }
 }

From a shell:

 $ curl -X PUT "10.9.3.16:9200/_snapshot/elasticsearch_backup" -H 'Content-Type: application/json' -d'
 {
     "type": "fs",
     "settings": {
         "location": "/data/backup/elasticsearch_backup",
         "compress": true
     }
 }'
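Once the repository is registered, Elasticsearch can be asked to confirm that the nodes are able to access it, which is a quick way to catch a missing mount or a wrong path.repo entry. The following is a minimal sketch using the repository verify endpoint. The function name verify_repo is an assumption, and the host and repository name are taken from the commands above.

```shell
# Sketch: call the repository verify API for one repository.
# verify_repo is a hypothetical helper name.
verify_repo() {
    es_url=$1    # e.g. http://10.9.3.16:9200
    repo=$2      # e.g. elasticsearch_backup
    # -s hides the progress meter, -f turns HTTP errors into a non-zero exit code
    curl -sf -X POST "${es_url}/_snapshot/${repo}/_verify"
}

# Usage:
#   verify_repo http://10.9.3.16:9200 elasticsearch_backup
```

A successful call returns a JSON object listing the nodes that verified the repository; a failure usually points at a node that cannot write to the shared location.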

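Take a snapshot backup

Edit curator.yml

Pay attention to the master_only: False parameter. If Curator is installed on every node of the Elasticsearch cluster, change it to master_only: True so that only the copy talking to the elected master actually runs the actions. In that setup each node's hosts list should contain only that node, because Curator rejects master_only: True when more than one host is listed.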

  $ vim ./curator/curator.yml

  ---
  # Remember, leave a key empty if there is no value.  None will be a string,
  # not a Python "NoneType"
  client:
    hosts:
      - 10.9.3.16
      - 10.9.3.19
    port: 9200
    url_prefix:
    use_ssl: False
    certificate:
    client_cert:
    client_key:
    aws_key:
    aws_secret_key:
    aws_region:
    ssl_no_validate: False
    http_auth:
    timeout: 60
    master_only: False

  logging:
    loglevel: DEBUG
    logfile: /data/elasticsearch/elasticsearch_plugins/curator/curator.log
    logformat: default
    blacklist: ['elasticsearch', 'urllib3']
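Before running Curator with this file, it may be worth checking that every address listed under hosts answers on the configured port. The sketch below assumes plain HTTP (matching use_ssl: False) and no authentication. The function name check_es_hosts is hypothetical.

```shell
# Sketch: probe each Elasticsearch host over HTTP and report the result.
# check_es_hosts is a hypothetical helper name. It returns 1 if any host fails.
check_es_hosts() {
    port=$1
    shift
    failed=0
    for host in "$@"; do
        # --max-time bounds the wait when a node is down or unreachable
        if curl -sf --max-time 5 "http://${host}:${port}/" >/dev/null; then
            echo "${host}: ok"
        else
            echo "${host}: unreachable"
            failed=1
        fi
    done
    return "$failed"
}

# Usage with the hosts from curator.yml:
#   check_es_hosts 9200 10.9.3.16 10.9.3.19
```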

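Edit action.yml

This file is the key piece.

The example below backs up indices whose names start with sdk_ or game_ and that are more than 31 days old, using the snapshot name pattern 'es-%Y%m%d%H%M%S'. It waits for the snapshot to complete and skips the repository filesystem access check.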

  ---
  # Remember, leave a key empty if there is no value.  None will be a string,
  # not a Python "NoneType"
  #
  # Also remember that all examples have 'disable_action' set to True.  If you
  # want to use this action as a template, be sure to set this to False after
  # copying it.
  actions:
    1:
      action: snapshot
      description: >-
        Snapshot sdk_|game_ prefixed indices older than 31 days (based on index
        creation_date) with the snapshot name pattern 'es-%Y%m%d%H%M%S'.
        Wait for the snapshot to complete.  Skip the repository filesystem
        access check.  Use the other options to create the snapshot.
      options:
        repository: elasticsearch_backup
        # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
        name: es-%Y%m%d%H%M%S
        ignore_unavailable: False
        include_global_state: True
        partial: True
        wait_for_completion: True
        skip_repo_fs_check: True
        ignore_empty_list: True
        continue_if_exception: False
        disable_action: False
      filters:
      - filtertype: pattern
        kind: regex
        value: '^(sdk_|game_).*$'
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 31
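To make the selection rules concrete, the sketch below mimics in plain shell what the name pattern and the two filters express: filters are applied one after another, so an index must pass both the prefix test and the age test. Both function names are hypothetical. The code only approximates Curator's behavior for illustration, and the use of UTC for the timestamp is an assumption.

```shell
# Sketch: render the snapshot name pattern for the current time (UTC assumed).
snapshot_name() {
    date -u +es-%Y%m%d%H%M%S
}

# Sketch: return 0 when an index would pass both filters (prefix AND age).
# $1 index name, $2 creation time (epoch seconds), $3 current time (epoch seconds)
index_selected() {
    case "$1" in
        sdk_*|game_*) ;;            # same prefixes as the regex '^(sdk_|game_).*$'
        *) return 1 ;;
    esac
    cutoff=$(( $3 - 31 * 86400 ))   # unit: days, unit_count: 31
    [ "$2" -lt "$cutoff" ]          # direction: older
}

# Usage:
#   index_selected sdk_login 1577836800 "$(date +%s)" && echo "would be snapshotted"
```

An index named web_login is never selected regardless of age, and an sdk_ index created 30 days ago is skipped until it crosses the 31-day cutoff.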

Run the backup

Dry run:

  $ curator --dry-run --config ./curator/curator.yml ./curator/action.yml

Actual run:

  $ curator --config ./curator/curator.yml ./curator/action.yml
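The two commands can be chained so that the real run only starts when the dry run exits cleanly. This is a small sketch, assuming that a non-zero exit code from the dry run signals a configuration or connection problem. The wrapper name run_backup is hypothetical, and the default paths are the ones used above.

```shell
# Sketch: dry run first, real run only if the dry run succeeded.
# run_backup is a hypothetical wrapper name.
run_backup() {
    config=${1:-./curator/curator.yml}
    actions=${2:-./curator/action.yml}
    # Stop early so a broken config never reaches the real run
    curator --dry-run --config "$config" "$actions" || {
        echo "dry run failed, backup not started" >&2
        return 1
    }
    curator --config "$config" "$actions"
}

# Usage:
#   run_backup
#   run_backup /path/to/curator.yml /path/to/action.yml
```

A wrapper like this is also a convenient single entry point if the backup is later scheduled from cron.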

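Follow-up

Once the backup finishes, query the snapshot through the API, substituting the name of the snapshot that Curator created. A response containing "state": "SUCCESS" means the backup succeeded.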

$ curl -XGET 'http://10.9.3.16:9200/_snapshot/elasticsearch_backup/squirrel-es-202002241009'
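For scripted checks, the state field can be pulled out of the response instead of being read by eye. The sketch below uses sed so that no extra tools such as jq are needed. It assumes the response contains a single snapshot entry, and the function name snapshot_state is hypothetical.

```shell
# Sketch: print the "state" value (SUCCESS, IN_PROGRESS, PARTIAL, FAILED) of one snapshot.
# snapshot_state is a hypothetical helper name.
snapshot_state() {
    es_url=$1      # e.g. http://10.9.3.16:9200
    repo=$2        # e.g. elasticsearch_backup
    snapshot=$3    # the snapshot name created by Curator
    # The quoted "state" key is matched exactly, so "include_global_state" is ignored
    curl -s "${es_url}/_snapshot/${repo}/${snapshot}" |
        sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' |
        head -n 1
}

# Usage (substitute the real snapshot name):
#   [ "$(snapshot_state http://10.9.3.16:9200 elasticsearch_backup SNAPSHOT_NAME)" = "SUCCESS" ] \
#       && echo "backup ok"
```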

References

https://elasticsearch.cn/article/560

How to use Elasticsearch snapshot backups

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/examples.html

https://github.com/elastic/curator

https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html