Aldebaran


Backing Up Elasticsearch Indices with elasticsearch-curator


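Preface

Keeping an Elasticsearch cluster running stably in day-to-day operations usually means doing a number of things on a schedule: purging old data, merging segments, taking backups and restoring from them, and so on. With some programming ability, most of this work can be done in any language by calling the Elasticsearch API as needed.

The Elasticsearch ecosystem is already mature. Curator, a tool from elastic.co written in Python, provides solid solutions for many operational scenarios; in most cases curator alone covers these routine needs.

The rest of this post shows how to back up Elasticsearch indices with curator.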

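Preparation

This walkthrough uses NFS as the example. The overall procedure is similar when using object storage from a cloud provider (for example AWS S3).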

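  1. Create the Elasticsearch backup directory

    Create the directory on every machine in the cluster. If the NFS share is later mounted on /data/backup (step 2), the mount hides anything created there beforehand, so make sure elasticsearch_backup also exists on the share itself and is writable by the user that runs Elasticsearch.

    $ mkdir -p /data/backup/elasticsearch_backup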
  2. Mount the shared storage directory

    Mount the share on every machine in the cluster.

    NFSv4

    $ mount -t nfs 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup

    NFSv3

    $ mount -t nfs -o vers=3,nolock,proto=tcp,noresvport 10.9.0.6:/share-7f4ef504-3ddb-40e9-853b-15d495cc9fb1 /data/backup
  3. Update the Elasticsearch cluster configuration

    Add the path.repo setting to elasticsearch.yml on every machine in the Elasticsearch cluster.

    path.repo: ["/data/backup/elasticsearch_backup"]

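    After changing the configuration, restart the Elasticsearch cluster, one node at a time.

  4. Install elasticsearch-curator

    There is no need to install it on every machine; one host that can reach the cluster is enough. pip is not the only option: it can also be installed with yum or apt.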

    $ pip install elasticsearch-curator
  5. Register the backup repository

    From a web console (for example elasticsearch-head or cerebro)

    PUT _snapshot/elasticsearch_backup
    {
        "type": "fs", 
        "settings": {
            "location": "/data/backup/elasticsearch_backup",
            "compress": true
        }
    }

    From a shell

    $ curl -X PUT "10.9.3.16:9200/_snapshot/elasticsearch_backup" -H 'Content-Type: application/json' -d'
    {
        "type": "fs",
        "settings": {
            "location": "/data/backup/elasticsearch_backup",
            "compress": true
        }
    }'
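The same registration request can also be built from a script. Below is a minimal Python sketch using only the standard library. The helper names `location_allowed` and `build_repo_request` are made up for this illustration and are not part of any Elasticsearch client; the host, repository name, and paths are the ones used in the steps above. Because Elasticsearch rejects an fs repository whose location lies outside path.repo, the sketch checks that first.

```python
import json
import urllib.request
from pathlib import PurePosixPath

# Assumption: mirrors the path.repo value set in elasticsearch.yml (step 3).
PATH_REPO = ["/data/backup/elasticsearch_backup"]


def location_allowed(location, path_repo=PATH_REPO):
    """Return True if location equals or lies under one of the path.repo entries."""
    loc = PurePosixPath(location)
    return any(
        loc == PurePosixPath(p) or PurePosixPath(p) in loc.parents
        for p in path_repo
    )


def build_repo_request(host, repo, location, compress=True):
    """Build (but do not send) the PUT _snapshot/<repo> request from step 5."""
    if not location_allowed(location):
        raise ValueError(f"{location} is not under any path.repo entry")
    body = {"type": "fs", "settings": {"location": location, "compress": compress}}
    return urllib.request.Request(
        url=f"http://{host}/_snapshot/{repo}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_repo_request(
        "10.9.3.16:9200", "elasticsearch_backup", "/data/backup/elasticsearch_backup"
    )
    print(req.get_method(), req.full_url)
    # To send it against a reachable cluster:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a live cluster.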

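Taking a Snapshot

  • Edit curator.yml

    Pay attention to the master_only parameter, which is False here. If curator is installed on every node of the Elasticsearch cluster, set master_only: True so that only the instance connected to the elected master runs the actions; in that setup each node's hosts list should contain a single host.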

    $ vim ./curator/curator.yml
    
    ---
    # Remember, leave a key empty if there is no value.  None will be a string,
    # not a Python "NoneType"
    client:
      hosts:
        - 10.9.3.16
        - 10.9.3.19
      port: 9200
      url_prefix:
      use_ssl: False
      certificate:
      client_cert:
      client_key:
      aws_key:
      aws_secret_key:
      aws_region:
      ssl_no_validate: False
      http_auth:
      timeout: 60
      master_only: False

    logging:
      loglevel: DEBUG
      logfile: /data/elasticsearch/elasticsearch_plugins/curator/curator.log
      logformat: default
      blacklist: ['elasticsearch', 'urllib3']
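  • Edit action.yml

    This file is the key piece.

    The example below does the following: it snapshots indices whose names start with sdk_ or game_ and that are more than 31 days old, names the snapshot with the pattern 'es-%Y%m%d%H%M%S' (curator's default would be 'curator-%Y%m%d%H%M%S'), waits for the snapshot to complete, and skips the repository filesystem access check.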

    ---
    # Remember, leave a key empty if there is no value.  None will be a string,
    # not a Python "NoneType"
    #
    # Also remember that all examples have 'disable_action' set to True.  If you
    # want to use this action as a template, be sure to set this to False after
    # copying it.
    actions:
      1:
        action: snapshot
        description: >-
          Snapshot sdk_|game_ prefixed indices older than 31 days (based on index
          creation_date) with the snapshot name pattern 'es-%Y%m%d%H%M%S'.
          Wait for the snapshot to complete.  Skip the repository filesystem
          access check.  Use the other options to create the snapshot.
        options:
          repository: elasticsearch_backup
          # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
          name: es-%Y%m%d%H%M%S
          ignore_unavailable: False
          include_global_state: True
          partial: True
          wait_for_completion: True
          skip_repo_fs_check: True
          ignore_empty_list: True
          continue_if_exception: False
          disable_action: False
        filters:
        - filtertype: pattern
          kind: regex
          value: '^(sdk_|game_).*$'
        - filtertype: age
          source: creation_date
          direction: older
          unit: days
          unit_count: 31
  • Run the backup

    Dry run

    $ curator --dry-run --config ./curator/curator.yml ./curator/action.yml

    Actual run

    $ curator --config ./curator/curator.yml ./curator/action.yml
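To get a feel for what the two filters in action.yml select before running curator, the logic can be approximated in a few lines of Python. This is only a rough sketch, not curator's implementation: the function names `select_indices` and `snapshot_name` are invented here, the index names and dates in the sample are made up, and the age comparison is a simplified stand-in for curator's own calculation.

```python
import re
from datetime import datetime, timedelta, timezone

PATTERN = re.compile(r"^(sdk_|game_).*$")  # same regex as the pattern filter
MAX_AGE = timedelta(days=31)               # unit: days, unit_count: 31


def select_indices(creation_dates, now):
    """Return sorted index names that match the prefix regex and are older than MAX_AGE.

    creation_dates maps index name -> timezone-aware creation datetime.
    """
    cutoff = now - MAX_AGE
    return sorted(
        name
        for name, created in creation_dates.items()
        if PATTERN.match(name) and created < cutoff
    )


def snapshot_name(now):
    """Render the name option 'es-%Y%m%d%H%M%S' for the given time."""
    return now.strftime("es-%Y%m%d%H%M%S")


if __name__ == "__main__":
    now = datetime(2020, 2, 24, 10, 9, 0, tzinfo=timezone.utc)
    sample = {  # hypothetical indices, for illustration only
        "sdk_login-2020.01.01": datetime(2020, 1, 1, tzinfo=timezone.utc),
        "game_event-2020.02.20": datetime(2020, 2, 20, tzinfo=timezone.utc),
        "nginx-2019.12.01": datetime(2019, 12, 1, tzinfo=timezone.utc),
    }
    print(select_indices(sample, now))  # ['sdk_login-2020.01.01']
    print(snapshot_name(now))           # es-20200224100900
```

In the sample, only the sdk_ index is both prefixed correctly and older than 31 days; the game_ index is too recent and the nginx index fails the pattern. The authoritative preview remains the --dry-run invocation shown above.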

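Follow-up

After the backup finishes, you can inspect the snapshot through the API (use the name of your own snapshot in the URL). A "state": "SUCCESS" in the response means the snapshot succeeded.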

$ curl -XGET 'http://10.9.3.16:9200/_snapshot/elasticsearch_backup/squirrel-es-202002241009'
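When this check runs from a script, for example after a scheduled backup, the response body can be parsed instead of read by eye. The following is a small Python sketch under the assumption that the response has the usual shape of a top-level "snapshots" list whose entries carry "snapshot" and "state" fields; the function names and the sample payload are hand-written for illustration.

```python
import json


def snapshot_states(response_text):
    """Map snapshot name -> state from a GET _snapshot/<repo>/<name> response body."""
    payload = json.loads(response_text)
    return {s["snapshot"]: s["state"] for s in payload.get("snapshots", [])}


def all_succeeded(response_text):
    """True only if at least one snapshot is listed and every one is SUCCESS."""
    states = snapshot_states(response_text)
    return bool(states) and all(state == "SUCCESS" for state in states.values())


if __name__ == "__main__":
    # Trimmed, hand-written sample; a real response carries many more fields.
    sample = '{"snapshots": [{"snapshot": "es-20200224100900", "state": "SUCCESS"}]}'
    print(snapshot_states(sample))
    print(all_succeeded(sample))
```

Because the action above sets partial: True, a snapshot can also finish in the PARTIAL state when some shards could not be captured, so treating anything other than SUCCESS as a failure is the cautious choice.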

References

https://elasticsearch.cn/article/560

How to use Elasticsearch snapshot backups

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/examples.html

https://github.com/elastic/curator

https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html