[PYTHON] Solution when elasticsearch-curator throws a DistributionNotFound error on EC2

What is Curator (elasticsearch-curator)?

Curator is an index maintenance tool for Elasticsearch, officially developed by Elasticsearch.

Errors and solutions

Install Curator

$ python --version
Python 2.6.9

$ pip --version
pip 1.5.6 from /usr/lib/python2.6/site-packages (python 2.6)

# As of 2014/12, the latest version is v2.0.2
# The installation itself completed without problems
$ sudo pip install elasticsearch-curator==2.0.2

DistributionNotFound error

When I tried to verify the installation afterwards, an error occurred.

$ curator -v
Traceback (most recent call last):
  File "/usr/bin/curator", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: elasticsearch>=1.0.0,<2.0.0

The error message says that Python's elasticsearch module cannot be found, but it was in fact installed together with elasticsearch-curator as a dependency, and the installed version (1.2.0) is within the required range.

$ pip list | grep elasticsearch
elasticsearch (1.2.0)
elasticsearch-curator (2.0.2)
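
One way to dig further is to run the same kind of check that the wrapper script performs, directly from Python. The following is a minimal sketch, not part of Curator: check_requirement is a hypothetical helper name, and it only assumes that pkg_resources (shipped with setuptools) is importable. It should be run with the same interpreter that /usr/bin/curator uses.

```python
def check_requirement(requirement):
    """Return (ok, detail) for a requirement string.

    Hypothetical helper: it calls pkg_resources.require(), the same
    resolution step that appears in the traceback above.
    """
    try:
        import pkg_resources  # shipped with setuptools
    except ImportError as exc:
        return False, "pkg_resources unavailable: %s" % exc
    try:
        dists = pkg_resources.require(requirement)
    except Exception as exc:  # e.g. DistributionNotFound, VersionConflict
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, ", ".join(str(dist) for dist in dists)


# The requirement string is copied from the error message.
ok, detail = check_requirement("elasticsearch>=1.0.0,<2.0.0")
print("%s -> %s" % ("OK" if ok else "FAILED", detail))
```

If this reports FAILED even though pip list shows the package, pip and this interpreter may not be looking at the same set of installed distributions.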

Similar issues have been reported on GitHub and Google Groups, but none of them seems to have a clear solution.

I also tried reinstalling the elasticsearch module myself, but that did not help.

Solution

So I decided to try the last resort mentioned at https://github.com/elasticsearch/curator/issues/56#issuecomment-51628636:

"At worst, you'll have to run it from /Library/Python/2.7/site-packages/curator/curator.py (which is where pip installs it on mine)."

(According to https://github.com/elasticsearch/curator/issues/56#issuecomment-56980489, in the v2 series you use curator_script.py instead of curator.py.)

First, in --dry-run mode:

$ python /usr/lib/python2.6/site-packages/curator/curator_script.py -v
curator_script.py 2.0.2

# DRY RUN
$ python /usr/lib/python2.6/site-packages/curator/curator_script.py --host localhost --port 9200 --logfile /tmp/curator.log --loglevel DEBUG --dry-run delete --older-than 30

$ less /tmp/curator.log
2014-12-16 04:00:58,990 INFO      Job starting...
2014-12-16 04:00:58,990 INFO      DRY RUN MODE.  No changes will be made.
2014-12-16 04:00:59,187 DEBUG     Detected Elasticsearch version 1.4.1
2014-12-16 04:00:59,187 DEBUG     Setting default timestring for days to %Y.%m.%d
2014-12-16 04:00:59,187 DEBUG     Matching indices with pattern: logstash-%Y.%m.%d
2014-12-16 04:00:59,187 DEBUG     argdict = {'url_prefix': '', 'func': <function delete at 0x262c758>, 'prefix': 'logstash-', 'log_level': 'DEBUG', 'timestring': '%Y.%m.%d', 'dry_run': True, 'exclude_pattern': None, 'logformat': 'Default', 'auth': None, 'ssl': False, 'host': 'localhost', 'command': 'delete', 'time_unit': 'days', 'timeout': 30, 'debug': False, 'disk_space': None, 'log_file': '/tmp/curator.log', 'master_only': False, 'port': 9200, 'older_than': 30, 'suffix': ''}
2014-12-16 04:00:59,188 INFO      DRY RUN: Deleting indices...
2014-12-16 04:00:59,394 INFO      logstash-2014.12.11 is within the threshold period (30 days).
2014-12-16 04:00:59,420 INFO      logstash-2014.12.12 is within the threshold period (30 days).
2014-12-16 04:00:59,420 INFO      logstash-2014.12.13 is within the threshold period (30 days).
2014-12-16 04:00:59,420 INFO      logstash-2014.12.14 is within the threshold period (30 days).
2014-12-16 04:00:59,420 INFO      logstash-2014.12.15 is within the threshold period (30 days).
2014-12-16 04:00:59,420 INFO      logstash-2014.12.16 is within the threshold period (30 days).
2014-12-16 04:00:59,421 INFO      DRY RUN: Speficied indices deleted.
2014-12-16 04:00:59,421 INFO      Done in 0:00:00.516716.

That looks good, so next I ran it against the actual Elasticsearch indices.
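
The runs below reuse the same connection and logging options as the dry run and differ only in the subcommand and the --older-than value. As a minimal sketch of how those command lines could be assembled in one place (build_command and JOBS are hypothetical names, and the script path is the one from this environment), something like the following would do. It only prints the commands; it does not execute them.

```python
# Path of curator_script.py in this article's environment (Python 2.6 on EC2).
SCRIPT = "/usr/lib/python2.6/site-packages/curator/curator_script.py"


def build_command(command, older_than, dry_run=False, host="localhost",
                  port=9200, logfile="/tmp/curator.log", loglevel="DEBUG"):
    """Build the argument list for one curator_script.py invocation.

    Only options that appear in this article are used.
    """
    args = ["python", SCRIPT,
            "--host", host,
            "--port", str(port),
            "--logfile", logfile,
            "--loglevel", loglevel]
    if dry_run:
        args.append("--dry-run")
    args += [command, "--older-than", str(older_than)]
    return args


# Subcommands and thresholds (in days) used in the runs below.
JOBS = [("bloom", 3), ("close", 4), ("delete", 5)]

for command, days in JOBS:
    print(" ".join(build_command(command, days)))
```

To actually run a command, the returned list could be passed to subprocess.check_call, ideally with dry_run=True first.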

bloom command

$ python /usr/lib/python2.6/site-packages/curator/curator_script.py --host localhost --port 9200 --logfile /tmp/curator.log --loglevel DEBUG bloom --older-than 3

$ less /tmp/curator.log
2014-12-17 01:00:04,630 INFO      Job starting...
2014-12-17 01:00:04,646 DEBUG     Detected Elasticsearch version 1.4.1
2014-12-17 01:00:04,646 DEBUG     Setting default timestring for days to %Y.%m.%d
2014-12-17 01:00:04,646 DEBUG     Matching indices with pattern: logstash-%Y.%m.%d
2014-12-17 01:00:04,646 DEBUG     argdict = {'url_prefix': '', 'prefix': 'logstash-', 'log_level': 'DEBUG', 'timestring': '%Y.%m.%d', 'dry_run': False, 'exclude_pattern': None, 'logformat': 'Default', 'auth': None, 'ssl': False, 'host': 'localhost', 'command': 'bloom', 'time_unit': 'days', 'timeout': 30, 'debug': False, 'func': <function bloom at 0xc49668>, 'log_file': '/tmp/curator.log', 'master_only': False, 'port': 9200, 'older_than': 3, 'suffix': ''}
2014-12-17 01:00:04,647 INFO      Disabling the bloom filter cache for indices...
2014-12-17 01:00:04,895 INFO      disable_bloom_filter operation succeeded on logstash-2014.12.11
2014-12-17 01:00:05,106 INFO      disable_bloom_filter operation succeeded on logstash-2014.12.12
2014-12-17 01:00:05,428 INFO      disable_bloom_filter operation succeeded on logstash-2014.12.13
2014-12-17 01:00:05,888 INFO      disable_bloom_filter operation succeeded on logstash-2014.12.14
2014-12-17 01:00:05,888 INFO      logstash-2014.12.15 is within the threshold period (3 days).
2014-12-17 01:00:05,888 INFO      logstash-2014.12.16 is within the threshold period (3 days).
2014-12-17 01:00:05,888 INFO      logstash-2014.12.17 is within the threshold period (3 days).
2014-12-17 01:00:05,888 INFO      Disabled bloom filter cache for specified indices.
2014-12-17 01:00:05,889 INFO      Done in 0:00:01.415160.

close command

$ python /usr/lib/python2.6/site-packages/curator/curator_script.py --host localhost --port 9200 --logfile /tmp/curator.log --loglevel DEBUG close --older-than 4

$ less /tmp/curator.log
2014-12-17 01:00:06,946 INFO      Job starting...
2014-12-17 01:00:07,024 DEBUG     Detected Elasticsearch version 1.4.1
2014-12-17 01:00:07,024 DEBUG     Setting default timestring for days to %Y.%m.%d
2014-12-17 01:00:07,024 DEBUG     Matching indices with pattern: logstash-%Y.%m.%d
2014-12-17 01:00:07,025 DEBUG     argdict = {'url_prefix': '', 'prefix': 'logstash-', 'log_level': 'DEBUG', 'timestring': '%Y.%m.%d', 'dry_run': False, 'exclude_pattern': None, 'logformat': 'Default', 'auth': None, 'ssl': False, 'host': 'localhost', 'command': 'close', 'time_unit': 'days', 'timeout': 30, 'debug': False, 'func': <function close at 0x2b0c6e0>, 'log_file': '/tmp/curator.log', 'master_only': False, 'port': 9200, 'older_than': 4, 'suffix': ''}
2014-12-17 01:00:07,025 INFO      Closing indices...
2014-12-17 01:00:08,457 INFO      close_index operation succeeded on logstash-2014.12.11
2014-12-17 01:00:08,841 INFO      close_index operation succeeded on logstash-2014.12.12
2014-12-17 01:00:09,225 INFO      close_index operation succeeded on logstash-2014.12.13
2014-12-17 01:00:09,225 INFO      logstash-2014.12.14 is within the threshold period (4 days).
2014-12-17 01:00:09,226 INFO      logstash-2014.12.15 is within the threshold period (4 days).
2014-12-17 01:00:09,226 INFO      logstash-2014.12.16 is within the threshold period (4 days).
2014-12-17 01:00:09,226 INFO      logstash-2014.12.17 is within the threshold period (4 days).
2014-12-17 01:00:09,226 INFO      Closed specified indices.
2014-12-17 01:00:09,226 INFO      Done in 0:00:02.293359.

delete command

$ python /usr/lib/python2.6/site-packages/curator/curator_script.py --host localhost --port 9200 --logfile /tmp/curator.log --loglevel DEBUG delete --older-than 5

$ less /tmp/curator.log
2014-12-17 01:00:10,375 INFO      Job starting...
2014-12-17 01:00:10,382 DEBUG     Detected Elasticsearch version 1.4.1
2014-12-17 01:00:10,382 DEBUG     Setting default timestring for days to %Y.%m.%d
2014-12-17 01:00:10,382 DEBUG     Matching indices with pattern: logstash-%Y.%m.%d
2014-12-17 01:00:10,382 DEBUG     argdict = {'url_prefix': '', 'func': <function delete at 0x2578758>, 'prefix': 'logstash-', 'log_level': 'DEBUG', 'timestring': '%Y.%m.%d', 'dry_run': False, 'exclude_pattern': None, 'logformat': 'Default', 'auth': None, 'ssl': False, 'host': 'localhost', 'command': 'delete', 'time_unit': 'days', 'timeout': 30, 'debug': False, 'disk_space': None, 'log_file': '/tmp/curator.log', 'master_only': False, 'port': 9200, 'older_than': 5, 'suffix': ''}
2014-12-17 01:00:10,382 INFO      Deleting indices...
2014-12-17 01:00:11,632 INFO      delete_index operation succeeded on logstash-2014.12.11
2014-12-17 01:00:11,727 INFO      delete_index operation succeeded on logstash-2014.12.12
2014-12-17 01:00:11,727 INFO      logstash-2014.12.13 is within the threshold period (5 days).
2014-12-17 01:00:11,727 INFO      logstash-2014.12.14 is within the threshold period (5 days).
2014-12-17 01:00:11,727 INFO      logstash-2014.12.15 is within the threshold period (5 days).
2014-12-17 01:00:11,728 INFO      logstash-2014.12.16 is within the threshold period (5 days).
2014-12-17 01:00:11,728 INFO      logstash-2014.12.17 is within the threshold period (5 days).
2014-12-17 01:00:11,728 INFO      Speficied indices deleted.
2014-12-17 01:00:11,728 INFO      Done in 0:00:01.437909.
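
Judging from the three logs above, --older-than N picked up every index whose date was at least N days before the run date (2014-12-17). The following is a minimal sketch of that rule as read from the logs, not Curator's actual implementation; select_older_than is a hypothetical name, and the prefix and timestring defaults are the ones shown in the DEBUG lines.

```python
from datetime import date, datetime


def select_older_than(index_names, days, today,
                      prefix="logstash-", timestring="%Y.%m.%d"):
    """Return the indices whose date is at least `days` days before `today`.

    This is a rule inferred from the logs, not Curator's own code.
    """
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            index_date = datetime.strptime(name[len(prefix):], timestring).date()
        except ValueError:
            continue  # the name does not match the timestring pattern
        if (today - index_date).days >= days:
            selected.append(name)
    return selected


# Index names and run date taken from the logs above.
indices = ["logstash-2014.12.%02d" % day for day in range(11, 18)]
run_date = date(2014, 12, 17)

for days in (3, 4, 5):
    print("%d: %s" % (days, ", ".join(select_older_than(indices, days, run_date))))
```

With these inputs the sketch selects 12.11 through 12.14 for 3 days, 12.11 through 12.13 for 4 days, and 12.11 and 12.12 for 5 days, which matches what bloom, close and delete reported.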

It's done! ٩ (๑´3`๑) ۶

Bonus

It is convenient to define an alias that looks up the location of Python's site-packages dynamically, so the path does not have to be hard-coded.

alias curator="python $(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')/curator/curator_script.py"
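
The same lookup can be written as a small Python function, which is easier to reuse from a wrapper script. This is a minimal sketch assuming Python 2.7 or later, where the standard sysconfig module is available (on 2.6, the distutils call used in the alias plays the same role); curator_script_path is a hypothetical name.

```python
import os
import sysconfig


def curator_script_path(site_packages=None):
    """Return the expected location of curator_script.py.

    If site_packages is not given, the pure-Python site-packages
    directory of the running interpreter is used.
    """
    if site_packages is None:
        site_packages = sysconfig.get_paths()["purelib"]
    return os.path.join(site_packages, "curator", "curator_script.py")


print(curator_script_path())
```

The function only builds the path; it does not check that Curator is actually installed there.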
