Index List

Index filtering is performed using the filter_ methods of curator.IndexList

class curator.IndexList(client, search_pattern='_all')

Bases: object

IndexList class

alias_index_check(data)

Check each index in data to see if it’s an alias.

all_indices

All indices in the cluster at instance creation time. Type: list

client

An Elasticsearch client object passed from param client

data_getter(data, exec_func)

Function that prevents unnecessary code repetition for different data getter methods

empty_list_check()

Raise NoIndices if indices is empty

filter_allocated(key=None, value=None, allocation_type='require', exclude=True)

Match indices that have the routing allocation rule of key=value from indices

Parameters:
  • key – The allocation attribute to check for

  • value – The value to check for

  • allocation_type – Type of allocation to apply

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True
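The exclude flag is described the same way for every filter_ method. A minimal pure-Python sketch of that behavior follows; apply_exclude is a hypothetical helper written for illustration, not part of the curator API, and it operates on plain lists of names rather than on a live cluster.

```python
def apply_exclude(indices, matching, exclude):
    """Hypothetical helper (not Curator's API) mirroring the documented
    exclude semantics for a list of index names."""
    matching = set(matching)
    if exclude:
        # exclude=True: drop the indices that matched the filter
        return [name for name in indices if name not in matching]
    # exclude=False: keep only the indices that matched the filter
    return [name for name in indices if name in matching]


indices = ["logs-1", "logs-2", "metrics-1"]
matched = ["logs-1", "logs-2"]

kept_when_excluding = apply_exclude(indices, matched, exclude=True)
kept_when_including = apply_exclude(indices, matched, exclude=False)
```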

filter_by_age(source='name', direction=None, timestring=None, unit=None, unit_count=None, field=None, stats_result='min_value', epoch=None, exclude=False, unit_count_pattern=False)

Match indices by relative age calculations.

Parameters:
  • source – Source of index age. Can be one of name, creation_date, or field_stats

  • direction – Time to filter, either older or younger

  • timestring – A time.strftime() string to match the datestamp in an index name. Only used for index filtering by name.

  • unit – One of seconds, minutes, hours, days, weeks, months, or years.

  • unit_count – The count of unit. unit_count * unit will be calculated out to the relative number of seconds.

  • unit_count_pattern – A regular expression whose capture group identifies the value for unit_count.

  • field – A timestamp field name. Only used for field_stats based calculations.

  • stats_result – Either min_value or max_value. Only used in conjunction with source=field_stats to choose whether to reference the minimum or maximum result value.

  • epoch – An epoch timestamp used in conjunction with unit and unit_count to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
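The unit_count * unit calculation and the role of epoch can be sketched as below. This is an illustration only, not Curator's code: point_of_reference and is_older are hypothetical names, and treating a month as 30 days and a year as 365 days is an assumption of this sketch.

```python
import time

# Seconds per unit. The 30-day month and 365-day year are assumptions
# made for this sketch.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 3600,
    "days": 86400,
    "weeks": 7 * 86400,
    "months": 30 * 86400,
    "years": 365 * 86400,
}


def point_of_reference(unit, unit_count, epoch=None):
    """Hypothetical helper: epoch seconds lying unit_count * unit in the past."""
    if epoch is None:
        epoch = time.time()  # fall back to the current time
    return epoch - unit_count * SECONDS_PER_UNIT[unit]


def is_older(index_epoch, unit, unit_count, epoch=None):
    """True if the index timestamp falls before the point of reference."""
    return index_epoch < point_of_reference(unit, unit_count, epoch)


now = 1_700_000_000  # fixed epoch so the result is reproducible
cutoff = point_of_reference("days", 30, epoch=now)
```

Passing a fixed epoch, as here, makes the comparison repeatable; leaving it out uses the current time.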

filter_by_alias(aliases=None, exclude=False)

Match indices which are associated with the alias or list of aliases identified by aliases. Indices must appear in all aliases in list aliases or a 404 error will result, leading to no indices being matched.

Parameters:
  • aliases (list) – A list of alias names.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False

filter_by_count(count=None, reverse=True, use_age=False, pattern=None, source='creation_date', timestring=None, field=None, stats_result='min_value', exclude=True)

Remove indices from the actionable list beyond the number count, sorted reverse-alphabetically by default. If you set reverse=False, it will be sorted alphabetically.

The default is usually what you will want. If only one kind of index is provided (for example, indices matching logstash-%Y.%m.%d), then reverse alphabetical sorting will mean the oldest will remain in the list, because lower numbers in the dates mean older indices.

With reverse=False, index3 will be deleted before index2, which will be deleted before index1.

use_age allows ordering indices by age. Age is determined by the index creation date by default, but you can specify a source of name, max_value, or min_value. The name source requires the timestring argument.

Parameters:
  • count – Filter indices beyond count.

  • reverse – The filtering direction. (default: True).

  • use_age – Sort indices by age. source is required in this case.

  • pattern – Select indices to count from a regular expression pattern. This pattern must have one and only one capture group. This can allow a single count filter instance to operate against any number of matching patterns, and keep count of each index in that group. For example, given a pattern of '^(.*)-\d{6}$', it will match both rollover-000001 and index-999990, but not logstash-2017.10.12. Following the same example, if my cluster also had rollover-000002 through rollover-000010 and index-888888 through index-999999, it will process both groups of indices, and include or exclude the count of each.

  • source – Source of index age. Can be one of name, creation_date, or field_stats. Default: creation_date

  • timestring – A time.strftime() string to match the datestamp in an index name. Only used if source=name.

  • field – A timestamp field name. Only used if source=field_stats.

  • stats_result – Either min_value or max_value. Only used if source=field_stats. It determines whether to reference the minimum or maximum value of field in each index.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True
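The grouping effect of pattern can be checked with Python's re module alone. The sketch below uses the pattern and index names from the description above; the grouping code and the within_count name are illustrative assumptions, not Curator's implementation.

```python
import re
from collections import defaultdict

# The example pattern: exactly one capture group, followed by six digits.
pattern = re.compile(r"^(.*)-\d{6}$")

names = [
    "rollover-000001",
    "rollover-000002",
    "index-999990",
    "index-999999",
    "logstash-2017.10.12",
]

groups = defaultdict(list)
unmatched = []
for name in names:
    match = pattern.match(name)
    if match:
        # The capture group value identifies which group the index belongs to.
        groups[match.group(1)].append(name)
    else:
        unmatched.append(name)

# For count=1 and reverse=True: the first name in each group after a
# reverse-alphabetical sort, counted separately per group.
count = 1
within_count = {
    key: sorted(members, reverse=True)[:count] for key, members in groups.items()
}
```

Each group is counted on its own, so rollover-* and index-* indices never compete for the same count.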

filter_by_regex(kind=None, value=None, exclude=False)

Match indices by regular expression (pattern).

Parameters:
  • kind – Can be one of: suffix, prefix, regex, or timestring. This option defines what kind of filter you will be building.

  • value – Depends on kind. It is the time.strftime() string if kind is timestring. It’s used to build the regular expression for other kinds.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False
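One way value could be turned into a regular expression for the prefix, suffix, and regex kinds is sketched below. The templates and the build_pattern helper are hypothetical and chosen for illustration; Curator's own expressions may differ. The timestring kind is omitted because it also needs a strftime-to-regex conversion.

```python
import re

# Hypothetical templates for this sketch; Curator's own expressions may differ.
TEMPLATES = {
    "prefix": r"^{0}.*$",
    "suffix": r"^.*{0}$",
    "regex": r"{0}",
}


def build_pattern(kind, value):
    """Hypothetical helper: compile a pattern for the given kind and value."""
    # value is inserted as-is, so regex metacharacters in it stay active.
    return re.compile(TEMPLATES[kind].format(value))


names = ["logstash-2017.10.12", "metrics-2017.10.12", "logstash-archive"]

prefix_hits = [n for n in names if build_pattern("prefix", "logstash-").match(n)]
suffix_hits = [n for n in names if build_pattern("suffix", "-archive").match(n)]
regex_hits = [n for n in names if build_pattern("regex", r"^metrics-\d{4}").match(n)]
```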

filter_by_shards(number_of_shards=None, shard_filter_behavior='greater_than', exclude=False)

Match indices with a given shard count.

Selects all indices with a shard count greater_than number_of_shards by default. Use shard_filter_behavior to select indices with shard count greater_than, greater_than_or_equal, less_than, less_than_or_equal, or equal to number_of_shards.

Parameters:
  • number_of_shards – shard threshold

  • shard_filter_behavior – Do you want to filter on greater_than, greater_than_or_equal, less_than, less_than_or_equal, or equal?

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False
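The five shard_filter_behavior values map naturally onto comparison operators. The sketch below shows that mapping with the standard operator module; shard_matches and the sample shard counts are hypothetical and do not query a cluster.

```python
import operator

# Map each documented shard_filter_behavior value to a comparison.
COMPARATORS = {
    "greater_than": operator.gt,
    "greater_than_or_equal": operator.ge,
    "less_than": operator.lt,
    "less_than_or_equal": operator.le,
    "equal": operator.eq,
}


def shard_matches(shard_count, number_of_shards, shard_filter_behavior="greater_than"):
    """Hypothetical helper: does an index's shard count satisfy the comparison?"""
    return COMPARATORS[shard_filter_behavior](shard_count, number_of_shards)


# Made-up shard counts keyed by index name.
shard_counts = {"small": 1, "medium": 3, "large": 5}

over_three = [name for name, n in shard_counts.items() if shard_matches(n, 3)]
at_least_three = [
    name
    for name, n in shard_counts.items()
    if shard_matches(n, 3, "greater_than_or_equal")
]
```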

filter_by_size(size_threshold=None, threshold_behavior='greater_than', exclude=False, size_behavior='primary')

Remove indices from the actionable list based on index size.

threshold_behavior, when set to greater_than (default), includes an index if it tests larger than size_threshold. When set to less_than, it includes an index if it is smaller than size_threshold.

Parameters:
  • size_threshold – Filter indices over n gigabytes

  • threshold_behavior – Size to filter, either greater_than or less_than. Defaults to greater_than to preserve backwards compatibility.

  • size_behavior – Which size is used to filter, either primary or total. Defaults to primary

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False

filter_by_space(disk_space=None, reverse=True, use_age=False, source='creation_date', timestring=None, field=None, stats_result='min_value', exclude=False, threshold_behavior='greater_than')

Remove indices from the actionable list based on space consumed, sorted reverse-alphabetically by default. If you set reverse to False, it will be sorted alphabetically.

The default is usually what you will want. If only one kind of index is provided (for example, indices matching logstash-%Y.%m.%d), then reverse alphabetical sorting will mean the oldest will remain in the list, because lower numbers in the dates mean older indices.

With reverse set to False, index3 will be deleted before index2, which will be deleted before index1.

use_age allows ordering indices by age. Age is determined by the index creation date by default, but you can specify a source of name, max_value, or min_value. The name source requires the timestring argument.

threshold_behavior, when set to greater_than (default), includes an index if it tests larger than disk_space. When set to less_than, it includes an index if it is smaller than disk_space.

Parameters:
  • disk_space – Filter indices over n gigabytes

  • threshold_behavior – Size to filter, either greater_than or less_than. Defaults to greater_than to preserve backwards compatibility.

  • reverse – The filtering direction. (default: True). Ignored if use_age is True

  • use_age – Sort indices by age. source is required in this case.

  • source – Source of index age. Can be one of name, creation_date, or field_stats. Default: creation_date

  • timestring – A time.strftime() string to match the datestamp in an index name. Only used if source=name is selected.

  • field – A timestamp field name. Only used if source=field_stats is selected.

  • stats_result – Either min_value or max_value. Only used if source=field_stats is selected. It determines whether to reference the minimum or maximum value of field in each index.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False
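A rough sketch of filtering by space consumed with the default greater_than behavior is shown below. It assumes that sizes are accumulated as a running total over the sorted list and that one gigabyte is counted as 2**30 bytes; both are assumptions of this sketch, and over_space_limit is a hypothetical helper, not a Curator function.

```python
GIB = 2 ** 30  # assumption: one "gigabyte" is counted as 2**30 bytes


def over_space_limit(sizes_in_bytes, disk_space, reverse=True):
    """Hypothetical helper: names at which the running total exceeds disk_space."""
    limit = disk_space * GIB
    running_total = 0
    matched = []
    # Reverse-alphabetical order puts the newest date-stamped names first.
    for name in sorted(sizes_in_bytes, reverse=reverse):
        running_total += sizes_in_bytes[name]
        if running_total > limit:
            matched.append(name)
    return matched


# Made-up sizes for three daily indices of 2 GiB each.
sizes = {
    "logstash-2017.10.10": 2 * GIB,
    "logstash-2017.10.11": 2 * GIB,
    "logstash-2017.10.12": 2 * GIB,
}

matched = over_space_limit(sizes, disk_space=3)
```

With a 3 GiB limit, the newest index fits and the two older ones push the running total over the limit.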

filter_closed(exclude=True)

Filter out closed indices from indices

Parameters:

exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_empty(exclude=True)

Filter indices with a document count of zero. Indices that are closed are automatically excluded from consideration due to closed indices reporting a document count of zero.

Parameters:

exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_forceMerged(max_num_segments=None, exclude=True)

Match any index which has max_num_segments per shard or fewer in the actionable list.

Parameters:
  • max_num_segments – Cutoff number of segments per shard.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_ilm(exclude=True)

Match indices that have the setting index.lifecycle.name

Parameters:

exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_kibana(exclude=True)

Match any index named .kibana* in indices. Older releases addressed index names that no longer exist.

Parameters:

exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_none()

The legendary NULL filter

filter_opened(exclude=True)

Filter out opened indices from indices

Parameters:

exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is True

filter_period(period_type='relative', source='name', range_from=None, range_to=None, date_from=None, date_to=None, date_from_format=None, date_to_format=None, timestring=None, unit=None, field=None, stats_result='min_value', intersect=False, week_starts_on='sunday', epoch=None, exclude=False)

Match indices with ages within a given period.

Parameters:
  • period_type – Can be either absolute or relative. Default is relative. date_from and date_to are required when using period_type='absolute'. range_from and range_to are required with period_type='relative'.

  • source – Source of index age. Can be name, creation_date, or field_stats

  • range_from – How many unit(s) in the past/future is the origin?

  • range_to – How many unit(s) in the past/future is the end point?

  • date_from – The simplified date for the start of the range

  • date_to – The simplified date for the end of the range. If this value is the same as date_from, the full value of unit will be extrapolated for the range. For example, if unit=months, and date_from and date_to are both 2017.01, then the entire month of January 2017 will be the absolute date range.

  • date_from_format – The time.strftime() string used to parse date_from

  • date_to_format – The time.strftime() string used to parse date_to

  • timestring – A time.strftime() string to match the datestamp in an index name. Only used for index filtering by name.

  • unit – One of hours, days, weeks, months, or years.

  • field – A timestamp field name. Only used for field_stats based calculations.

  • stats_result – Either min_value or max_value. Only used in conjunction with source='field_stats' to choose whether to reference the min or max result value.

  • intersect – Only used when source='field_stats'. If True, only indices where both min_value and max_value are within the period will be selected. If False, it will use whichever you specified. Default is False to preserve expected behavior.

  • week_starts_on – Either sunday or monday. Default is sunday

  • epoch – An epoch timestamp used to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude=True, this filter will remove matching indices from indices. If exclude=False, then only matching indices will be kept in indices. Default is False
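The date_to example above, where date_from and date_to are both 2017.01 with unit=months, can be worked through with the standard library. This sketch assumes UTC and uses a hypothetical month_bounds helper; it illustrates the extrapolation to a full month and is not Curator's implementation.

```python
import calendar
from datetime import datetime, timezone


def month_bounds(date_string, date_format="%Y.%m"):
    """Hypothetical helper: first and last second (UTC) of the given month."""
    start = datetime.strptime(date_string, date_format).replace(tzinfo=timezone.utc)
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(start.year, start.month)[1]
    end = start.replace(day=last_day, hour=23, minute=59, second=59)
    return int(start.timestamp()), int(end.timestamp())


# date_from and date_to are both "2017.01", so the range covers all of January 2017.
start_epoch, end_epoch = month_bounds("2017.01")
```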

get_index_settings()

For each index in self.indices, populate index_info with:

  • creation_date

  • number_of_replicas

  • number_of_shards

  • routing information (if present)

get_index_state()

For each index in self.indices, populate index_info with the index state (open or closed) from the cat API.

get_index_stats()

Populate index_info with index size_in_bytes, primary_size_in_bytes and doc count information for each index.

get_segment_counts()

Populate index_info with segment information for each index.

index_info

Information extracted from indices, such as segment count, age, etc. Populated at instance creation time by private helper methods. Type: dict

indices

The running list of indices which will be used by one of the actions classes. Populated at instance creation time by private helper methods. Type: list

indices_exist(data, exec_func)

Check if indices exist. If one doesn’t, remove it. Loop until all exist

iterate_filters(filter_dict)

Iterate over the filters defined in config and execute them.

Parameters:

filter_dict – The configuration dictionary

Note

filter_dict should be a dictionary with the following form:

{ 'filters' : [
        {
            'filtertype': 'the_filter_type',
            'key1' : 'value1',
            ...
            'keyN' : 'valueN'
        }
    ]
}
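
A concrete dictionary in that form might look like the following. The filtertype values pattern and age, and the pairing of their keys with the filter_by_regex and filter_by_age parameters, are assumptions made for this sketch; the check at the end is illustrative and only confirms the documented shape.

```python
# Example configuration in the documented form. The filtertype values
# "pattern" and "age" are assumptions made for this sketch.
filter_dict = {
    "filters": [
        {
            "filtertype": "pattern",
            "kind": "prefix",
            "value": "logstash-",
        },
        {
            "filtertype": "age",
            "source": "name",
            "direction": "older",
            "timestring": "%Y.%m.%d",
            "unit": "days",
            "unit_count": 30,
        },
    ]
}

# Illustrative shape check: every entry needs a filtertype key.
missing = [
    position
    for position, entry in enumerate(filter_dict["filters"])
    if "filtertype" not in entry
]
```

With a live cluster, a dictionary like this would be passed to iterate_filters(filter_dict) on an IndexList instance.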

mitigate_alias(index)

Mitigate when an alias is detected instead of an index name

Parameters:

index (str) – The index name that is showing up instead of what was expected

Returns:

No return value

Return type:

None

needs_data(indices, fields)

Check for data population in self.index_info

population_check(index, key)

Verify that key is in self.index_info[index], and that it is populated

working_list()

Return the current value of indices as copy-by-value to prevent list stomping during iterations
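
The reason for iterating over a copy can be shown with plain lists. The sketch below does not use Curator; it only demonstrates that looping over a copy lets the original list be modified safely during the loop.

```python
indices = ["index1", "index2", "index3"]

# Loop over a copy (indices[:]) so that removing from the original list
# does not cause the iteration to skip elements.
for name in indices[:]:
    if name.endswith("2"):
        indices.remove(name)
```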