evolutionary_feature_selection.EvolutionaryFeatureSelection

class evolutionary_feature_selection.EvolutionaryFeatureSelection(predictor=None, scoring=None, n_features=1, population_size=10, n_breeders=5, mutation_rate=0.5, n_mutation_features=1, generations=10, initial_population=None, random_state=None, population_trace=False, fitness_trace=False, n_jobs=1)

A feature selection transformer that selects a feature set of a given size by evolutionary optimization.

The optimization is performed in a procreation-mutation-selection cycle with the fitness calculated as the validation score of the predictor fitted using a particular selection of features.
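
The cycle described above can be illustrated with a minimal, standard-library-only sketch. This is not the library's implementation: the function `evolve`, the crossover rule (sampling the child from the union of two parents), and the toy fitness function are assumptions made for illustration. In the real transformer the fitness is the validation score of the fitted predictor.

```python
import random

def evolve(fitness, n_total, n_features, population_size=10, n_breeders=5,
           mutation_rate=0.5, n_mutation_features=1, generations=10, seed=0):
    """Hypothetical procreation-mutation-selection loop over index tuples."""
    rng = random.Random(seed)
    # Each specimen is a sorted tuple of selected feature indices.
    population = [tuple(sorted(rng.sample(range(n_total), n_features)))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: rank by fitness and keep the best specimens as breeders.
        population.sort(key=fitness, reverse=True)
        breeders = population[:n_breeders]
        offspring = []
        while len(breeders) + len(offspring) < population_size:
            # Procreation (assumed rule): draw the child from both parents' features.
            a, b = rng.sample(breeders, 2)
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, n_features))
            if rng.random() < mutation_rate:
                # Mutation: swap some selected features for unselected ones.
                unused = [i for i in range(n_total) if i not in child]
                k = min(n_mutation_features, len(unused))
                dropped = rng.sample(sorted(child), k)
                added = rng.sample(unused, k)
                child = (child - set(dropped)) | set(added)
            offspring.append(tuple(sorted(child)))
        population = breeders + offspring
    population.sort(key=fitness, reverse=True)
    return population

# Toy fitness (assumption): count how many "informative" features were selected.
informative = {0, 1, 2}

def toy_fitness(specimen):
    return len(informative & set(specimen))

final = evolve(toy_fitness, n_total=20, n_features=3)
best = final[0]  # fittest specimen of the final population
```

Because the breeders are carried over unchanged in this sketch, the best fitness never decreases from one generation to the next.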

Parameters:
predictor : predictor

predictor to train on the reduced feature set

scoring : callable

scoring function to use as fitness function for specimen selection

n_features : int

number of features to select per specimen

population_size : int

number of specimens in the population

n_breeders : int

number of specimens from the population from which breeders are selected. Must be smaller than population_size.

mutation_rate : float, 0 <= mutation_rate <= 1

fraction of offspring per generation that are mutated

n_mutation_features : int, 0 < n_mutation_features < n_features, default = 0.1 * n_features

number of features to swap upon mutation.

generations : int, > 0

number of generations

initial_population : array-like, shape (N, n_features)

initial feature selections. If N < population_size, random specimens are generated to reach population_size; if N > population_size, only the first population_size specimens from initial_population are used. Useful for continuing previous evolutionary feature selection runs.

random_state : numpy.random.RandomState

random state for repeatability in testing

fitness_trace : bool, default False

if True, store a trace of the fitness values during fit. Use with care: this may lead to large memory consumption.

population_trace : bool, default False

if True, store a trace of the populations during fit. Use with care: this may lead to large memory consumption.

n_jobs : int, default=1

number of parallel jobs to use for the compute-heavy part of fit
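
The handling of initial_population described above (truncate when too many specimens are given, pad with random specimens when too few) might look like the following sketch. The helper name `complete_population`, the `n_total` argument, and the representation of specimens as lists of feature indices are assumptions for illustration, not the library's code.

```python
import random

def complete_population(initial_population, population_size, n_total, n_features, seed=0):
    """Hypothetical helper: truncate or pad a seeded population to population_size."""
    rng = random.Random(seed)
    # Too many specimens: keep only the first population_size of them.
    population = [list(s) for s in initial_population[:population_size]]
    # Too few specimens: fill up with random selections of n_features distinct indices.
    while len(population) < population_size:
        population.append(sorted(rng.sample(range(n_total), n_features)))
    return population

seeded = [[0, 1], [2, 3], [4, 5]]
truncated = complete_population(seeded, population_size=2, n_total=10, n_features=2)
padded = complete_population(seeded, population_size=5, n_total=10, n_features=2)
```

Here `truncated` holds the first two seeded specimens, and `padded` holds all three followed by two randomly generated ones.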

Attributes:
random_state_ : numpy random state

For testing and consistent parallel processing

population_ : ndarray, shape (population_size, n_features_in)

The full population of feature masks in the current iteration/generation

fitness_values_ : ndarray, shape (population_size,)

The fitness/score values of the population

current_specimen_ : ndarray, shape (n_features_in,)

The specimen with the highest fitness value, same as population_[0]

fitness_history_ : ndarray, shape (generations, population_size)

Trace of the population fitness values over all generations, for fit debugging and quality assessment

population_history_ : ndarray, shape (generations, population_size, n_features_in)

Trace of the populations over all generations
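
A fitness trace of shape (generations, population_size) can be condensed into a per-generation summary for a quick convergence check. The sketch below uses a plain nested list with made-up values in place of a real fitness_history_ array; the helper name `summarize_fitness` is hypothetical.

```python
def summarize_fitness(history):
    """Return (best, mean) fitness per generation from a generations x population_size table."""
    return [(max(row), sum(row) / len(row)) for row in history]

# Made-up illustrative values: 3 generations, 4 specimens each.
history = [
    [0.50, 0.40, 0.30, 0.20],
    [0.60, 0.50, 0.40, 0.30],
    [0.75, 0.60, 0.50, 0.25],
]
summary = summarize_fitness(history)
best_per_generation = [best for best, _ in summary]
```

A best-per-generation curve that has flattened out suggests that additional generations are unlikely to help much.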

Examples using evolutionary_feature_selection.EvolutionaryFeatureSelection

Evolutionary Feature Selection Transformer - Fitness of Specimens

Evolutionary Feature Selection Transformer - Features in Population