Feature selection (visualime.feature_selection)

visualime.feature_selection.select_by_weight(samples: numpy.ndarray, predictions: numpy.ndarray, label_idx: int, model_type: typing.Literal['linear_regression', 'lasso', 'ridge', 'bayesian_ridge', 'bayesian_ridge_fixed_lambda', 'bayesian_ridge_fixed_alpha_lambda'] = 'bayesian_ridge', model_params: typing.Dict[str, typing.Any] | None = None, distances: numpy.ndarray | None = None, kernel: typing.Callable[[numpy.ndarray], numpy.ndarray] = <function exponential_kernel>, num_segments_to_select: int | None = None) -> List[int]

Select the num_segments_to_select segments with the highest weight.

Parameters:
samples : np.ndarray

The samples generated by visualime.lime.generate_samples(): An array of shape (num_of_samples, num_of_segments).

predictions : np.ndarray

The predictions produced by visualime.lime.predict_images(): An array of shape (num_of_samples, num_of_classes).

label_idx : int

The index of the label to explain in the output of predict_fn(). Can be the class predicted by the model, or a different class.

model_type : str

The type of linear model to fit. Available options are: “linear_regression”, “lasso”, “ridge”, “bayesian_ridge”, “bayesian_ridge_fixed_lambda”, and “bayesian_ridge_fixed_alpha_lambda”.

See the scikit-learn documentation for details on each of the models.

model_params : dict, optional

Parameters to pass to the model during instantiation.

See the scikit-learn documentation for details on each of the models.

It is generally advisable to use the same model here as in the final call to visualime.lime.weigh_segments().

distances : np.ndarray, optional

The distances between each sample and the original image. They are passed through kernel to obtain the sample weights used when fitting the linear model.

If not given, the cosine distance between a sample and the original image is used. Note that this is only a rough approximation and not a good measure if the image contains a lot of variation or the segments differ greatly in size.

kernel : callable, default exponential_kernel

Kernel function to weigh the samples based on the distances.

Operates on the distances and returns an array of the same shape: kernel(distances: np.ndarray) -> np.ndarray

Defaults to an exponential kernel with width 0.25, as in the original LIME implementation.

num_segments_to_select : int, optional

The number of segments to select. If not given, select all segments.

Returns:
list of ints

List of the indices of the selected segments. Segments are ordered by descending weight.
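The idea behind weight-based selection can be illustrated with scikit-learn directly. The following is a minimal sketch, not the library's implementation: the helper names `exponential_kernel_sketch` and `select_by_weight_sketch` are hypothetical, a plain `Ridge` model stands in for the configurable `model_type`, the kernel follows the form used by the original LIME implementation, and ranking by signed coefficient (rather than by magnitude) is an assumption.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import pairwise_distances


def exponential_kernel_sketch(distances, kernel_width=0.25):
    # Assumed kernel form, following the original LIME implementation.
    return np.sqrt(np.exp(-(distances**2) / kernel_width**2))


def select_by_weight_sketch(samples, predictions, label_idx, num_segments_to_select=None):
    num_segments = samples.shape[1]
    if num_segments_to_select is None:
        num_segments_to_select = num_segments

    # Cosine distance between each sample and the unperturbed image,
    # represented here as a row with every segment switched on.
    original = np.ones((1, num_segments))
    distances = pairwise_distances(samples, original, metric="cosine").ravel()
    sample_weight = exponential_kernel_sketch(distances)

    # Fit a weighted linear model to the predictions for the chosen label.
    model = Ridge(alpha=1.0)
    model.fit(samples, predictions[:, label_idx], sample_weight=sample_weight)

    # Rank segments by descending coefficient (assumption: signed value).
    order = np.argsort(-model.coef_)
    return [int(i) for i in order[:num_segments_to_select]]
```

With `samples` of shape (num_of_samples, num_of_segments) and `predictions` of shape (num_of_samples, num_of_classes), calling `select_by_weight_sketch(samples, predictions, label_idx=0, num_segments_to_select=3)` returns three segment indices ordered by descending coefficient.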

visualime.feature_selection.forward_selection(samples: numpy.ndarray, predictions: numpy.ndarray, label_idx: int, model_type: typing.Literal['linear_regression', 'lasso', 'ridge', 'bayesian_ridge', 'bayesian_ridge_fixed_lambda', 'bayesian_ridge_fixed_alpha_lambda'] = 'ridge', model_params: typing.Dict[str, typing.Any] | None = None, distances: numpy.ndarray | None = None, kernel: typing.Callable[[numpy.ndarray], numpy.ndarray] = <function exponential_kernel>, num_segments_to_select: int | None = None) -> List[int]

Select num_segments_to_select segments through forward selection.

Parameters:
samples : np.ndarray

The samples generated by visualime.lime.generate_samples(): An array of shape (num_of_samples, num_of_segments).

predictions : np.ndarray

The predictions produced by visualime.lime.predict_images(): An array of shape (num_of_samples, num_of_classes).

label_idx : int

The index of the label to explain in the output of predict_fn(). Can be the class predicted by the model, or a different class.

model_type : str

The type of linear model to fit. Available options are: “linear_regression”, “lasso”, “ridge”, “bayesian_ridge”, “bayesian_ridge_fixed_lambda”, and “bayesian_ridge_fixed_alpha_lambda”.

See the scikit-learn documentation for details on each of the models.

model_params : dict, optional

Parameters to pass to the model during instantiation.

See the scikit-learn documentation for details on each of the models.

It is generally advisable to use the same model here as in the final call to visualime.lime.weigh_segments().

distances : np.ndarray, optional

The distances between each sample and the original image. They are passed through kernel to obtain the sample weights used when fitting the linear model.

If not given, the cosine distance between a sample and the original image is used. Note that this is only a rough approximation and not a good measure if the image contains a lot of variation or the segments differ greatly in size.

kernel : callable, default exponential_kernel

Kernel function to weigh the samples based on the distances.

Operates on the distances and returns an array of the same shape: kernel(distances: np.ndarray) -> np.ndarray

Defaults to an exponential kernel with width 0.25, as in the original LIME implementation.

num_segments_to_select : int, optional

The number of segments to select. If not given, select all segments.

Returns:
list of ints

List of the indices of the selected segments. The segments are ordered as they were selected.
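Forward selection is a greedy procedure: starting from an empty set, it repeatedly adds the segment that most improves the fit of the linear model. The following is a minimal sketch of that general technique, not the library's code. The function name `forward_selection_sketch` is hypothetical, it takes precomputed sample weights instead of `distances` and `kernel`, and both the `Ridge` model and the weighted R² score used to compare candidates are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge


def forward_selection_sketch(samples, targets, sample_weight, num_segments_to_select):
    selected = []
    remaining = list(range(samples.shape[1]))

    for _ in range(min(num_segments_to_select, len(remaining))):
        best_score, best_idx = -np.inf, None
        for idx in remaining:
            # Fit a model on the already selected segments plus one candidate.
            columns = samples[:, selected + [idx]]
            model = Ridge(alpha=1.0)
            model.fit(columns, targets, sample_weight=sample_weight)
            score = model.score(columns, targets, sample_weight=sample_weight)
            if score > best_score:
                best_score, best_idx = score, idx
        # Keep the candidate that gave the best weighted R^2 in this round.
        selected.append(best_idx)
        remaining.remove(best_idx)

    return selected
```

Here `targets` corresponds to `predictions[:, label_idx]`. The returned list is in selection order, which matches the ordering described above. Each round fits one model per remaining segment, so the cost grows roughly with the number of segments times the number of segments to select.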

visualime.feature_selection.lars_selection(samples: ndarray, predictions: ndarray, label_idx: int, num_segments_to_select: int | None = None) -> List[int]

Select up to num_segments_to_select segments using the LARS path method.

Parameters:
samples : np.ndarray

The samples generated by visualime.lime.generate_samples(): An array of shape (num_of_samples, num_of_segments).

predictions : np.ndarray

The predictions produced by visualime.lime.predict_images(): An array of shape (num_of_samples, num_of_classes).

label_idx : int

The index of the label to explain in the output of predict_fn(). Can be the class predicted by the model, or a different class.

num_segments_to_select : int, optional

The maximum number of segments to select. If not given, this value is set to the total number of segments.

Returns:
list of ints

List of the indices of the selected segments. The segment indices are in ascending order.
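One way to read "up to num_segments_to_select" is to compute the full regularization path and pick the last step with no more than the requested number of non-zero coefficients. The following is a minimal sketch under that assumption, using scikit-learn's `lars_path`. The function name `lars_selection_sketch` is hypothetical, and both the centering step and the choice of `method="lasso"` are choices of this sketch, not confirmed details of the library.

```python
import numpy as np
from sklearn.linear_model import lars_path


def lars_selection_sketch(samples, targets, num_segments_to_select):
    # lars_path fits no intercept, so center the data first (sketch choice).
    X = np.asarray(samples, dtype=float)
    X = X - X.mean(axis=0)
    y = np.asarray(targets, dtype=float)
    y = y - y.mean()

    # coefs has shape (num_of_segments, num_of_steps_on_the_path).
    _, _, coefs = lars_path(X, y, method="lasso")

    # Walk the path backwards and stop at the first step that uses
    # no more than the requested number of segments.
    selected = np.array([], dtype=int)
    for step_coefs in coefs.T[::-1]:
        nonzero = np.flatnonzero(step_coefs)
        if len(nonzero) <= num_segments_to_select:
            selected = nonzero
            break

    return [int(i) for i in selected]
```

Because `np.flatnonzero` returns positions in increasing order, the result is in ascending index order, and it can contain fewer than `num_segments_to_select` entries when no step on the path has exactly that many active segments.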