ICS

Module containing the main Invariant Coordinate Selection (ICS) Class and associated methods.

The ICS class provides methods to fit the ICS model to data, transform data using the fitted model, and print a detailed summary of the results. This module relies on scatter matrices (defined in the Scatter page). The ICS class supports three algorithms for applying ICS to data (‘standard’, ‘whiten’, and ‘QR’); the algorithm is chosen with a parameter at instantiation. Additional options, such as the choice of scatter matrices, centering of the data, and the sign-fixing rule, can also be set at instantiation.

This implementation is based on the function ICS-S3 from the R package ICS. For more details about the supported algorithms and ‘fix_signs’ argument, see the R package documentation (function ICS-S3).

class icspylab.ics.ICS(S1=<function cov>, S2=<function covW>, algorithm='whiten', center=False, fix_signs='scores', S1_args={}, S2_args={})[source]

Bases: object

Invariant Coordinate Selection (ICS) Class and associated methods.

This class implements the ICS algorithm: it transforms the data, via the simultaneous diagonalization of two scatter matrices, into an invariant coordinate system or independent components, depending on the underlying assumptions. It supports various scatter matrix calculations and offers multiple algorithms for applying ICS.
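As an illustration of what "simultaneous diagonalization" means here, the following minimal NumPy sketch finds a matrix W such that W S1 Wᵀ is the identity and W S2 Wᵀ is diagonal. It does not use icspylab: the `cov4` helper is a hand-written stand-in based on one common definition of the fourth-moment scatter, the data are synthetic, and the Cholesky-based route is a choice made for this sketch, not necessarily how the class computes W_.

```python
import numpy as np

def cov4(X):
    """Fourth-moment scatter (one common definition; stand-in, not icspylab's)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    # Squared Mahalanobis distance of each observation to the mean.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

# Synthetic data: a correlated Gaussian cloud plus a small shifted cluster.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3)) @ rng.standard_normal((3, 3))
X[:25] += 6.0

S1 = np.cov(X, rowvar=False)
S2 = cov4(X)

# Factor S1 = L L^T, then diagonalize the symmetric matrix L^{-1} S2 L^{-T}.
L = np.linalg.cholesky(S1)
L_inv = np.linalg.inv(L)
M = L_inv @ S2 @ L_inv.T
eigvals, U = np.linalg.eigh((M + M.T) / 2)

# Sort components by decreasing eigenvalue (generalized kurtosis).
order = np.argsort(eigvals)[::-1]
kurtosis = eigvals[order]
W = U[:, order].T @ L_inv   # each row holds the coefficients of one coordinate

print(np.round(W @ S1 @ W.T, 6))   # close to the identity matrix
print(np.round(W @ S2 @ W.T, 6))   # close to a diagonal matrix
```

The diagonal entries of W S2 Wᵀ are the eigenvalues of S1⁻¹S2, which play the role of the generalized kurtosis values in this sketch.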

Parameters:
  • S1 (function returning a scatter object) – (default: cov) Function to compute the first scatter matrix.

  • S2 (function returning a scatter object) – (default: covW) Function to compute the second scatter matrix.

  • algorithm (str) – (default: ‘whiten’) The algorithm used for transformation (‘standard’, ‘whiten’, ‘QR’).

  • center (bool) – (default: False) Whether the invariant coordinates should be centered with respect to the first location. Centering is only applicable if the first scatter object contains a location component; otherwise, center is set to False. Note that this only affects the scores of the invariant coordinates (attribute scores_), not the generalized kurtosis values (attribute kurtosis_).

  • fix_signs (str) – (default: ‘scores’) How to fix the signs of the invariant coordinates. Possible values are ‘scores’ to fix the signs based on (generalized) skewness values of the coordinates, or ‘W’ to fix the signs based on the coefficient matrix of the linear transformation.

  • S1_args (dict) – Additional arguments for S1.

  • S2_args (dict) – Additional arguments for S2.
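The two fix_signs options can be pictured with a small NumPy sketch. It is an illustration under assumptions, not the library's code: for ‘scores’ it uses mean minus median as a stand-in for the generalized skewness, and for ‘W’ it makes the largest-magnitude coefficient in each row positive. Both function names are hypothetical, and the exact rules used by the class may differ (see the R package documentation).

```python
import numpy as np

def fix_signs_scores(W, Z):
    """Flip components so that mean minus median of each score column is >= 0.

    Hypothetical helper: mean minus median stands in for generalized skewness.
    """
    skew = Z.mean(axis=0) - np.median(Z, axis=0)
    signs = np.where(skew < 0, -1.0, 1.0)
    return W * signs[:, None], Z * signs

def fix_signs_W(W, Z):
    """Flip components so the largest-magnitude entry of each row of W is > 0.

    Hypothetical helper: one plausible coefficient-based convention.
    """
    idx = np.argmax(np.abs(W), axis=1)
    lead = W[np.arange(W.shape[0]), idx]
    signs = np.where(lead < 0, -1.0, 1.0)
    return W * signs[:, None], Z * signs

# Toy data and a made-up transformation matrix (not produced by ICS).
rng = np.random.default_rng(1)
X = rng.exponential(size=(200, 2))        # right-skewed features
W = np.array([[-1.0, 0.2], [0.3, 2.0]])
Z = X @ W.T                               # scores: one column per row of W

W_s, Z_s = fix_signs_scores(W, Z)
W_w, Z_w = fix_signs_W(W, Z)
```

In both cases a row of W and the matching column of the scores are flipped together, so the relation Z = X Wᵀ still holds after the signs are fixed.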

W_

Transformation matrix in which each row contains the coefficients of the linear transformation to the corresponding invariant coordinate.

Type:

np.ndarray

scores_

Transformed matrix in which each column contains the scores of the corresponding invariant coordinate.

Type:

np.ndarray
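The row/column convention of W_ and scores_ can be checked with a short NumPy sketch. The matrix `W` below is random and only stands in for a fitted W_, and the column means stand in for the location component of the first scatter; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(loc=[5.0, -2.0, 1.0], size=(100, 3))

W = rng.standard_normal((3, 3))   # stand-in for W_: one row per invariant coordinate
location = X.mean(axis=0)         # stand-in for the location of the first scatter

# Without centering: column j of the scores is X projected on row j of W.
scores_uncentered = X @ W.T

# With centering: subtract the location before applying the transformation.
scores_centered = (X - location) @ W.T

# Centering shifts every score column by a constant and changes nothing else.
shift = scores_uncentered - scores_centered
```

Because centering only shifts each column of the scores by a constant, it leaves the transformation matrix itself unchanged.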

kurtosis_

Generalized kurtosis values.

Type:

np.ndarray

skewness_

Skewness values.

Type:

np.ndarray

feature_names_in_

Names of features seen during fit. Defined only when X has feature names that are all strings.

Type:

np.ndarray

S1_X_

Fitted scatter S1. Defined only when center=True.

Type:

np.ndarray

Supported algorithms:
  1. standard: performs the spectral decomposition of the symmetric matrix \(S_1(X)^{-1/2}S_2(X)S_1(X)^{-1/2}\)

  2. whiten: whitens the data with respect to the first scatter matrix before computing the second scatter matrix.

  3. QR: numerically stable algorithm based on the QR decomposition, available for a common family of scatter pairs: S1 must be cov and S2 must be one of cov4, covW, or covAxis. See Archimbaud et al. (2023) for details.
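The first two algorithms can be compared with a plain NumPy sketch. It assumes S1 is the sample covariance and S2 is a hand-written `cov4` (one common definition of the fourth-moment scatter); the helper names are hypothetical and the code only illustrates the idea, not the library's implementation. The QR variant is not sketched here.

```python
import numpy as np

def cov4(X):
    """Fourth-moment scatter (one common definition; stand-in, not icspylab's)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def sorted_eigh(M):
    """Eigendecomposition of a symmetric matrix, eigenvalues in decreasing order."""
    vals, vecs = np.linalg.eigh((M + M.T) / 2)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Synthetic data: a correlated Gaussian cloud plus a small shifted cluster.
rng = np.random.default_rng(3)
X = rng.standard_normal((400, 3)) @ rng.standard_normal((3, 3))
X[:20] += 5.0

S1 = np.cov(X, rowvar=False)
S1_inv_sqrt = inv_sqrt(S1)

# 'standard': decompose S1^{-1/2} S2(X) S1^{-1/2}, with S2 computed on X.
kurt_std, U_std = sorted_eigh(S1_inv_sqrt @ cov4(X) @ S1_inv_sqrt)
W_std = U_std.T @ S1_inv_sqrt

# 'whiten': whiten X with S1 first, then compute S2 on the whitened data.
Y = X @ S1_inv_sqrt
kurt_wh, U_wh = sorted_eigh(cov4(Y))
W_wh = U_wh.T @ S1_inv_sqrt

print(kurt_std)
print(kurt_wh)   # agrees with kurt_std up to rounding for this scatter pair
```

For an affine equivariant S2 such as this `cov4`, the two routes agree up to rounding. They can differ when S2 is not affine equivariant, which is one reason to whiten before computing the second scatter.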

Examples

>>> from sklearn.datasets import load_iris
>>> from icspylab import ICS
>>> iris = load_iris()
>>> X = iris.data
>>> ics = ICS()  # use a lowercase name so the ICS class is not shadowed
>>> ics.fit(X)

describe()[source]

Print a summary of the ICS model.

The summary reports the algorithm used, whether the data was centered, and how the signs were fixed. It also displays the generalized kurtosis values, the transformation matrix, the transformed data, and the skewness values.

fit(X)[source]

Fit the ICS model to the data.

This method relies on several private helper methods to perform the ICS transformation: _validate_input, _compute_first_scatter, _compute_second_scatter, _transform_second_scatter, _compute_transformation, _compute_transformation_qr, _center_data, _fix_component_signs.

Parameters:

X (array-like) – Data to fit the ICS model, where rows are samples and columns are features.

Returns:

The fitted ICS object.

Return type:

self

fit_transform(X)[source]

Fit the ICS model and transform the data using the fitted ICS model.

Parameters:

X (array-like) – Data to fit and transform.

Returns:

Transformed matrix in which each column contains the scores of the corresponding invariant coordinate.

Return type:

np.ndarray

plot(**kwargs)[source]

Plot the transformed data using the fitted ICS model.

plot_kurtosis(**kwargs)[source]

Plot the generalized kurtosis values.

transform(X)[source]

Transform the data using the fitted ICS model.

Parameters:

X (array-like) – Data to transform.

Returns:

Transformed matrix in which each column contains the scores of the corresponding invariant coordinate.

Return type:

np.ndarray

References

  • Archimbaud, A., Drmac, Z., Nordhausen, K., Radojcic, U. and Ruiz-Gazen, A. (2023) Numerical Considerations and a New Implementation for Invariant Coordinate Selection. SIAM Journal on Mathematics of Data Science, 5(1), 97–121. doi:10.1137/22M1498759.

  • Tyler, D.E., Critchley, F., Duembgen, L. and Oja, H. (2009) Invariant Co-ordinate Selection. Journal of the Royal Statistical Society, Series B, 71(3), 549–592. doi:10.1111/j.1467-9868.2009.00706.x.