multispati_pca

multispaeti.multispati_pca(X, n_components=None, *, connectivity=None, center_sparse=False, use_gpu=False, random_state=None)

Calculate MULTISPATI-PCA and return the transformed data matrix and components.

For more information refer to multispaeti.MultispatiPCA.

This function is more efficient than multispaeti.MultispatiPCA.fit_transform() if the additional attributes are not needed.

Parameters:
  • X (ndarray or csr_array or csc_array) – Array of observations x features.

  • n_components (int or tuple[int, int], optional) – Number of components to keep. If None, all components are kept. If an int, the top n_components are kept. If a tuple, its first entry is the number of top components to keep and its second entry the number of bottom components.

  • connectivity (sparray or spmatrix) – Matrix of row-wise neighbor definitions, i.e. \(c_{ij}\) is the connectivity of \(i \to j\). The matrix does not have to be symmetric. It can be a binary adjacency matrix or a matrix of connectivities, in which case \(c_{ij}\) should be larger if \(i\) and \(j\) are close. A distance matrix should first be transformed to connectivities, e.g. by calculating \(1-d/d_{max}\).

  • center_sparse (bool) – Whether to center X if it is a sparse array. By default sparse X will not be centered as this requires transforming it to a dense array, potentially raising out-of-memory errors.

  • use_gpu (bool) – Whether to use GPU implementation based on cupy and cupyx.scipy (not installed by default). Eigendecomposition using the GPU is not as mature yet (especially for sparse arrays).

  • random_state (int | RandomState | None) – Used when X is sparse and center_sparse is False. Pass an int for reproducible results across multiple function calls.
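The connectivity parameter asks for distances to be converted to connectivities, for example via \(1-d/d_{max}\). The following is a minimal sketch of one way to do that with NumPy and SciPy. The helper name distances_to_connectivity, the radius cutoff and the toy coordinates are assumptions made for illustration and are not part of multispaeti.

```python
import numpy as np
from scipy.sparse import csr_array


def distances_to_connectivity(dist, radius):
    """Hypothetical helper: turn pairwise distances into sparse connectivities.

    Pairs farther apart than `radius`, and self-pairs, get no connection.
    All other pairs get 1 - d / d_max, so closer pairs get larger weights.
    Assumes at least two distinct points, so that d_max > 0.
    """
    dist = np.asarray(dist, dtype=float)
    d_max = dist.max()
    weights = 1.0 - dist / d_max
    # Keep only pairs within the cutoff and drop the diagonal (no self-loops).
    mask = dist <= radius
    np.fill_diagonal(mask, False)
    # Zeros in the dense array are not stored in the sparse result.
    return csr_array(np.where(mask, weights, 0.0))


# Toy 2D coordinates for four observations (illustrative only).
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

connectivity = distances_to_connectivity(dist, radius=2.5)
```

Note that the pair at distance \(d_{max}\) receives a weight of exactly zero under this formula, so it is not stored as a neighbor even if it lies within the cutoff. The resulting matrix could then be passed as the connectivity argument.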

Returns:

  • X_transformed (numpy.ndarray or cupy.ndarray)

  • components (numpy.ndarray or cupy.ndarray)
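
The three accepted forms of n_components (None, an int, or a tuple) can be illustrated with a small sketch that picks component indices from a vector of eigenvalues. It assumes that "top" means the largest eigenvalues and "bottom" the smallest. The helper select_indices and the example eigenvalues are hypothetical, and this is not the library's implementation.

```python
import numpy as np


def select_indices(eigenvalues, n_components=None):
    """Hypothetical helper: choose component indices by eigenvalue rank."""
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    if n_components is None:
        return order  # keep everything
    if isinstance(n_components, int):
        return order[:n_components]  # keep the top n_components
    n_top, n_bottom = n_components
    # Slice from the end explicitly so that n_bottom == 0 yields no indices.
    bottom = order[len(order) - n_bottom:]
    return np.concatenate([order[:n_top], bottom])


eigenvalues = np.array([0.9, -0.4, 0.3, -0.1, 0.05])

top_two = select_indices(eigenvalues, 2)
top_two_bottom_one = select_indices(eigenvalues, (2, 1))
everything = select_indices(eigenvalues)
```

The sketch does not guard against a tuple whose entries sum to more than the number of available components, in which case the top and bottom selections would overlap.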