MultispatiPCA
- class multispaeti.MultispatiPCA(n_components=None, *, connectivity=None, center_sparse=False, use_gpu=False, random_state=None)
MULTISPATI-PCA
In contrast to principal component analysis (PCA), MULTISPATI-PCA does not maximize the variance explained by each component but rather the product of the variance and Moran’s I. This can lead to negative eigenvalues, namely in the case of negative spatial autocorrelation.
The problem is solved by diagonalizing the symmetric matrix \(H = \frac{1}{2n} X^t (W + W^t) X\), where \(X\) is the matrix of \(n\) observations \(\times\) \(d\) features and \(W\) is the matrix of connectivities between observations.
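The construction of \(H\) can be sketched with plain NumPy. This is an illustration of the formula only, not the library's implementation; the random data, the ring-shaped connectivity, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 30, 4  # n observations, d features
X = rng.normal(size=(n, d))
X = X - X.mean(axis=0)  # center each feature

# Hypothetical connectivity: every observation points to its next two
# successors on a ring, so W is deliberately NOT symmetric.
W = np.zeros((n, n))
idx = np.arange(n)
W[idx, (idx + 1) % n] = 1.0
W[idx, (idx + 2) % n] = 1.0
W = W / W.sum(axis=1, keepdims=True)  # row-normalize (assumption)

# H = 1/(2n) * X^t (W + W^t) X is symmetric even though W is not.
H = X.T @ (W + W.T) @ X / (2 * n)

# eigh returns eigenvalues in ascending order; flip to descending.
eigenvalues, eigenvectors = np.linalg.eigh(H)
eigenvalues = eigenvalues[::-1]
components = eigenvectors[:, ::-1].T  # one component per row, shape (d, d)
```

Because \(H\) is symmetric but not necessarily positive semi-definite, `eigenvalues` may contain negative entries, which is why both ends of the spectrum can be of interest.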
- Parameters:
n_components (int or tuple[int, int], optional) – Number of components to keep. If None, will keep all components. If an int, it will keep the top n_components. If a tuple, it will keep the top and bottom n_components, respectively.
connectivity (sparray or spmatrix) – Matrix of row-wise neighbor definitions, i.e. \(c_{ij}\) is the connectivity of i \(\to\) j. The matrix does not have to be symmetric. It can be a binary adjacency matrix or a matrix of connectivities, in which case \(c_{ij}\) should be larger the closer i and j are. A distance matrix should be converted to connectivities beforehand, e.g. by calculating \(1-d/d_{max}\).
center_sparse (bool) – Whether to center X if it is a sparse array. By default sparse X will not be centered as this requires transforming it to a dense array, potentially raising out-of-memory errors.
use_gpu (bool) – Whether to use the GPU implementation based on cupy and cupyx.scipy (not installed by default). Eigendecomposition on the GPU is not as mature yet. For sparse arrays, only min(n, d)-3 eigenvalues/-vectors can be calculated instead of min(n, d)-1 (which in most cases won’t be a problem). For dense arrays, all eigenvalues have to be calculated and subsequently subset.
random_state (int | RandomState | None) – Used when X is sparse and center_sparse is False, and for Moran’s I bound estimation. Pass an int for reproducible results across multiple function calls.
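The \(1-d/d_{max}\) conversion mentioned for the connectivity parameter can be sketched as follows. The coordinates are made up for illustration, and zeroing the diagonal (no self-connections) is an assumption rather than a documented requirement.

```python
import numpy as np
from scipy import sparse

# Hypothetical 2-D coordinates of five observations.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 3.0]])

# Pairwise Euclidean distances, shape (5, 5).
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff**2).sum(axis=-1))

# 1 - d / d_max: close pairs get values near 1, the farthest pair gets 0.
conn = 1.0 - dist / dist.max()
np.fill_diagonal(conn, 0.0)  # assumption: drop self-connections

# Store as a sparse matrix, the container type named for `connectivity`.
connectivity = sparse.csr_matrix(conn)
```

The resulting matrix is the kind of object that would be passed as `connectivity=` to the constructor shown in the class signature above.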
- components_
The estimated components: Array of shape (n_components, n_features).
- eigenvalues_
The eigenvalues corresponding to each of the selected components. Array of shape (n_components,).
- variance_
The estimated variance part of the eigenvalues. Array of shape (n_components,).
- moransI_
The estimated Moran’s I part of the eigenvalues. Array of shape (n_components,).
- mean_
Per-feature empirical mean, estimated from the training set if X is not sparse. Array of shape (n_features,).
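How an eigenvalue splits into a variance part and a Moran’s I part can be checked numerically. The sketch below assumes a row-normalized connectivity matrix (so the sum of all weights equals n) and uses made-up data; it illustrates the relationship described above and is not taken from the library's code.

```python
import numpy as np

rng = np.random.default_rng(1)

n, d = 40, 3
X = rng.normal(size=(n, d))
X = X - X.mean(axis=0)  # centered data, so component scores have zero mean

# Hypothetical row-normalized connectivity: two neighbors on a ring.
W = np.zeros((n, n))
idx = np.arange(n)
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

H = X.T @ (W + W.T) @ X / (2 * n)
eigenvalues, vectors = np.linalg.eigh(H)

scores = X @ vectors  # shape (n, d), one column per component
lag = W @ scores      # neighbor-averaged (spatially lagged) scores

# Variance of each component's scores.
variance = (scores**2).sum(axis=0) / n
# Moran's I of each component's scores for row-normalized W.
morans_i = (scores * lag).sum(axis=0) / (scores**2).sum(axis=0)

# The product of the two parts recovers the eigenvalues of H.
product = variance * morans_i
```

Under these assumptions `product` matches `eigenvalues` up to floating-point error, which is the sense in which each eigenvalue is the product of a variance and Moran’s I.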
Methods
Fit MULTISPATI-PCA projection.
Fit and transform the data using MULTISPATI-PCA projection.
Get output feature names for transformation.
Get metadata routing of this object.
Get parameters for this estimator.
Calculate the minimum and maximum bounds for Moran’s I given the connectivity, and its expected value given the number of observations.
Set output container.
Set the parameters of this estimator.
Transform the data using fitted MULTISPATI-PCA projection.
Transform the data using fitted MULTISPATI-PCA projection and calculate the spatial lag.
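For the Moran’s I bounds mentioned in the method list, one standard eigenvalue-based approach is sketched below. Whether the library computes its bounds this way is an assumption; the ring connectivity and all names are illustrative only. The expected value under no spatial autocorrelation, \(-1/(n-1)\), depends only on the number of observations.

```python
import numpy as np

n = 12
idx = np.arange(n)

# Hypothetical row-normalized connectivity: two neighbors on a ring.
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5

# Expected Moran's I under the null hypothesis of no autocorrelation.
expected = -1.0 / (n - 1)

# Doubly center the symmetrized connectivity and take its extreme eigenvalues.
C = np.eye(n) - np.ones((n, n)) / n  # centering matrix
S = C @ (W + W.T) @ C / 2
eig = np.linalg.eigvalsh(S)          # ascending order

scale = n / W.sum()                  # n / (sum of all weights)
i_min, i_max = scale * eig[0], scale * eig[-1]
```

On this small ring the bounds are not symmetric around zero, which shows why the attainable range of Moran’s I depends on the connectivity rather than always being [-1, 1].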