pilotpy.tl.wasserstein_distance
- pilotpy.tl.wasserstein_distance(adata, emb_matrix='X_PCA', clusters_col='cell_types', sample_col='sampleID', status='status', metric='cosine', regulizer=0.2, normalization=True, regularized='unreg', reg=0.1, res=0.01, steper=0.01, data_type='scRNA', return_sil_ari=False)
Calculate the Wasserstein (W) distance among samples using PCA representation and clustering information.
Parameters
- adata : AnnData
Loaded AnnData object containing the data.
- emb_matrix : str, optional
Name of the PCA representation of the data, by default ‘X_PCA’.
- clusters_col : str
Column name in the observation level of ‘adata’ that represents cell types or clustering.
- sample_col : str
Column name in the observation level of ‘adata’ that represents samples or patients.
- status : str
Column name in the observation level of ‘adata’ that represents status or disease, e.g., control/case.
- regulizer : float, optional
Hyper-parameter of a Dirichlet distribution for regularization, by default 0.2.
- metric : str, optional
Metric for calculating the cost matrix, by default ‘cosine’.
- regularized : str, optional
Selects regularized or unregularized optimal transport, by default ‘unreg’ (unregularized).
- reg : float, optional
Regularization parameter, used only when regularized optimal transport is selected, by default 0.1.
- res : float, optional
Resolution for Leiden clustering to achieve the desired cluster count, by default 0.01.
- steper : float, optional
Step size used when searching for the best Leiden resolution, by default 0.01.
- data_type : str, optional
Type of your data, e.g., ‘scRNA’ or ‘pathomics’, by default ‘scRNA’.
- return_sil_ari : bool, optional
Whether to return ARI (Adjusted Rand Index) or Silhouette score for assessing W distance effects, by default False.
Returns
- None
Calculates and stores the W distance among samples in the adata object.
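Example
The following is a minimal conceptual sketch of the kind of computation the parameters describe. It is not pilotpy's implementation. It assumes that each sample is summarized by its cluster proportions, that `regulizer` acts as a pseudo-count added to the cluster counts, that the cost matrix holds `metric` distances between cluster centroids in the embedding, and that the unregularized case is solved as a plain linear program. The helper names `cluster_proportions`, `transport_cost` and `sample_distances` are hypothetical and exist only for this illustration.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist


def cluster_proportions(labels, clusters, regulizer=0.2):
    """Share of each cluster in one sample, smoothed by a pseudo-count."""
    counts = np.array([np.sum(labels == c) for c in clusters], dtype=float)
    smoothed = counts + regulizer  # assumed role of the Dirichlet hyper-parameter
    return smoothed / smoothed.sum()


def transport_cost(p, q, cost):
    """Unregularized optimal transport cost between p and q (a linear program)."""
    n, m = cost.shape
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum_j T[i, j] == p[i]
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum_i T[i, j] == q[j]
    result = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([p, q]),
        bounds=(0, None),
        method="highs",
    )
    return result.fun


def sample_distances(embedding, cluster_labels, sample_labels,
                     metric="cosine", regulizer=0.2):
    """Pairwise transport cost between samples; returns (sample names, matrix)."""
    clusters = np.unique(cluster_labels)
    samples = np.unique(sample_labels)
    # Ground cost: distance between cluster centroids (an assumption of this sketch).
    centroids = np.array(
        [embedding[cluster_labels == c].mean(axis=0) for c in clusters]
    )
    cost = cdist(centroids, centroids, metric=metric)
    props = [
        cluster_proportions(cluster_labels[sample_labels == s], clusters, regulizer)
        for s in samples
    ]
    dist = np.zeros((len(samples), len(samples)))
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            dist[i, j] = dist[j, i] = transport_cost(props[i], props[j], cost)
    return samples, dist


# Toy data: three clusters in a 2-D embedding, three samples with different mixes.
rng = np.random.default_rng(0)
centers = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [-1.0, 0.0]}
mixes = {"s1": "AAAAAAAABC", "s2": "AAAAAAABBC", "s3": "ABCCCCCCCC"}

cluster_labels = np.array([c for mix in mixes.values() for c in mix])
sample_labels = np.array([s for s, mix in mixes.items() for _ in mix])
embedding = np.array([centers[c] for c in cluster_labels])
embedding = embedding + rng.normal(scale=0.05, size=embedding.shape)

samples, dist = sample_distances(embedding, cluster_labels, sample_labels)
print(samples)
print(np.round(dist, 3))
```

In this toy setup, s1 and s2 have similar cluster compositions, so their distance comes out much smaller than the distance between s1 and s3.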