pilotpy.tl.compute_diff_expressions
- pilotpy.tl.compute_diff_expressions(adata, cell_type: str = None, proportions: DataFrame = None, selected_genes: list = None, font_size: int = 18, group1: str = 'Tumor 1', group2: str = 'Tumor 2', label_name: str = 'Predicted_Labels', fc_thr: float = 0.5, pval_thr: float = 0.01, sample_col: str = 'sampleID', col_cell: str = 'cell_types', path=None, normalization=False, n_top_genes=2000, highly_variable_genes_=True, number_n=5, number_p=5, marker='o', color='w', markersize=8, font_weight_legend='normal', size_legend=12, figsize=(15, 15), dpi=100)
Uses the limma R package to test for differential expression. lmFit fits a linear model to each gene by weighted least squares, and comparisons between groups (log fold-changes) are obtained as contrasts of these fitted linear models. Standard errors are then smoothed by empirical Bayes, which shrinks standard errors that are much larger or smaller than those of other genes towards the average standard error.
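The shrinkage step can be illustrated with a small sketch. This is not limma's implementation: the prior variance `s0_sq` and prior degrees of freedom `d0` are fixed by hand here (hypothetical values), whereas limma estimates them from all genes. The function and variable names are illustrative only. The formula is the posterior variance that underlies limma's moderated t-statistic.

```python
def moderated_variance(s_sq, df, s0_sq, d0):
    """Shrink one gene's residual variance toward a prior variance.

    s_sq  : residual variance from the gene's linear model fit
    df    : residual degrees of freedom of that fit
    s0_sq : prior variance (assumed given; limma estimates it from all genes)
    d0    : prior degrees of freedom (assumed given; same caveat)
    """
    return (d0 * s0_sq + df * s_sq) / (d0 + df)


# Hypothetical per-gene residual variances, each with 4 residual degrees of freedom.
gene_variances = {"geneA": 0.25, "geneB": 1.0, "geneC": 4.0}
prior_variance, prior_df = 1.0, 4.0

shrunk = {
    gene: moderated_variance(s_sq, df=4.0, s0_sq=prior_variance, d0=prior_df)
    for gene, s_sq in gene_variances.items()
}
print(shrunk)
```

Variances far from the prior (geneA and geneC) move toward it, while a variance equal to the prior (geneB) is unchanged.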
Parameters
- adata: AnnData
Annotated data matrix.
- cell_type: str, optional
Name of the cell type whose differentially expressed genes are to be checked. The default is None.
- proportions: pd.DataFrame, optional
Cell types proportions in each sample. The default is None.
- selected_genes: list, optional
Names of the genes to be checked for differential expression. The default is None.
- font_size: int, optional
Font size for plot labels and legends. The default is 18.
- group1: str, optional
Name of the first patient sub-group for comparison. The default is ‘Tumor 1’.
- group2: str, optional
Name of the second patient sub-group for comparison. The default is ‘Tumor 2’.
- label_name: str, optional
Name of the column containing the labels of patient sub-groups. The default is ‘Predicted_Labels’.
- fc_thr: float, optional
Specify the fold change threshold. The default is 0.5.
- pval_thr: float, optional
Specify the adjusted p-value threshold. The default is 0.01.
- sample_col: str, optional
Name of the column containing sample IDs. The default is ‘sampleID’.
- col_cell: str, optional
Name of the column containing cell type annotations. The default is ‘cell_types’.
- path: str, optional
Path to save the results. The default is None.
- normalization: bool, optional
Perform gene expression normalization. The default is False.
- n_top_genes: int, optional
Number of top variable genes to consider. The default is 2000.
- highly_variable_genes_: bool, optional
Determine highly variable genes. The default is True.
- number_n: int, optional
Number of gene labels to show on the plot for the negative threshold. The default is 5.
- number_p: int, optional
Number of gene labels to show on the plot for the positive threshold. The default is 5.
- marker: str, optional
Marker style for the labels in the volcano plot. The default is ‘o’.
- color: str, optional
Marker color for the labels in the volcano plot. The default is ‘w’.
- markersize: int, optional
Marker size for the labels in the volcano plot. The default is 8.
- font_weight_legend: str, optional
Font weight for legend labels. The default is ‘normal’.
- size_legend: int, optional
Font size for legend labels. The default is 12.
- figsize: tuple, optional
Figure size. The default is (15, 15).
- dpi: int, optional
Dots per inch for the saved plot image. The default is 100.
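How `fc_thr`, `pval_thr`, `number_p` and `number_n` might interact can be sketched as follows. This is a hypothetical illustration, not the pilotpy source: the helper names, the use of strict versus non-strict inequalities, and the choice to rank labels by adjusted p-value are all assumptions, and the gene values are made up.

```python
def classify_genes(results, fc_thr=0.5, pval_thr=0.01):
    """Split genes into positive, negative, and non-significant lists.

    results maps gene name -> (fold_change, adjusted_p_value).
    """
    positive, negative, rest = [], [], []
    for gene, (fold_change, adj_p) in results.items():
        if adj_p < pval_thr and fold_change >= fc_thr:
            positive.append(gene)
        elif adj_p < pval_thr and fold_change <= -fc_thr:
            negative.append(gene)
        else:
            rest.append(gene)
    return positive, negative, rest


def top_labels(genes, results, n):
    """Pick the n genes with the smallest adjusted p-value to label (assumed ranking)."""
    return sorted(genes, key=lambda gene: results[gene][1])[:n]


# Made-up statistics: gene -> (fold change, adjusted p-value).
results = {
    "G1": (1.2, 0.001),
    "G2": (-0.8, 0.004),
    "G3": (0.3, 0.0001),  # significant p-value but below the fold-change threshold
    "G4": (0.9, 0.2),     # large fold change but not significant
    "G5": (2.0, 0.0005),
}

positive, negative, rest = classify_genes(results, fc_thr=0.5, pval_thr=0.01)
labels_positive = top_labels(positive, results, n=1)  # plays the role of number_p
labels_negative = top_labels(negative, results, n=1)  # plays the role of number_n
print(positive, negative, rest, labels_positive, labels_negative)
```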
Returns
None
Generates and displays a volcano plot of fold changes between the two patient sub-groups of interest. Saves a table of statistics for each gene. Saves the significantly differentially expressed genes of each group.
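A call might look like the sketch below. The keyword names and defaults come from the signature above; the `pl` import alias, the helper `run_differential_expression`, the cell type name and the output path are assumptions for illustration. `adata` and `proportions` are expected to have been prepared beforehand, and the function presumably needs a working R installation with limma.

```python
# Keyword arguments from the documented signature. Values are the documented
# defaults except `cell_type` and `path`, which are hypothetical placeholders.
params = {
    "cell_type": "Stromal",        # hypothetical cell type name
    "group1": "Tumor 1",
    "group2": "Tumor 2",
    "label_name": "Predicted_Labels",
    "fc_thr": 0.5,
    "pval_thr": 0.01,
    "sample_col": "sampleID",
    "col_cell": "cell_types",
    "path": "Results_PILOT/",      # hypothetical output directory
}


def run_differential_expression(adata, proportions):
    """Run the comparison; the volcano plot and tables are side effects."""
    # Imported lazily so this sketch can be loaded without pilotpy installed.
    import pilotpy as pl

    # Returns None according to the documentation above.
    pl.tl.compute_diff_expressions(adata, proportions=proportions, **params)
```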