pyclustree

Attributes

Functions

clustree(adata, cluster_keys[, title, ...])

Create a hierarchical clustering tree visualization to compare different clustering resolutions.

Package Contents

pyclustree.clustree(adata, cluster_keys, title=None, scatter_reference=None, node_colormap='tab20', node_color_gene=None, node_color_gene_use_raw=True, node_color_gene_transformer=None, node_size_range=(100, 1000), edge_width_range=(0.5, 5.0), edge_weight_threshold=0.0, x_spacing=2.5, y_spacing=1.25, order_clusters=True, show_colorbar=False, show_fraction=False, show_cluster_keys=True, graph_plot_kwargs=None)

Create a hierarchical clustering tree visualization to compare different clustering resolutions.

import scanpy as sc
from pyclustree import clustree

adata = sc.datasets.pbmc3k_processed()

# Run leiden clustering for different resolutions
for resolution in [0.2, 0.4, 0.6, 0.8, 1.0]:
    sc.tl.leiden(
        adata,
        resolution=resolution,
        flavor="igraph",
        n_iterations=2,
        key_added=f"leiden_{str(resolution).replace('.', '_')}",
    )

# Create a clustree visualization
fig = clustree(
    adata,
    [f"leiden_{str(resolution).replace('.', '_')}" for resolution in [0.2, 0.4, 0.6, 0.8, 1.0]],
)
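
The example above builds the same key strings twice, once for `key_added` and once for `cluster_keys`. A minimal sketch of deriving both from a single list so they cannot drift apart; the helper name `resolution_key` is hypothetical and not part of pyclustree:

```python
def resolution_key(resolution: float, prefix: str = "leiden") -> str:
    """Build an adata.obs column name such as 'leiden_0_2' from a resolution.

    Hypothetical helper for illustration; it only mirrors the f-string
    used in the example above.
    """
    return f"{prefix}_{str(resolution).replace('.', '_')}"


resolutions = [0.2, 0.4, 0.6, 0.8, 1.0]

# One list of keys, reusable for both key_added and cluster_keys
cluster_keys = [resolution_key(r) for r in resolutions]
print(cluster_keys)
```

The resulting `cluster_keys` list can then be passed directly as the second argument of `clustree`.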

Parameters:
  • adata (AnnData) – The AnnData object holding the cluster assignments, for example one produced with scanpy.

  • cluster_keys (list[str]) – The cluster keys to visualize, in hierarchical order (for example from lowest to highest resolution, as in the example above). Each key must be present in adata.obs.

  • title (str, optional) – The title of the plot. Defaults to None.

  • scatter_reference (str, optional) – The key in adata.obsm to use as a reference for the scatter plot. If None, the nodes will be placed in a hierarchical tree. Defaults to None.

  • node_colormap (Union[Colormap, str], optional) – The colormap used to color the nodes. If a list of colormaps is provided, the first colormap is used for the first clustering, the second for the second clustering, and so on. Within each clustering, the colors are scaled by the number of clusters. Defaults to 'tab20'.

  • node_color_gene (str, optional) – The gene to use for coloring the nodes. If provided, node colors will be based on the expression of this gene. If None, node colors will be based on the cluster key/level. Defaults to None.

  • node_color_gene_use_raw (bool, optional) – Whether to use the raw data for the gene expression if available. Defaults to True.

  • node_color_gene_transformer (Optional[callable], optional) – A function to transform the gene expression values to a single value for coloring the nodes. If None, the mean expression of the gene will be used. Defaults to None.

  • node_size_range (tuple[float, float], optional) – The range of node sizes to use. Defaults to (100, 1000).

  • edge_width_range (tuple[float, float], optional) – The range of edge widths to use. Defaults to (0.5, 5.0).

  • edge_weight_threshold (float, optional) – The threshold for edge weights to include in the visualization. Defaults to 0.0.

  • x_spacing (float, optional) – The horizontal spacing between nodes. Defaults to 2.5.

  • y_spacing (float, optional) – The vertical spacing between nodes. Defaults to 1.25.

  • order_clusters (bool, optional) – Whether to order the clusters based on the transition matrix. Defaults to True.

  • show_colorbar (bool, optional) – Whether to show the colorbar. Defaults to False.

  • show_fraction (bool, optional) – Whether to show the fraction of cells from the parent cluster that transitioned to the child cluster. Defaults to False.

  • show_cluster_keys (bool, optional) – Whether to show the cluster keys on the left side of the plot. Defaults to True.

  • graph_plot_kwargs (Optional[dict], optional) – Additional keyword arguments passed to nx.draw (networkx). These override the default arguments. Defaults to None.
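
The node_color_gene_transformer parameter is described only as a function that reduces gene expression values to a single value. The following is a sketch of two such reducers, assuming the callable receives the expression values of one cluster's cells as a sequence of numbers and returns one float; the exact input type is not stated above, and the function names are hypothetical:

```python
from statistics import median
from typing import Iterable


def median_expression(values: Iterable[float]) -> float:
    """Reduce per-cell expression values to their median."""
    return float(median(values))


def fraction_expressing(values: Iterable[float], threshold: float = 0.0) -> float:
    """Reduce per-cell expression values to the fraction of cells above a threshold."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v > threshold) / len(values)


# Made-up expression values for the cells of a single cluster
cluster_expression = [0.0, 0.0, 1.5, 2.0, 4.0]

print(median_expression(cluster_expression))
print(fraction_expressing(cluster_expression))
```

Under that assumption, such a function would be supplied as `node_color_gene_transformer=median_expression` together with `node_color_gene`. A median is less sensitive than the default mean to a few highly expressing cells.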

Returns:

The matplotlib figure object of the clustree visualization.

Return type:

plt.Figure
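
To make show_fraction and edge_weight_threshold more concrete, the sketch below computes, for two made-up clusterings of the same cells, the fraction of each parent cluster's cells that end up in each child cluster, and then drops weak edges. This illustrates the idea only and is not pyclustree's implementation; the function names, the use of fractions as edge weights, and the "keep if at or above the threshold" rule are assumptions:

```python
from collections import Counter


def transition_fractions(parent_labels, child_labels):
    """Return {(parent, child): fraction of the parent's cells found in child}."""
    if len(parent_labels) != len(child_labels):
        raise ValueError("Both clusterings must label the same cells.")
    pair_counts = Counter(zip(parent_labels, child_labels))
    parent_counts = Counter(parent_labels)
    return {
        (parent, child): count / parent_counts[parent]
        for (parent, child), count in pair_counts.items()
    }


def filter_edges(fractions, threshold):
    """Keep edges whose fraction is at or above the threshold (assumed rule)."""
    return {edge: frac for edge, frac in fractions.items() if frac >= threshold}


# Made-up labels for eight cells at a coarse and a finer resolution
parent = ["0", "0", "0", "0", "1", "1", "1", "1"]
child = ["0", "0", "0", "2", "1", "1", "2", "2"]

fractions = transition_fractions(parent, child)
strong_edges = filter_edges(fractions, threshold=0.3)

for (p, c), frac in sorted(fractions.items()):
    print(f"parent {p} -> child {c}: {frac:.2f}")
```

In this toy data, one of the four cells in parent cluster "0" moves to child cluster "2", so that edge carries a fraction of 0.25 and is removed by a threshold of 0.3.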
