Tuesday, June 30, 2026

CellChat on Spatial Data: A Step-by-Step Tutorial (With Real Errors Included)

If you've already got a clustered Visium object from Seurat (like the one we built in our Squidpy vs Seurat comparison), the natural next question is: which cell types are actually talking to each other, and does that change with physical distance? That's exactly what CellChat's spatial mode answers — and unlike our earlier posts, this one comes with three real errors we hit along the way, not just the happy path.

What CellChat Does

CellChat infers, analyzes, and visualizes cell-cell communication networks from single-cell or spatial transcriptomics data, using a curated ligand-receptor interaction database. The spatial mode adds physical distance as a constraint: two cell types might both express a matching ligand-receptor pair, but if they are never near each other in the tissue, CellChat will not report that pair as a likely interaction.

We ran this on the exact same dataset from our Seurat post — the official stxBrain mouse brain Visium demo (2,696 spots, 15 clusters) — so this is a direct continuation, not a new dataset.

The Real Pipeline (With Real Errors)

library(Seurat)
library(SeuratData)
library(CellChat)

# Load the anterior mouse brain Visium section, normalize, and cluster it
brain <- LoadData("stxBrain", type = "anterior1")
brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30, verbose = FALSE)
brain <- FindClusters(brain, verbose = FALSE)

# CellChat inputs: normalized expression, cluster labels, spot coordinates
# layer = "data" replaces the old slot = "data" (see Error 1)
data.input <- GetAssayData(brain, layer = "data", assay = "SCT")
# Prefix labels with "C" so no cluster is named "0" (see Error 2)
meta <- data.frame(labels = factor(paste0("C", Idents(brain))), row.names = names(Idents(brain)))
spatial.locs <- as.matrix(GetTissueCoordinates(brain)[, c("x","y")])

# ratio = 1 passes coordinates through without pixel-to-micrometer conversion
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels",
                            datatype = "spatial", coordinates = spatial.locs,
                            spatial.factors = data.frame(ratio = 1, tol = 5))
cellchat@DB <- CellChatDB.mouse
cellchat <- subsetData(cellchat)
# Needs presto unless do.fast = FALSE is passed (see Error 3)
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)

# Distance-constrained communication probabilities (the slow step)
cellchat <- computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1,
                               distance.use = TRUE, interaction.range = 250,
                               scale.distance = 0.01, contact.dependent = TRUE, contact.range = 100)
# Drop communication involving groups with fewer than 10 spots, then aggregate
cellchat <- filterCommunication(cellchat, min.cells = 10)
cellchat <- aggregateNet(cellchat)

The pipeline looks clean here because the fixes are already applied. The first time through, we hit three separate real errors:

Error 1: GetAssayData() with slot is defunct

Using slot = "data" (which is what most older tutorials still show) throws a hard error in current SeuratObject — it was deprecated in 5.0.0 and is now fully removed. Fix: use layer = "data" instead. If you're following any Seurat tutorial written before 2024, check for this.
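
If the same script has to run against both older and newer SeuratObject installs, one option is to pick the argument name from the installed version. The sketch below is an illustration, not part of the pipeline above: data_arg_for() and get_norm_data() are hypothetical helper names, and the 5.0.0 cutoff is simply the deprecation version mentioned above.

```r
# Pick the argument name GetAssayData() expects for a given SeuratObject version.
# Assumption: "layer" is accepted from 5.0.0 onward, "slot" before that.
data_arg_for <- function(version) {
  if (package_version(version) >= package_version("5.0.0")) "layer" else "slot"
}

# Hypothetical helper: fetch normalized data with whichever argument applies.
# SeuratObject is only touched when this function is actually called.
get_norm_data <- function(obj, assay = "SCT") {
  installed <- as.character(utils::packageVersion("SeuratObject"))
  args <- list(object = obj, assay = assay)
  args[[data_arg_for(installed)]] <- "data"
  do.call(SeuratObject::GetAssayData, args)
}
```

With that in place, get_norm_data(brain) would stand in for the GetAssayData() line in the pipeline. For a script that only ever targets a current install, writing layer = "data" directly is simpler.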

Error 2: CellChat rejects cluster label "0"

Error in setIdent(object, ident.use = group.by) :
  Cell labels cannot contain `0`!

Seurat's default cluster identities are numeric starting at 0, which is completely normal and fine for Seurat itself. But CellChat's internal identity handling breaks if any cluster is literally named "0". Fix: prefix cluster labels with a letter before handing them to CellChat, e.g. factor(paste0("C", Idents(brain))) turns cluster 0 into "C0". Easy to miss because the message says labels cannot "contain" 0, while the actual problem is a label that is exactly "0": prefixed labels such as "C0" and "C10" went through without complaint in the pipeline above.
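
The relabeling can be tried on its own without Seurat. A minimal sketch with a made-up toy factor standing in for Idents(brain); it also reuses the original level order, an optional extra that the pipeline above does not do:

```r
# Toy cluster identities standing in for Idents(brain); values are made up.
clusters <- factor(c("0", "1", "2", "10", "0"), levels = c("0", "1", "2", "10"))

# Prefix every label, and build the levels from the original ones so that
# "C10" stays after "C2" instead of being sorted alphabetically.
labels <- factor(paste0("C", clusters), levels = paste0("C", levels(clusters)))

levels(labels)
```

A plain factor(paste0("C", clusters)) works just as well for CellChat; it only sorts the levels alphabetically, which puts C10 ahead of C2 in plots and legends.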

Error 3: missing presto dependency

CellChat's identifyOverExpressedGenes() wants the presto package for a faster Wilcoxon test implementation and throws an error (not just a warning) if it's missing — unlike many R packages that silently fall back. Fix: devtools::install_github('immunogenomics/presto') before running CellChat, or pass do.fast = FALSE if you don't want the extra dependency.
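
To make a script survive either situation, the check can be done up front. A small sketch, assuming only the do.fast argument described above; find_overexpressed() is a hypothetical wrapper name:

```r
# TRUE only if presto is installed and loadable; returns FALSE instead of erroring.
has_presto <- requireNamespace("presto", quietly = TRUE)

# Hypothetical wrapper: use the fast Wilcoxon path only when presto is available.
# CellChat is only touched when this function is actually called.
find_overexpressed <- function(cellchat) {
  CellChat::identifyOverExpressedGenes(cellchat, do.fast = has_presto)
}
```

In the pipeline, cellchat <- find_overexpressed(cellchat) would then replace the identifyOverExpressedGenes() line.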

Real Results

On this dataset, the full pipeline (preprocessing + spatial-distance-constrained communication probability calculation) took ~360 seconds (6 minutes) on a standard machine — almost all of it in computeCommunProb(), which is the step that actually factors in spatial distance between every pair of spots.
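
To see where the time goes on your own data, each step can be wrapped in a timer. A generic base R sketch; time_step() is a hypothetical helper, and the toy expression stands in for a real pipeline call:

```r
# Evaluate an expression, report its wall-clock time, and return its value.
time_step <- function(label, expr) {
  elapsed <- system.time(result <- expr)[["elapsed"]]
  message(sprintf("%s: %.1f s", label, elapsed))
  result
}

# Toy usage; in the pipeline this could wrap computeCommunProb(cellchat, ...).
total <- time_step("toy step", sum(1:10))
```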

  • 13,110 significant ligand-receptor interactions identified across the 15 spatial clusters
  • Top signaling pathways by interaction count: Glutamate (4,787), GABA-A (1,361), LAMININ (874), WNT (538), COLLAGEN (489), GABA-B (396)
  • This makes biological sense without any cherry-picking: glutamate and GABA are the two dominant excitatory/inhibitory neurotransmitter systems in brain tissue, so seeing them as the top communication signals in a mouse brain section is exactly what you'd expect from a sane result, not a red flag

Figure: CellChat circle plot showing cell-cell communication strength between all 15 clusters in mouse brain Visium tissue. Aggregated communication network across all signaling pathways; edge thickness reflects interaction strength between cluster pairs. Real output from the pipeline above.

Figure: CellChat circle plot showing Glutamate signaling between clusters. Glutamate is the single largest signaling category in this dataset by interaction count.
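
The per-pathway counts listed above can be tallied from the interaction table. A minimal sketch, assuming a table shaped like the one subsetCommunication(cellchat) returns, with one row per inferred interaction and a pathway_name column; the toy rows below are made up so the snippet runs without CellChat:

```r
# Toy stand-in for subsetCommunication(cellchat); rows are made up.
net <- data.frame(
  source = c("C1", "C1", "C2", "C3", "C4"),
  target = c("C2", "C3", "C1", "C1", "C2"),
  pathway_name = c("Glutamate", "Glutamate", "GABA-A", "WNT", "Glutamate"),
  stringsAsFactors = FALSE
)

# Count interactions per pathway, largest first.
pathway_counts <- sort(table(net$pathway_name), decreasing = TRUE)
head(pathway_counts, 6)
```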

Why the Spatial Constraint Matters

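Without spatial information, CellChat would just look at co-expression across the whole dataset, the same way it does for dissociated single-cell data: any cluster pair with matching ligand-receptor expression gets flagged, regardless of whether the two are ever physically close in the tissue. In spatial mode, interactions only count if the cell types are within the specified interaction.range (250 here) or, for direct contact-dependent signaling, contact.range (100 here). As the CellChat documentation for computeCommunProb() describes it, distance.use = TRUE additionally uses the distance itself as a constraint on the communication probability, rather than only filtering out distant pairs. Both ranges are applied after the coordinates are scaled by the ratio in spatial.factors, so with ratio = 1 they are in raw coordinate units, not micrometers. For a tissue like brain where spatial organization is functionally meaningful, this is the difference between a biologically plausible result and a list of statistically-matched-but-physically-impossible interactions.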
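
To express the ranges in micrometers instead, the CellChat spatial tutorial derives the ratio from the Visium scale factors. The sketch below illustrates that unit conversion and a plain range cutoff on toy coordinates. It is not CellChat's internal implementation, and spot_diameter_fullres is a hypothetical value that would normally come from the sample's scalefactors_json.json.

```r
# 65 um is the theoretical Visium spot size used in the CellChat tutorial.
spot_size_um <- 65
# Hypothetical value; read the real one from scalefactors_json.json.
spot_diameter_fullres <- 89.4

# Micrometers per full-resolution pixel, and the tolerance the tutorial suggests.
ratio <- spot_size_um / spot_diameter_fullres
spatial_factors <- data.frame(ratio = ratio, tol = spot_size_um / 2)

# Toy spot coordinates in full-resolution pixels (made up).
coords <- matrix(c(0, 0,
                   100, 0,
                   0, 500),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("s1", "s2", "s3"), c("x", "y")))

# Pairwise distances converted to micrometers, then a 250 um cutoff.
dist_um <- as.matrix(dist(coords)) * ratio
within_range <- dist_um <= 250
within_range
```

Here s1 and s2 fall inside the 250 µm range while s3 is too far from both. With ratio = 1, the same cutoff would be compared against pixel distances directly.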

Bottom Line

CellChat's spatial mode works, and the three errors above are exactly the kind of thing that wastes an afternoon if you don't know they're coming. If you're already running Seurat on Visium data, adding CellChat on top is a relatively small additional step — the real cost is the ~6 minute runtime for computeCommunProb() on a dataset this size, which will scale up on larger tissue sections.


Last updated: July 2026. Tested with CellChat (jinworks/CellChat, GitHub HEAD), Seurat 5.5.1, R 4.6.1, SeuratData stxBrain 0.1.2.
