Tuesday, June 30, 2026

Spatially Variable Genes with Squidpy: Moran's I on Mouse Brain Visium Data

One of the first questions you ask after clustering a Visium dataset is: which genes actually vary across space? Not just between clusters — but continuously, across the tissue. That's what spatially variable gene (SVG) analysis answers.

In this post we run Moran's I via Squidpy on the 10x Genomics mouse brain Visium demo dataset and look at what comes out.

What is Moran's I?

Moran's I is a spatial autocorrelation statistic. For a given gene, it asks: are spots with high expression near other spots with high expression? A score near 1 means strong spatial clustering. Near 0 means random. Near -1 means a checkerboard pattern (rare in transcriptomics).

The formula is:

I = (N / W) * (Σᵢ Σⱼ wᵢⱼ(xᵢ - x̄)(xⱼ - x̄)) / Σᵢ(xᵢ - x̄)²

Where N is the number of spots, wᵢⱼ is the spatial weight between spots i and j (1 if neighbors, 0 otherwise), and x is normalized expression.

Squidpy wraps this with permutation-based p-values so you get significance alongside the score.

The setup

Dataset: visium_hne_adata() from Squidpy's built-in demos — 2,688 spots, 18,078 genes, mouse brain coronal section.

We filtered to the top 500 highly variable genes (HVGs) before running Moran's I. Running on all 18k genes is possible but slow; HVG filtering keeps the analysis focused on genes that vary meaningfully across the dataset.

import squidpy as sq
import scanpy as sc

adata = sc.read_h5ad("visium_hne_adata.h5ad")

# Filter to top 500 HVGs
sc.pp.highly_variable_genes(adata, n_top_genes=500, flavor="cell_ranger")
adata = adata[:, adata.var.highly_variable].copy()

# Normalize
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)

# Build spatial graph
sq.gr.spatial_neighbors(adata)

# Run Moran's I with permutation testing
if __name__ == '__main__':
    sq.gr.spatial_autocorr(adata, mode="moran", n_perms=200, n_jobs=1)

svg_results = adata.uns["moranI"]
top20 = svg_results.sort_values("I", ascending=False).head(20)

One note on if __name__ == '__main__': — this guard is required on Windows because Squidpy's permutation step uses multiprocessing. Without it you get a RuntimeError about the bootstrapping phase. On Linux/Mac it's not needed.

Results: top 20 SVGs

All 500 tested genes returned p-value = 0.0 after FDR correction (Benjamini-Hochberg). That's expected when the spatial signal is this strong — the permutation test rarely produces a null distribution that overlaps with the observed Moran's I. The ranking by I score is what matters here.

RankGeneMoran's IKnown function
1Itpka0.674IP3 kinase; enriched in hippocampus and cortical layers
2Fezf10.634Transcription factor; deep cortical layer identity (L5/L6)
3Baiap30.622Synaptic scaffolding; expressed in specific subcortical regions
4Shox20.611Thalamic and hypothalamic marker
5Slc30a30.600Zinc transporter; cortex-specific, enriched in excitatory neurons
6Gal0.590Galanin neuropeptide; subcortical and hypothalamic
7Arhgap360.590RhoGAP family; cerebellum and thalamus enriched
8Foxg10.587Forebrain development TF; restricted to cortex and hippocampus
9Tbr10.584Layer 6 cortical excitatory neuron marker
10Icam50.580Telencephalin; forebrain-restricted adhesion molecule
11Kcnh30.564Voltage-gated K+ channel; cortical expression
12Nov0.561CCN3; expressed in specific cortical and subcortical layers
13Rprml0.555Reprimo-like; limited expression data, spatially restricted here
14Crym0.544mu-crystallin; striatum and layer 6 cortex marker
15Epop0.543Elongin BC and Polycomb repressive complex protein
16Kcnj40.543Inward-rectifier K+ channel; subcortical enriched
17AW5519840.540Uncharacterized locus; strong spatial signal here
18Otp0.527Orthopedia homeobox; hypothalamic nuclei marker
19Igfbp60.525IGF binding protein; meningeal and vascular enriched
20Lamp50.523Interneuron subtype marker; layer 1-2 cortex

What the results tell us

The top SVGs split into recognizable categories:

Cortical layer markers — Fezf1, Tbr1, Foxg1, Slc30a3, Crym, Lamp5. These genes define cortical laminar identity. Moran's I picks them up because cortical layers are spatially coherent bands — spots in layer 6 cluster together, and Tbr1 expression clusters with them.

Subcortical region markers — Shox2, Gal, Otp, Arhgap36. These are thalamic or hypothalamic markers. A coronal section at a mid-brain level captures both cortex and subcortical structures, and genes that are on/off between these compartments will score high on spatial autocorrelation.

Synaptic/ion channel genes — Itpka, Kcnh3, Kcnj4, Icam5. These are expressed in specific neuron subtypes that cluster spatially. Itpka topping the list is consistent with its known hippocampal enrichment — the hippocampus occupies a spatially coherent arc in these sections.

The uncharacterized locus AW551984 at rank 17 is interesting. A high Moran's I on a poorly annotated gene is a reasonable starting point for further investigation — it's spatially structured, which at minimum means it's not noise.

A note on HVG pre-filtering

We filtered to 500 HVGs before running Moran's I. This is a practical choice, not a methodological requirement. Running on all genes would likely surface similar top hits but would take proportionally longer. The tradeoff: genes that are spatially variable but not highly variable (e.g., lowly expressed region-specific markers) could be missed. For a complete SVG screen, running on all expressed genes with a minimum count filter is more thorough.

What's next

The natural follow-up is to visualize where these top SVGs are expressed on the tissue — Squidpy's sq.pl.spatial_scatter does this in a few lines. That'll be the next post.

The full code for this analysis is at github.com/lociven/spatiabio-tutorials.

Get the complete pack

Squidpy Complete Analysis Pack — 10 Notebooks

All 10 notebooks from this series in one download — SVGs, neighborhood enrichment, co-occurrence, ligand-receptor, and the complete pipeline. Verified and ready to run on your own data.

Get it for $19 →

No comments:

Post a Comment

I Gave Claude Science One Prompt. It Ran a Full Spatial Analysis.

By Lociven · SpatiaBio · July 2, 2026 I gave Claude Science AI Workbench — Anthropic's new scientific analysis platform — a single...