Phenocoder
A machine-learning framework that combines conditional variational autoencoders with spatial graph analysis to learn unsupervised phenotypic embeddings of complex tissue architectures from microscopy images.
Phenocoder learns compressed morphological representations of cells/nuclei directly from
microscopy images using (conditional) convolutional variational autoencoders, then analyzes
how those representations are organized in space. It is built around the
SpatialData ecosystem: images, segmentation labels and
per-object tables live in a single SpatialData object, and the Phenocoder
class drives the full workflow on top of it.
The workflow at a glance
generate_dataset(): extract image patches centered on each segmented object and write them (plus per-channel intensity statistics) to disk.
initialize_model(): build a CVAE or conditional CondCVAE and the train/validation data generators.
train(): fit the model with early stopping, learning-rate scheduling and TensorBoard logging.
encode(): embed every object into the learned latent space, optionally smoothing the latents over each object’s spatial neighborhood (message passing).
spatialgraph_stats(): compute spatial neighborhood-graph statistics per sample (or per spatial subunit) from clustered latents.
spatialgraph_embedding(): embed the per-sample/per-subunit statistics (PCA + UMAP, with optional batch correction) for sample-level comparison.
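The optional latent smoothing mentioned for encode() can be pictured with a small, library-free sketch. It assumes a plain mean over all objects within a fixed physical radius, with each object counted as its own neighbor. The function name smooth_latents, the radius parameter and the choice of an unweighted mean are illustrative assumptions, not Phenocoder's actual API or aggregation rule.

```python
import math


def smooth_latents(coords, latents, radius):
    """Replace each latent vector by the mean over objects within `radius`.

    Hypothetical helper for illustration only: the real aggregation used by
    Phenocoder (weights, self-inclusion, graph construction) may differ.
    """
    smoothed = []
    for i, center in enumerate(coords):
        # Distance 0 <= radius, so every object is included in its own neighborhood.
        neighbors = [
            j for j, other in enumerate(coords)
            if math.dist(center, other) <= radius
        ]
        dim = len(latents[i])
        mean = [
            sum(latents[j][k] for j in neighbors) / len(neighbors)
            for k in range(dim)
        ]
        smoothed.append(mean)
    return smoothed


# Three objects: two close together, one far away.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
latents = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]
smoothed = smooth_latents(coords, latents, radius=2.0)
```

In this toy example the two nearby objects end up with the same averaged latent, while the isolated object keeps its own. A larger radius trades per-object detail for more spatial context.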
Features
Convolutional VAE (CVAE): encode multi-channel image patches into a compact latent space.
Conditional VAE (CondCVAE): condition the encoder/decoder on one or more metadata columns (e.g. dataset, z-slice, donor) via one-hot encoding.
SpatialData-native: works directly on SpatialData images, labels and tables.
Flexible patch extraction: configurable patch size, 2D or per-z-slice 3D sampling, and global or per-sample intensity normalization.
Spatial message passing: aggregate latents over a physical-distance neighborhood graph.
Spatial graph analysis: interaction matrices, Moran’s I, centrality, connectivity and convex-hull statistics at sample or subunit resolution.
Beta-VAE support: tune the KL weight (beta) for more disentangled representations.
Built on Keras 3 with the TensorFlow backend.
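To make the role of the KL weight concrete, here is a minimal sketch of a beta-weighted VAE objective for a single object, written in plain Python. It uses the standard closed-form KL divergence between a diagonal Gaussian and a standard normal prior. The function names, the scalar reconstruction_error argument and the per-object (unbatched) form are assumptions for illustration; how Phenocoder computes and reduces its loss is not stated here.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def beta_vae_loss(reconstruction_error, mu, log_var, beta=1.0):
    """Reconstruction error plus beta times the KL term (hypothetical helper)."""
    return reconstruction_error + beta * kl_to_standard_normal(mu, log_var)


# Toy encoder output for one object with a 2-dimensional latent space.
mu = [0.5, -1.0]
log_var = [0.0, 0.0]

kl = kl_to_standard_normal(mu, log_var)
plain_vae = beta_vae_loss(2.0, mu, log_var, beta=1.0)
stronger_kl = beta_vae_loss(2.0, mu, log_var, beta=4.0)
```

With beta above 1 the same encoder output is penalized more heavily for deviating from the prior, which is the pressure that tends to encourage more disentangled latents, usually at some cost in reconstruction quality.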
Getting started