Studying proteins can mean dealing with thousands of molecules, each seen from a different angle. It’s similar to analyzing galaxy images where every object is oriented differently.
Scientists analyzing images in fields from molecular biology to astronomy face this same frustrating challenge: the meaningful features of objects remain constant, but their arbitrary positions and orientations obscure the patterns. A protein's structure doesn't change when it rotates. A galaxy's spiral arms remain the same whether it appears at the top or bottom of a frame.
Until now, extracting these core features has required either labor-intensive manual alignment or computationally expensive algorithms that bog down under the weight of massive datasets. In cryo-electron microscopy alone, a single experimental session can generate thousands of images, each containing hundreds of randomly oriented particle projections.
Teaching Machines to Look Past the Noise
A team of researchers has developed CODAE—short for centroid- and orientation-aware disentangling autoencoder—a neural network that learns to separate what matters from where things happen to be. The system teaches itself to identify an object's position and rotation, then learns features independent of those attributes.
The architecture combines translation- and rotation-equivariant encoding with what the researchers call "image moments"—mathematical descriptions that capture an object's center of mass and orientation. Think of image moments as the computational equivalent of balancing a shape on your fingertip to find its center, then noting which way it's pointing.
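Image moments are computed directly from pixel intensities. As a rough illustration (a minimal numpy sketch, not the paper's implementation), the centroid comes from first-order moments and the principal orientation from second-order central moments:

```python
import numpy as np

def centroid_and_orientation(img):
    """Centroid from first-order image moments; orientation angle
    from second-order central moments."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)
    m00 = img.sum()                                # total "mass"
    cx = (img * xs).sum() / m00                    # centroid x
    cy = (img * ys).sum() / m00                    # centroid y
    # second-order central moments about the centroid
    mu20 = (img * (xs - cx) ** 2).sum() / m00
    mu02 = (img * (ys - cy) ** 2).sum() / m00
    mu11 = (img * (xs - cx) * (ys - cy)).sum() / m00
    # angle of the principal axis
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta
```

For a horizontal bar of pixels, this returns the bar's center and an angle of zero, matching the fingertip-balancing intuition above.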
CODAE splits its learning into two branches. One extracts position and orientation. The other learns everything else: shape, size, brightness, structural variations. Because the branches are separate, the network can produce two reconstructions at once: a canonical version in which every object is aligned to the same position and orientation, and the exact view presented in the original input.
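The two-branch idea can be sketched in a few lines. This toy version (shapes and names are illustrative, not the authors' code) uses a small patch as a stand-in for decoded content and a simple shift as the pose transform:

```python
import numpy as np

def two_branch_decode(content, pose, canvas=9):
    """Toy two-branch reconstruction: the content branch yields a
    canonical, centred image; the pose branch (dx, dy) shifts it back
    to the view seen in the input."""
    k = content.shape[0]
    canonical = np.zeros((canvas, canvas))
    c = (canvas - k) // 2
    canonical[c:c + k, c:c + k] = content      # aligned reconstruction
    dx, dy = pose
    # re-apply the pose to recover the original, unaligned view
    posed = np.roll(np.roll(canonical, dy, axis=0), dx, axis=1)
    return canonical, posed
```

The canonical output is what makes downstream comparison easy: every particle or galaxy lands in the same place, so their content features can be compared directly.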
Three Domains, One Solution
The researchers tested CODAE on datasets spanning three scientific fields, each with distinct challenges.
In life sciences, they applied it to simulated cryo-electron microscopy images of GroEL protein particles. These projections appear at low signal-to-noise ratios with unknown three-dimensional orientations. CODAE successfully learned not just the particles' positions and in-plane rotations, but also their out-of-plane orientations—essentially different viewing angles of the same three-dimensional molecule. This capability could streamline the alignment process required for high-resolution three-dimensional reconstruction.
Material science presented a different test case: experimental graphene convergent beam electron diffraction patterns from four-dimensional scanning transmission electron microscopy. Here, position learning wasn't necessary—the patterns arrive pre-centered—but rotational alignment remained crucial. CODAE identified orientation alongside latent features correlated with the components and magnitude of the patterns' center of mass, revealing information beyond what traditional center-of-mass methods capture.
The astronomy dataset proved most visually striking. Working with 314,000 galaxy images from the Galaxy Zoo project, CODAE learned to distinguish size, color, shape, separation between multiple galaxies in a frame, and background characteristics. The system handles cases with two galaxies in a single image more reliably than competing approaches.
Speed and Precision
Computational efficiency separates CODAE from earlier methods. Where previous approaches apply spatial transformations within the latent space—requiring expensive grid sampling at every step—CODAE applies transformations only to the decoder's output. This seemingly small architectural change yields dramatic results: training runs at least ten times faster than comparable models, with inference eight times quicker.
The researchers compared CODAE against five other models using synthetic datasets with known ground-truth labels, measuring eight different metrics that quantify how well a system learns separate, independent features. CODAE outperformed all competitors across both test datasets, achieving what the field calls "state-of-the-art" disentanglement scores.
On the XYRCS synthetic dataset—containing circles, triangles, and rectangles with varying positions, orientations, and brightness—CODAE perfectly aligned all objects. Target-VAE performed comparably when all features were considered, but struggled with certain shapes. Spatial-VAE matched CODAE's performance on color and shape features specifically. Other models failed to maintain alignment consistency.
The dSprites dataset, with its squares, ellipses, and hearts, revealed sharper performance gaps. CODAE aligned images while preserving scale and shape; competing models stumbled on heart-shaped objects, and only target-VAE managed partial success.
When Details Disappear
The system isn't perfect. Testing on real galaxy images revealed a smoothing effect that sacrifices fine structural detail. Complex features get simplified in the reconstruction process.
But this apparent limitation might offer unexpected benefits. For certain astronomical analyses, approximating symmetric light profiles matters more than preserving every structural intricacy. The smoothing could actually help in those contexts.
The researchers acknowledge another constraint: CODAE currently handles only translation and rotation. Extending the method to include depth and scale variations would broaden its applicability across more scientific domains.
Looking Forward
The Vera Rubin Observatory's upcoming Legacy Survey of Space and Time will generate photometric images of at least ten billion galaxies. Material science facilities equipped with high-speed electron detectors already collect four-dimensional scanning transmission electron microscopy datasets containing tens of thousands of diffraction patterns per session. Cryo-electron microscopy continues producing massive volumes of projection images.
CODAE offers a path toward automated analysis of these datasets. Beyond reconstruction quality, the approach enables quantitative measurements scientists care about: in-plane orientation in cryo-EM experiments, distances between galaxies, crystalline orientation in materials.
The team tested their approach on publicly available data to ensure reproducibility. They released their code under an open-source license, inviting other researchers to adapt the method to new domains.
What makes this work notable isn't just technical performance. It's the demonstration that a single architectural approach can learn meaningful representations across radically different scientific imaging contexts—proteins, crystals, and galaxies—without domain-specific tweaking. The same principles that help align molecules also organize astronomical observations.
Whether CODAE proves transformative or becomes one tool among many depends on how it performs under the messy conditions of real-world science. Noisy data, incomplete information, edge cases that break assumptions. Those tests await.
For now, the method exists. It works. And it suggests that teaching machines to ignore irrelevant variation might be simpler than we thought.
Credit & Disclaimer: This article is a popular science summary written to make peer-reviewed research accessible to a broad audience. All scientific facts, findings, and conclusions presented here are drawn directly and accurately from the original research paper. Readers are strongly encouraged to consult the full research article for complete data, methodologies, and scientific detail. The article can be accessed through https://doi.org/10.1038/s42256-024-00978-5






