Nearly every cell in your body carries the same DNA, yet different types of cells, such as brain cells and skin cells, use different parts of it. This is partly because DNA folds into different 3D shapes in each cell type, and those shapes help determine which genes are turned on or off.
Scientists at MIT have developed a new way to determine these 3D DNA structures using generative artificial intelligence (AI). Their method can predict thousands of structures in just minutes, much faster than current lab techniques. This could help researchers better understand how DNA folding affects gene activity in different types of cells.
“Our goal was to predict the 3D structure of DNA just from its sequence,” says MIT chemistry professor Bin Zhang, the senior author of the study. “Now that we can do that as well as advanced lab techniques, it opens up exciting new possibilities.”
MIT graduate students Greg Schuette and Zhuohan Lao are the lead authors of the paper, which was published in Science Advances.
DNA Folding: How Cells Fit 2 Meters of DNA Into a Tiny Nucleus
Inside each cell, DNA wraps around proteins called histones, forming a structure that looks like beads on a string. This string then folds further, across several levels of organization, which allows nearly 2 meters of DNA to fit inside a nucleus that’s only about 0.01 millimeters in diameter!
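To get a feel for that scale, here is a quick back-of-the-envelope calculation using only the two round numbers quoted above. The variable names are just for illustration.

```python
# Round figures from the article: about 2 meters of DNA per cell,
# and a nucleus about 0.01 millimeters (10 micrometers) across.
dna_length_m = 2.0
nucleus_diameter_m = 0.01e-3  # 0.01 mm expressed in meters

# How many times longer is the DNA than the nucleus is wide?
ratio = dna_length_m / nucleus_diameter_m
print(f"DNA is roughly {ratio:,.0f} times longer than the nucleus is wide")
```

In other words, the DNA is about 200,000 times longer than the width of the compartment that holds it, which is why the folding has to be so tightly organized.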
Special chemical tags, known as epigenetic modifications, attach to DNA and influence how it folds. These tags vary between cell types and play a big role in determining which genes are active.
For the past 20 years, scientists have used lab techniques like Hi-C to study DNA folding. Hi-C works by linking pieces of DNA that are close together, then cutting the DNA into small parts and sequencing it. This method helps scientists map out which DNA segments are near each other in 3D space. However, these experiments are labor-intensive: generating data from a single cell can take about a week.
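The idea of a map of "which segments are near each other" can be illustrated with a small sketch. This is not how Hi-C data is processed (Hi-C infers contacts from sequenced DNA fragments, not from known coordinates). It simply assumes we already have 3D positions for a few segments and marks every pair that lies within a chosen distance. The function name `contact_map`, the `cutoff` value, and the toy coordinates are all made up for illustration.

```python
import math

def contact_map(coords, cutoff):
    """Return a symmetric 0/1 matrix: 1 where two different segments lie within `cutoff`."""
    n = len(coords)
    contacts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance between segment i and segment j
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts[i][j] = contacts[j][i] = 1
    return contacts

# Hypothetical toy chain of five segments that bends back toward its start.
toy_coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0)]
cmap = contact_map(toy_coords, cutoff=1.0)
for row in cmap:
    print(row)
```

In this toy chain, segments 1 and 4 are far apart along the chain but close in space, so they show up as a contact. Experiments like Hi-C are designed to detect this kind of long-range pairing.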
AI Makes DNA Structure Predictions Faster and Easier
To speed up the process, Zhang and his students developed an AI-based tool called ChromoGen. This tool uses deep learning, a type of AI that recognizes patterns, to analyze DNA sequences and predict their 3D structures in a cell.
“Deep learning is great at finding patterns in large amounts of data,” Zhang explains. “It helps us understand what’s important in the DNA sequence and how it influences 3D folding.”
ChromoGen has two main parts:
- A deep learning model that reads the DNA sequence together with chromatin accessibility data, which is widely available and specific to each cell type.
- A generative AI model that predicts realistic 3D shapes of chromatin (the complex of DNA and proteins). This model was trained using over 11 million chromatin structures from lab experiments.
By using AI, scientists can now study DNA folding much faster and more efficiently than before. This breakthrough could lead to new discoveries about how gene activity is controlled in different cells, helping us better understand health and disease.
Press release: Massachusetts Institute of Technology