The Brave New World of Using DNA to Store Data
Netscape co-founder-turned-venture capitalist billionaire investor Marc Andreessen once posited that software was eating the world. He was right, and the takeover of software resulted in many things. One of them is data. Lots and lots and lots of data. In the previous two years, humanity created more data than it did during its entire existence combined, and the amount will only increase. Think about it: The hundreds of 50KB emails you write a day, the dozens of 10MB photos, the minute-long, 350MB 4K video you shoot on your iPhone X add up to vast quantities of information. All that information needs to be stored. And that's becoming an issue as data volume outpaces storage space.
"There won't be enough silicon to store all the data we need. It's unlikely that we can make flash memory smaller. We have reached the physical limits," Victor Zhirnov, chief scientist at the Semiconductor Research Corporation, says. "We are facing a crisis that's comparable to the oil crisis in the 1970s. By 2050, we're going to need to store 10 to the 30 bits, compared to 10 to the 23 bits in 2016." That amount of storage space is equivalent to each of the world's seven billion people owning roughly 70 million iPhone Xs with 256GB of storage each (10 to the 30 bits, divided by seven billion people and by the roughly two trillion bits a 256GB phone holds).
The race is on to find another medium capable of storing massive amounts of information in as small a space as possible. Zhirnov and other scientists are looking at the human body, looking to DNA. "Nature has nailed it," Luis Ceze, a professor in the Department of Computer Science and Engineering at the University of Washington, says. "DNA is a molecular storage medium that is remarkable. It's incredibly dense, many, many thousands of times denser than the densest technology that we have today. And DNA is remarkably general. Any information you can map in bits you can store in DNA." It's so dense -- able to store a theoretical maximum of 215 petabytes (215 million gigabytes) in a single gram -- that all the data ever produced could be stored in the back of a tractor trailer truck.
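Ceze's point that anything expressible in bits can be stored in DNA can be made concrete with a small sketch. The mapping below (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen purely for illustration; the article does not say which encoding any of the research groups actually use, and real schemes add error correction and other constraints.

```python
# Illustrative only: a hypothetical 2-bits-per-base mapping, not any lab's scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so the two-letter message "Hi" needs only eight nucleotides.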
Writing DNA can be an energy-efficient process, too. Consider how the human body is constantly writing and rewriting DNA, and does so on a couple thousand calories a day. And all it needs for storage is a cool, dark place, a significant energy savings when compared to server farms that require huge amounts of energy to run and even more energy to cool.
Researchers first succeeded in encoding data onto DNA in 2012, when Harvard University geneticists George Church and Sri Kosuri wrote a 52,000-word book on A, C, G, and T base pairs. Their method only produced 1.28 petabytes per gram of DNA, however, a volume exceeded the next year when a group encoded all 154 Shakespeare sonnets and a 26-second clip of Martin Luther King's "I Have A Dream" speech. In 2017, Columbia University researchers Yaniv Erlich and Dina Zielinski made the process 60 percent more efficient.
The limiting factor today is cost. Erlich said the work his team did cost $7,000 to encode and decode two megabytes of data. To become useful in a widespread way, the price per megabyte needs to plummet. Even advocates concede this point. "Of course it is expensive," Zhirnov says. "But look how much magnetic storage cost in the 1980s. What you store today in your iPhone for virtually nothing would cost many millions of dollars in 1982." There's reason to think the price will continue to fall. Genome readers are improving, getting cheaper, faster, and smaller, and genome sequencing becomes cheaper every year, too. Picture it: tiny specks of inert DNA made from silicon or another material, stored in cool, dark, dry areas, preserved for all time.
Plus, DNA has another advantage over more traditional forms of storage: It's very easy to reproduce. "If you want a second copy of a hard disk drive, you need components for a disk drive, hook both drives up to a computer, and copy. That's a pain," Nick Goldman, a researcher at the European Bioinformatics Institute, says. "DNA, once you have that first sample, it's a process that is absolutely routine in thousands of laboratories around the world to multiply that using polymerase chain reaction [which uses temperature changes or other processes]. It just takes a few minutes to double a sample. A few more minutes, you double it again. Very quickly, you have thousands or millions of new copies."
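The doubling Goldman describes is simple exponential growth, which a few lines of arithmetic can illustrate. The sketch below assumes an idealized reaction in which every cycle exactly doubles the sample; real polymerase chain reactions fall short of perfect doubling, and the function names are hypothetical.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies present after a number of ideal doubling cycles."""
    return starting_copies * 2 ** cycles


def cycles_needed(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal doubling cycles that reaches the target."""
    cycles = 0
    copies = starting_copies
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(copies_after(10))          # 1024
    print(cycles_needed(1_000_000))  # 20
```

Under that idealized assumption, ten doublings turn one copy into about a thousand, and twenty turn it into about a million.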
This ability to duplicate quickly and easily is a positive trait. But, of course, there's also the potential for danger. Does encoding on DNA, the very basis for life, present ethical issues? Could it get out of control and fundamentally alter life as we know it?
The chance is there, but it's remote. The first reason is that storage could be done with only two base pairs, which would serve as replacements for the 0 and 1 digits that make up all digital data. While doing so would decrease the possible density of the storage, it would virtually eliminate the risk that the sequences would be compatible with life.
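The density trade-off follows from how much information one position in the strand can carry: with two possible letters, each position holds one bit; with four, it holds two. The short sketch below works through that arithmetic. It ignores indexing tags and error correction, and the helper names are hypothetical.

```python
import math


def bits_per_base(alphabet_size: int) -> float:
    """Information one nucleotide position can carry, in bits."""
    return math.log2(alphabet_size)


def bases_needed(num_bytes: int, alphabet_size: int) -> int:
    """Nucleotides required to hold num_bytes of raw data, with no overhead."""
    return math.ceil(num_bytes * 8 / bits_per_base(alphabet_size))


if __name__ == "__main__":
    two_megabytes = 2_000_000
    print(bases_needed(two_megabytes, 4))  # 8000000
    print(bases_needed(two_megabytes, 2))  # 16000000
```

Restricting the alphabet to two letters doubles the amount of DNA needed for the same file.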
But even if scientists and researchers choose to use four base pairs, other safeguards are in place that will prevent trouble. According to Ceze, the computer science professor, the snippets of DNA that they write are very short, around 150 nucleotides. This includes the title, the information that's being encoded, and tags to help organize where the snippet should fall in the larger sequence. Furthermore, they generally avoid repeated letters, which dramatically reduces the chance that a protein could be synthesized from the snippet.
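One way to guarantee that no letter ever appears twice in a row is a rotating code: convert the data to base-3 digits, then let each digit choose among the three letters that differ from the previous one. The article does not say which scheme Ceze's group uses, so the sketch below is an illustration of the general idea under that assumption, with hypothetical function names and an arbitrary starting letter.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729, enough to cover the 256 possible byte values


def bytes_to_trits(data: bytes) -> list:
    """Write each byte as six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def encode_no_repeats(data: bytes, previous: str = "A") -> str:
    """Each base-3 digit picks one of the three letters unlike the last one."""
    strand = []
    for trit in bytes_to_trits(data):
        choices = [base for base in BASES if base != previous]
        previous = choices[trit]
        strand.append(previous)
    return "".join(strand)


def decode_no_repeats(strand: str, previous: str = "A") -> bytes:
    """Recover the base-3 digits from the letter transitions, then the bytes."""
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    strand = encode_no_repeats(b"DNA")
    print(strand)
    print(decode_no_repeats(strand))
```

The price is density: each letter now carries one of three choices rather than one of four, so a byte takes six nucleotides instead of four.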
Inevitably, some DNA will get spilt. "But it's so unlikely that anything that gets created for storage would have a biological interpretation that could interfere with the mechanisms going on in a living organism that it doesn't worry me in the slightest," Goldman says. "We're not of concern for the people who are worried about the ethical issues of synthetic DNA. They are much more concerned about people deliberately engineering anthrax. In the future, we'll know enough about someone from a sample of their DNA that we could make a specific poison. That's the danger, not those of us who want to encode DNA for storage."
In the end, the reality of and risks surrounding encoding on DNA are the same as any scientific advancement: It's another system that is vulnerable to people with bad intentions but not one that is inherently unethical.
"Every human action has some ethical implications," Zhirnov says. "I can use a hammer to build a house or I can use it to harm another person. I don't see why DNA is in any way more or less ethical."
If that house can store all the knowledge in human history, it's worth learning how to build it.
Editor's Note: In response to readers' comments that silicon is one of the earth's most abundant materials, we reached back out to our source, Dr. Victor Zhirnov. He stands by his statement about a coming shortage of silicon, citing this research. The silicon oxide found in beach sand is unsuitable for semiconductors, he says, because the cost of purifying it would be prohibitive. For use in circuit-making, silicon must be refined to a purity of 99.9999999 percent. So the process begins by mining for pure quartz, which can only be found in relatively few places around the world.
Catching colds may help protect kids from Covid
A common cold virus causes the immune system to produce T cells that also provide protection against SARS-CoV-2, according to new research. The study, published last month in PNAS, shows that this effect is most pronounced in young children. The finding may help explain why most young people who have been exposed to the cold-causing coronavirus have not developed serious cases of COVID-19.
One curiosity stood out in the early days of the COVID-19 pandemic: why were so few kids getting sick? Generally, young children and the elderly are the most vulnerable to disease outbreaks, particularly viral infections, because their immune systems are either not fully developed or starting to fail.
But solid information on the new infection was so scarce that many public health officials acted on the precautionary principle, assumed a worst-case scenario, and applied the broadest, most restrictive policies to all people to try to contain the coronavirus SARS-CoV-2.
One early thought was that lockdowns worked and kids (ages 6 months to 17 years) simply were not being exposed to the virus. So it was a shock when data started to come in showing that well over half of them carried antibodies to the virus, indicating exposure without getting sick. That trend grew over time and the latest tracking data from the CDC shows that 96.3 percent of kids in the U.S. now carry those antibodies.
But that couldn't be the whole story because antibody protection fades, sometimes as early as a month after exposure and usually within a year. Additionally, SARS-CoV-2 has been spewing out waves of different variants that were more resistant to antibodies generated by their predecessors. The resistance was so significant that over time the FDA withdrew its emergency use authorization for a handful of monoclonal antibodies with earlier approval to treat the infection because they no longer worked.
Antibodies got most of the attention early on because they are part of the first line response of the immune system. Antibodies can bind to viruses and neutralize them, preventing infection. They are relatively quick and easy to measure and even manufacture, but as SARS-CoV-2 showed us, often viruses can quickly evolve to become more resistant to them. Some scientists are exploring whether the reactions of T cells could serve as a more useful measure of immune protection.
Kids, colds and T cells
T cells are part of the immune system that deals with cells once they have become infected. But working with T cells is much more difficult, takes longer, and is more expensive than working with antibodies. So research often lags behind on this part of the immune system.
A group of researchers led by Annika Karlsson at the Karolinska Institute in Sweden focuses on T cells targeting virus-infected cells and, unsurprisingly, saw that they can play a role in SARS-CoV-2 infection. Other labs have shown that vaccination and natural exposure to the virus generates different patterns of T cell responses.
The Swedes also looked at another member of the coronavirus family, OC43, which circulates widely and is one of several causes of the common cold. The molecular structure of OC43 is similar to its more deadly cousin SARS-CoV-2. Sometimes a T cell response to one virus can produce a cross-reactive response to a similar protein structure in another virus, meaning that T cells will identify and respond to the two viruses in much the same way. Karlsson looked to see if T cells for OC43 from a wide age range of patients were cross-reactive to SARS-CoV-2.
And that is what they found, as reported in the PNAS study last month: there was cross-reactive activity, but it depended on a person’s age. A subset of a certain type of T cells, called mCD4+, that recognized various protein parts of the cold-causing virus OC43 expressed on the surface of an infected cell also recognized those same protein parts from SARS-CoV-2. The T cell response was lower than that generated by natural exposure to SARS-CoV-2, but it was functional and thus could help limit the severity of COVID-19.
“The cross-reactivity peaked at age six when more than half the people tested have a cross-reactive immune response,” says Karlsson, though their sample is too small to say if this finding applies more broadly across the population. The vast majority of children as young as two years had OC43-specific mCD4+ T cell responses. In adulthood, the functionality of both the OC43-specific and the cross-reactive T cells wanes significantly, especially with advanced age.
“Considering that the mortality rate in children is the lowest from ages five to nine, and higher in younger children, our results imply that cross-reactive mCD4+ T cells may have a role in the control of SARS-CoV-2 infection in children,” the authors wrote in their paper.
“One of the most politicized aspects of our pandemic response was not accepting that children are so much less at risk for severe disease with COVID-19,” because usually young children are among the most vulnerable to pathogens, says Monica Gandhi, professor of medicine at the University of California San Francisco and author of the book, Endemic: A Post-Pandemic Playbook, to be released by the Mayo Clinic Press this summer. The immune response of kids to SARS-CoV-2 stood our expectations on their head. “We just haven't seen this before, so knowing the mechanism of protection is really important.”
Why the T cell immune response can fade with age is largely unknown. With some viruses such as measles, a single vaccination or infection generates life-long protection. But respiratory tract infections, like SARS-CoV-2, cause a localized infection, specific to certain organs, and the immune response to them tends to be shorter lived than the response to systemic infections that affect the entire body. Karlsson suspects the elderly might be exposed to these localized types of viruses less often. Also, frequent continued exposure to a virus that results in reactivation of the memory T cell pool might eventually result in “a kind of immunosenescence or immune exhaustion that is associated with aging,” Karlsson says. This fading protection is why older people need to be repeatedly vaccinated against SARS-CoV-2.
Policy implications
Following the numbers on COVID-19 infections and severity over the last three years has shown us that healthy young people without risk factors are not likely to develop serious disease. This latest study points to a mechanism that helps explain why. But the inertia of existing policies remains. How should we adjust policy recommendations based on what we know today?
The World Health Organization (WHO) updated its COVID-19 vaccination guidance on March 28. It calls for a focus on vaccinating and boosting those at risk for developing serious disease. The guidance basically shrugged its shoulders when it came to healthy children and young adults receiving vaccinations and boosters against COVID-19. It said the priority should be to administer the “traditional essential vaccines for children,” such as those that protect against measles, rubella, and mumps.
“As an immunologist and a mother, I think that catching a cold or two when you are a kid and otherwise healthy is not that bad for you. Children have a much lower risk of becoming severely ill with SARS-CoV-2,” says Karlsson. She has followed public health guidance in Sweden, which means that her young children have not been vaccinated, but being older, she has received the vaccine and boosters. Gandhi and her children have been vaccinated, but they do not plan on additional boosters.
The WHO got it right in “concentrating on what matters,” which is getting traditional childhood immunizations back on track after their dramatic decline over the last three years, says Gandhi. Nor is there a need for masking in schools, according to a study from the Catalonia region of Spain. It found “no difference in masking and spread in schools,” particularly since tracking data indicate that nearly all young people have been exposed to SARS-CoV-2.
Both researchers lament that public discussion has overemphasized the quickly fading antibody part of the immune response to SARS-CoV-2 compared with the more durable T cell component. They say developing an efficient measure of T cell response for doctors to use in the clinic would help to monitor immunity in people at risk for severe cases of COVID-19 compared with the current method of toting up potential risk factors.
The Friday Five covers five stories in research that you may have missed this week. There are plenty of controversies and troubling ethical issues in science – and we get into many of them in our online magazine – but this news roundup focuses on new scientific theories and progress to give you a therapeutic dose of inspiration headed into the weekend.
Here are the stories covered this week:
- The eyes are the windows to the soul - and biological aging?
- What bean genes mean for health and the planet
- This breathing practice could lower levels of tau proteins
- AI beats humans at assessing heart health
- Should you get a nature prescription?