Genetically Sequencing Healthy Babies Yielded Surprising Results
Today in Melrose, Massachusetts, Cora Stetson is the picture of good health, a bubbly, precocious 2-year-old. But Cora has two separate mutations in the gene that produces a critical enzyme called biotinidase, and her body produces only 40 percent of the normal levels of that enzyme.
That's enough to pass conventional newborn (heel-stick) screening, but may not be enough for normal brain development, putting baby Cora at risk for seizures and cognitive impairment. But thanks to an experimental study in which Cora's DNA was sequenced after birth, this condition was discovered and she is being treated with a safe and inexpensive vitamin supplement.
Stories like these are beginning to emerge from the BabySeq Project, the first clinical trial in the world to systematically sequence healthy newborn infants. This trial was led by my research group with funding from the National Institutes of Health. While still controversial, it is pointing the way to a future in which adults, or even newborns, can receive comprehensive genetic analysis to determine their risk of future disease and create opportunities to prevent it.
Some believe that medicine is still not ready for genomic population screening, but others feel it is long overdue. After all, the Human Genome Project was completed in 2003, and with this milestone, it became feasible to sequence and interpret the genome of any human being. The costs have come down dramatically since then; an entire human genome can now be sequenced for about $800, although bioinformatic and medical interpretation can add another $200 to $2,000, depending upon the number of genes interrogated and the sophistication of the interpretive effort.
Two-year-old Cora Stetson, whose DNA sequencing after birth identified a potentially dangerous genetic mutation in time for her to receive preventive treatment.
(Photo courtesy of Robert Green)
The ability to sequence the human genome has yielded extraordinary benefits in scientific discovery, disease diagnosis, and targeted cancer treatment. But the ability of genomes to detect health risks in advance, to actually predict the medical future of an individual, has been mired in controversy and slow to manifest. In particular, the oft-cited vision that healthy infants could be genetically tested at birth in order to predict and prevent the diseases they would encounter has proven to be far tougher to implement than anyone anticipated.
But in the last few years, the dream of predicting and preventing diseases through genomics, starting in childhood, has finally come within reach. Why did it take so long? And what remains to be done?
Great Expectations
Part of the problem was the unrealistic expectations that had been building for years in advance of the genomic science itself. For example, the 1997 film Gattaca portrayed a near future in which the lifetime risk of disease was readily predicted the moment an infant was born. In the fanfare that accompanied the completion of the Human Genome Project, the notion of predicting and preventing future disease in an individual became a powerful meme that was used to inspire investment and public support for genomic research long before the tools were in place to make it happen.
Another part of the problem was the success of state-mandated newborn screening programs that began in the 1960s with biochemical "heel-stick" blood tests to identify babies with metabolic disorders. These programs have worked beautifully, costing only a few dollars per baby and saving thousands of infants from death and severe cognitive impairment. It seemed only logical that a new technology like genome sequencing would add power and promise to such programs. But instead of embracing the notion of newborn sequencing, newborn screening laboratories have thus far rejected the entire idea as too expensive, too ambiguous, and too threatening to the comfortable constituency that they had built within the public health framework.
Creating the Evidence Base for Preventive Genomics
Despite a number of obstacles, there are researchers who are exploring how to achieve the original vision of genomic testing as a tool for disease prediction and prevention. For example, in our NIH-funded MedSeq Project, we were the first to ask the question: "What can you find when you look as deeply as possible into the medical genomes of healthy individuals?"
Most people do not understand that genetic information comes in four separate categories: (1) dominant mutations putting the individual at risk for rare conditions like familial forms of heart disease or cancer, (2) recessive mutations putting the individual's children at risk for rare conditions like cystic fibrosis or PKU, (3) variants across the genome that can be tallied to construct polygenic risk scores for common conditions like heart disease or type 2 diabetes, and (4) variants that can influence drug metabolism or predict drug side effects such as the muscle pain that occasionally occurs with statin use.
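To make the third category concrete, here is a minimal sketch of how variants might be "tallied" into a polygenic risk score. It assumes the common textbook form, a weighted sum of risk-allele counts, and is not a description of the method used in our studies. The variant names, weights, and genotype below are hypothetical placeholders, not real genetic data.

```python
from typing import Dict


def polygenic_risk_score(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Return the weighted sum of risk-allele counts over the variants in `weights`.

    `genotype` maps a variant ID to the number of risk alleles carried (0, 1, or 2).
    `weights` maps a variant ID to its per-allele effect weight.
    """
    score = 0.0
    for variant_id, weight in weights.items():
        # Assumption: a variant missing from the genotype counts as 0 risk alleles.
        allele_count = genotype.get(variant_id, 0)
        if allele_count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant_id}: {allele_count}")
        score += weight * allele_count
    return score


# Hypothetical per-allele weights and one hypothetical person's genotype.
weights = {"variant_a": 0.25, "variant_b": 0.5, "variant_c": -0.125}
genotype = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(genotype, weights)
print(score)  # 0.25*2 + 0.5*1 + (-0.125)*0 = 1.0
```

A raw score like this means little on its own; it only becomes informative when compared with the distribution of scores in a reference population, which this sketch leaves out.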
The technological and analytical challenges of our study were formidable, because we decided to systematically interrogate over 5000 disease-associated genes and report results in all four categories of genetic information directly to the primary care physicians for each of our volunteers. We enrolled 200 adults and found that everyone who was sequenced had medically relevant polygenic and pharmacogenomic results, over 90 percent carried recessive mutations that could have been important to reproduction, and an extraordinary 14.5 percent carried dominant mutations for rare genetic conditions.
A few years later we launched the BabySeq Project. In this study, we restricted the number of genes to include only those with child/adolescent onset that could benefit medically from early warning, and even so, we found 9.4 percent carried dominant mutations for rare conditions.
At first, our interpretation of the high proportion of apparently healthy individuals with dominant mutations for rare genetic conditions was simple: that these conditions had lower "penetrance" than anticipated. In other words, only a small proportion of those who carried the dominant mutation would get the disease. If this interpretation were to hold, then genetic risk information might be far less useful than we had hoped.
But then we circled back with each adult or infant in order to examine and test them for any possible features of the rare disease in question. When we did this, we were surprised to see that in over a quarter of those carrying such mutations, there were already subtle signs of the disease in question that had not even been suspected! Now our interpretation was different. We now believe that genetic risk may be responsible for subclinical disease in a much higher proportion of people than has ever been suspected!
Meanwhile, colleagues of ours have been demonstrating that detailed analysis of polygenic risk scores can identify individuals at high risk for common conditions like heart disease. So adding up the medically relevant results in any given genome, we start to see that you can learn your risks for a rare monogenic condition, a common polygenic condition, a bad effect from a drug you might take in the future, or for having a child with a devastating recessive condition. Suddenly the information available in the genome of even an apparently healthy individual is looking more robust, and the prospect of preventive genomics is looking feasible.
Preventive Genomics Arrives in Clinical Medicine
There is still considerable evidence to gather before we can recommend genomic screening for the entire population. For example, it is important to make sure that families who learn about such risks do not suffer harms or waste resources from excessive medical attention. And many doctors don't yet have guidance on how to use such information with their patients. But our research is convincing many people that preventive genomics is coming and that it will save lives.
In fact, we recently launched a Preventive Genomics Clinic at Brigham and Women's Hospital where information-seeking adults can obtain predictive genomic testing with the highest quality interpretation and medical context, and be coached over time in light of their disease risks toward a healthier outcome. Insurance doesn't yet cover such testing, so patients must pay out of pocket for now, but they can choose from a menu of genetic screening tests, all of which are more comprehensive than consumer-facing products. Genetic counseling is available but optional. So far, this service is for adults only, but sequencing for children will surely follow soon.
As the costs of sequencing and other Omics technologies continue to decline, we will see both responsible and irresponsible marketing of genetic testing, and we will need to guard against unscientific claims. But at the same time, we must be far more imaginative and fast moving in mainstream medicine than we have been to date in order to claim the emerging benefits of preventive genomics where it is now clear that suffering can be averted, and lives can be saved. The future has arrived if we are bold enough to grasp it.
Funding and Disclosures:
Dr. Green's research is supported by the National Institutes of Health, the Department of Defense and through donations to The Franca Sozzani Fund for Preventive Genomics. Dr. Green receives compensation for advising the following companies: AIA, Applied Therapeutics, Helix, Ohana, OptraHealth, Prudential, Verily and Veritas; and is co-founder and advisor to Genome Medical, Inc, a technology and services company providing genetics expertise to patients, providers, employers and care systems.
[Editor's Note: This is the fifth episode in our Moonshot series, which explores cutting-edge scientific developments that stand to fundamentally transform our world.]
Kira Peikoff was the editor-in-chief of Leaps.org from 2017 to 2021. As a journalist, her work has appeared in The New York Times, Newsweek, Nautilus, Popular Mechanics, The New York Academy of Sciences, and other outlets. She is also the author of four suspense novels that explore controversial issues arising from scientific innovation: Living Proof, No Time to Die, Die Again Tomorrow, and Mother Knows Best. Peikoff holds a B.A. in Journalism from New York University and an M.S. in Bioethics from Columbia University. She lives in New Jersey with her husband and two young sons.
With the pandemic at the forefront of everyone's minds, many people have wondered if food could be a source of coronavirus transmission. Luckily, that "seems unlikely," according to the CDC, but foodborne illnesses do still sicken a whopping 48 million people per year.
In normal times, when there isn't a historic global health crisis infecting millions and affecting the lives of billions, foodborne outbreaks are real and frightening, potentially deadly, and can cause widespread fear of particular foods. Think of romaine lettuce spreading E. coli last year, in an outbreak that infected more than 500 people and killed eight, or peanut butter spreading salmonella in 2008, which infected 167 people.
The technologies available to detect and prevent the next foodborne disease outbreak have improved greatly over the past 30-plus years, particularly during the past decade, and better, more nimble technologies are being developed, according to experts in government, academia, and private industry. The key to advancing detection of harmful foodborne pathogens, they say, is increasing speed and portability of detection, and the precision of that detection.
Getting to Rapid Results
Researchers at Purdue University have recently developed a lateral flow assay that, with the help of a laser, can detect toxins and pathogenic E. coli. Lateral flow assays are cheap and easy to use; a good example is a home pregnancy test. You place a liquid or liquefied sample on a piece of paper designed to detect a single substance and soon after you get the results in the form of a colored line: yes or no.
"They're a great portable tool for us for food contaminant detection," says Carmen Gondhalekar, a fifth-year biomedical engineering graduate student at Purdue. "But one of the areas where paper-based lateral flow assays could use improvement is in multiplexing capability and their sensitivity."
J. Paul Robinson, a professor in Purdue's Colleges of Veterinary Medicine and Engineering, and Gondhalekar's advisor, agrees. "One of the fundamental problems that we have in detection is that it is hard to identify pathogens in complex samples," he says.
When it comes to foodborne disease outbreaks, you don't always know what substance you're looking for, so an assay made to detect only a single substance isn't always effective. The goal of the project at Purdue is to make assays that can detect multiple substances at once.
These assays would be more complex than a pregnancy test. As detailed in Gondhalekar's recent paper, a laser pulse helps create a spectral signal from the sample on the assay paper, and the spectral signal is then used to determine if any unique wavelengths associated with one of several toxins or pathogens are present in the sample. Though the handheld technology has yet to be built, the idea is that the results would be given on the spot. So someone in the field trying to track the source of a Salmonella infection could, for instance, put a suspected lettuce sample on the assay and see if it has the pathogen on it.
"What our technology is designed to do is to give you a rapid assessment of the sample," says Robinson. "The goal here is speed."
Seeing the Pathogen in "High-Def"
"One in six Americans will get a foodborne illness every year," according to Dr. Heather Carleton, a microbiologist at the Centers for Disease Control and Prevention's Enteric Diseases Laboratory Branch. But not every foodborne outbreak makes the news. In 2017 alone, the CDC monitored between 18 and 37 foodborne poison clusters per week and investigated 200 multi-state clusters. Hardboiled eggs, ground beef, chopped salad kits, raw oysters, frozen tuna, and pre-cut melon are just a taste of the foods that were investigated last year for different strains of listeria, salmonella, and E. coli.
At the heart of the CDC investigations is PulseNet, a national network of laboratories that uses DNA fingerprinting to detect outbreaks at local and regional levels. This is how it works: When a patient gets sick (with symptoms like vomiting and fever, for instance), they will go to a hospital or clinic for treatment. Since we're talking about foodborne illnesses, a clinician will likely take a stool sample from the patient and send it off to a laboratory to see if it contains a foodborne pathogen, such as salmonella or E. coli. If it does contain a potentially harmful pathogen, then a bacterial isolate from that sample is sent to a regional public health lab so that whole genome sequencing can be performed.
Whole genome sequencing is a method for reading the entire genome of a bacterial isolate (or of any organism, for that matter). Instead of working with a couple dozen data points, now you're working with millions of base pairs. Carleton likes to describe it as "going from an eight-bit image, maybe like what you would see in Minecraft, to a high definition image." She adds, "It's really an evolution of how we detect foodborne illnesses and identify outbreaks."
If the bacterial isolate matches another in the CDC's database, this means there could be a potential outbreak and an investigation may be started, with the goal of tracking the pathogen to its source.
Whole genome sequencing is a relatively recent shift in foodborne disease detection. For more than 20 years, the standard technique for analyzing pathogens in foodborne disease outbreaks was pulsed-field gel electrophoresis. This method creates a DNA fingerprint for each sample in the form of a pattern of about 15-30 "bands," with each band representing a piece of DNA. Researchers like Carleton can use this fingerprint to see if two samples are from the same strain of bacteria. The problem is that 15-30 bands are not enough to differentiate all isolates. Some isolates whose bands look very similar may actually come from different sources and some whose bands look different may be from the same source. But if you can see the entire DNA sequence, then you don't have that issue. That's where whole genome sequencing comes in.
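The difference in resolution can be illustrated with a toy comparison. The sketch below is only an illustration under stated assumptions: the band sizes and DNA strings are hypothetical, the sequences are assumed to be already aligned and of equal length, and real outbreak pipelines are considerably more involved than counting mismatched letters.

```python
from typing import Iterable


def band_patterns_match(bands_a: Iterable[int], bands_b: Iterable[int]) -> bool:
    """Compare two gel fingerprints, each given as a collection of band sizes."""
    return set(bands_a) == set(bands_b)


def sequence_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length DNA sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for base_a, base_b in zip(seq_a, seq_b) if base_a != base_b)


# Hypothetical isolates: the coarse band fingerprints are identical...
isolate_1_bands = [40, 95, 210, 350]
isolate_2_bands = [40, 95, 210, 350]

# ...but the (toy, 20-letter) sequences differ at two positions.
isolate_1_seq = "ACGTACGTACGTACGTACGT"
isolate_2_seq = "ACGTACGAACGTACCTACGT"

same_fingerprint = band_patterns_match(isolate_1_bands, isolate_2_bands)
differences = sequence_differences(isolate_1_seq, isolate_2_seq)

print(same_fingerprint)  # True: the band patterns cannot tell the isolates apart
print(differences)       # 2: base-by-base comparison reveals the differences
```

With only a handful of bands, the two isolates look like a match. Comparing base by base shows they are not identical, which is the kind of distinction that millions of base pairs make possible.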
Although the PulseNet team had piloted whole genome sequencing as early as 2013, it wasn't until July of last year that the transition to using whole genome sequencing for all pathogens was complete. Though whole genome sequencing requires far more computing power to generate, analyze, and compare those millions of data points, the payoff is huge.
Stopping Outbreaks Sooner
The U.S. Food and Drug Administration (FDA) acquired its first whole genome sequencers in 2008, according to Dr. Eric Brown, the Director of the Division of Microbiology in the FDA's Office of Regulatory Science. Since then, through its GenomeTrakr program, a network of more than 60 domestic and international labs, the FDA has sequenced and publicly shared more than 400,000 isolates. "The impact of what whole genome sequencing could do to resolve a foodborne outbreak event was no less impactful than when NASA turned on the Hubble Telescope for the first time," says Brown.
Whole genome sequencing has helped identify strains of Salmonella that prior methods were unable to differentiate. In fact, whole genome sequencing can differentiate "virtually all" strains of foodborne pathogens, no matter the species, according to the FDA. This means it takes fewer clinical cases—fewer sick people—to detect and end an outbreak.
And perhaps the largest benefit of whole genome sequencing is that these detailed sequences—the millions of base pairs—can imply geographic location. The genomic information of bacterial strains can be different depending on the area of the country, helping these public health agencies eventually track the source of outbreaks—a restaurant, a farm, a food-processing center.
Coming Soon: "Lab in a Backpack"
Now that whole genome sequencing has become the go-to technology for analyzing foodborne pathogens, the next step is making the process nimbler and more portable: putting "the lab in a backpack," as Brown says.
The CDC's Carleton agrees. "Right now, the sequencer we use is a fairly big box that weighs about 60 pounds," she says. "We can't take it into the field."
A company called Oxford Nanopore Technologies is developing handheld sequencers. Their devices are meant to "enable the sequencing of anything by anyone anywhere," according to Dan Turner, the VP of Applications at Oxford Nanopore.
"Right now, sequencing is very much something that is done by people in white coats in laboratories that are set up for that purpose," says Turner. Oxford Nanopore would like to create a new, democratized paradigm.
The FDA is currently testing these types of portable sequencers. "We're very excited about it. We've done some pilots, to be able to do that sequencing in the field. To actually do it at a pond, at a river, at a canal. To do it on site right there," says Brown. "This, of course, is huge because it means we can have real-time sequencing capability to stay in step with an actual laboratory investigation in the field."
"The timeliness of this information is critical," says Marc Allard, a senior biomedical research officer and Brown's colleague at the FDA. "The sooner that we can see linkages…the sooner the FDA gets in action to mitigate the problem and put in some kind of preventative control."
At the moment, the world is rightly focused on COVID-19. But as the danger of one virus subsides, it's only a matter of time before another pathogen strikes. Hopefully, with new and advancing technology like whole genome sequencing, we can stop the next deadly outbreak before it really gets going.