Scientists Are Building an “AccuWeather” for Germs to Predict Your Risk of Getting the Flu
Applied mathematician Sara del Valle works at the U.S.'s foremost nuclear weapons lab: Los Alamos. Once colloquially called Atomic City, it's a hidden place 45 minutes into the mountains northwest of Santa Fe. Here, engineers developed the first atomic bomb.
Today, Los Alamos is still a small science town, though it is no longer a secret, nor is it in the business of building new bombs. Instead, it's tasked with, among other things, keeping the stockpile of nuclear weapons safe and stable: not exploding when they're not supposed to (yes, please) and exploding if someone presses that red button (please, no).
Del Valle, though, doesn't work on any of that. Los Alamos is also interested in other kinds of booms—like the explosion of a contagious disease that could take down a city. Predicting (and, ideally, preventing) such epidemics is del Valle's passion. She hopes to develop an app that's like AccuWeather for germs: It would tell you your chance of getting the flu, or dengue or Zika, in your city on a given day. And like AccuWeather, it could help people alter their behavior to live better lives, whether that means staying home on a snowy morning or washing their hands on a sickness-heavy commute.
Sara del Valle of Los Alamos is working to predict and prevent epidemics using data and machine learning.
Since the beginning of del Valle's career, she's been driven by one thing: using data and predictions to help people behave practically around pathogens. As a kid, she'd always been good at math, but when she found out she could use it to capture the tentacular spread of disease, and not just manipulate abstractions, she was hooked.
When she made her way to Los Alamos, she started looking at what people were doing during outbreaks. Using social media like Twitter, Google search data, and Wikipedia, the team started to sift for trends. Were people talking about hygiene, like hand-washing? Or about being sick? Were they Googling information about mosquitoes? Searching Wikipedia for symptoms? And how did those things correlate with the spread of disease?
It was a new, faster way to think about how pathogens propagate in the real world. Usually, there's a 10- to 14-day lag in the U.S. between when doctors tap numbers into spreadsheets and when that information becomes public. By then, the world has moved on, and so has the disease—to other villages, other victims.
"We found there was a correlation between actual flu incidents in a community and the number of searches online and the number of tweets online," says del Valle. That was when she first let herself dream about a real-time forecast, not a 10-days-later backcast. Del Valle's group—computer scientists, mathematicians, statisticians, economists, public health professionals, epidemiologists, satellite analysis experts—has continued to work on the problem ever since their first Twitter parsing, in 2011.
They've had their share of outbreaks to track. Looking back at the 2009 swine flu pandemic, they saw people buying face masks and paying attention to the cleanliness of their hands. "People were talking about whether or not they needed to cancel their vacation," she says, and also whether pork products—which have nothing to do with swine flu—were safe to buy.
They watched internet conversations during the measles outbreak in California. "There's a lot of online discussion about anti-vax sentiment, and people trying to convince people to vaccinate children and vice versa," she says.
Today, they work on predicting the spread of Zika, Chikungunya, and dengue fever, as well as the plain old flu. And according to the CDC, that latter effort is going well.
Since 2015, the CDC has run the Epidemic Prediction Initiative, a competition in which teams like del Valle's submit weekly predictions of how raging the flu will be in particular locations, along with other ailments occasionally. Michael Johannson is co-founder and leader of the program, which began with the Dengue Forecasting Project. Its goal, he says, was to predict when dengue cases would blow up, when previously an area just had a low-level baseline of sick people. "You'll get this massive epidemic where all of a sudden, instead of 3,000 to 4,000 cases, you have 20,000 cases," he says. "They kind of come out of nowhere."
But the "kind of" is key: The outbreaks surely come out of somewhere and, if scientists applied research and data the right way, they could forecast the upswing and perhaps dodge a bomb before it hit big-time. Questions about how big, when, and where are also key to the flu.
A big part of these projects is the CDC giving the right researchers access to the right information, and the structure to both forecast useful public-health outcomes and to compare how well the models are doing. The extra information has been great for the Los Alamos effort. "We don't have to call departments and beg for data," says del Valle.
When data isn't available, "proxies"—things like symptom searches, tweets about empty offices, satellite images showing a green, wet, mosquito-friendly landscape—are helpful: You don't have to rely on anyone's health department.
At the latest meeting with all the prediction groups, del Valle's flu models took first and second place. But del Valle wants more than weekly numbers on a government website; she wants that weather-app-inspired fortune-teller, incorporating the many diseases you could get today, standing right where you are. "That's our dream," she says.
This plot shows the correlations between the online data stream, from Wikipedia, and various infectious diseases in different countries. The results of del Valle's predictive models are shown in brown, while the actual number of cases or illness rates are shown in blue.
(Courtesy del Valle)
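For readers curious what "correlation" means in practice here, the sketch below computes a Pearson correlation coefficient between two weekly series. It is only an illustration under stated assumptions: the function name `pearson` and both lists of numbers are invented for this example and are not del Valle's data or her team's method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts, invented for illustration only.
weekly_page_views = [120, 150, 210, 340, 500, 460, 300, 180]
weekly_reported_cases = [30, 38, 55, 90, 130, 125, 80, 45]

r = pearson(weekly_page_views, weekly_reported_cases)
print(f"correlation between page views and cases: {r:.3f}")
```

A value near 1 means the two series rise and fall together. It does not show that one causes the other, which is partly why forecasters check such online signals against official case counts.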
The goal isn't to turn you into a germophobic agoraphobe. It's to make you more aware when you do go out. "If you know it's going to rain today, you're more likely to bring an umbrella," del Valle says. "When you go on vacation, you always look at the weather and make sure you bring the appropriate clothing. If you do the same thing for diseases, you think, 'There's Zika spreading in Sao Paulo, so maybe I should bring even more mosquito repellent and bring more long sleeves and pants.'"
They're not there yet (don't hold your breath, but do stop touching your mouth). She estimates it's at least a decade away, but advances in machine learning could accelerate that hypothetical timeline. "We're doing baby steps," says del Valle, starting with the flu in the U.S., dengue in Brazil, and other efforts in Colombia, Ecuador, and Canada. "Going from there to forecasting all diseases around the globe is a long way," she says.
But even AccuWeather started small: One man began predicting weather for a utility company, then helping ski resorts optimize their snowmaking. His influence snowballed, and now private forecasting apps, including AccuWeather's, populate phones across the planet. The company's progression hasn't been without controversy—privacy incursions, inaccuracy of long-term forecasts, fights with the government—but it has continued, for better and for worse.
Disease apps, perhaps spun out of a small, unlikely team at a nuclear-weapons lab, could grow and breed in a similar way. And both the controversies and public-health benefits that may someday spin out of them lie in the future, impossible to predict with certainty.
This Boy Struggled to Walk Before Gene Therapy. Now, Such Treatments Are Poised to Explode.
Conner Curran was diagnosed with Duchenne's muscular dystrophy in 2015 when he was four years old. It's the most severe form of the genetic disease, with a nearly inevitable progression toward total paralysis. Many Duchenne's patients die in their teens; the average lifespan is 26.
But Conner, who is now 10, has experienced some astonishing improvements in recent years. He can now walk for more than two miles at a time – an impossible journey when he was younger.
In 2018, Conner became the very first patient to receive gene therapy specific to treating Duchenne's. In the initial clinical trial of nine children, nearly 80 percent reacted positively to the treatment. A larger-scale stage 3 clinical trial is currently underway, with initial results expected next year.
Gene therapy involves altering the genes in an individual's cells to stop or treat a disease. Such a procedure may be performed by adding new gene material to existing cells, or editing the defective genes to improve their functionality.
That the medical world is on the cusp of a successful treatment for a crippling and deadly disease is the culmination of more than 35 years of work by Dr. Jude Samulski, a professor of pharmacology at the University of North Carolina School of Medicine in Chapel Hill. More recently, he's become a leading gene therapy entrepreneur.
But Samulski likens this breakthrough to the frustrations of solving a Rubik's cube. "Just because one side is now all the color yellow does not mean that it is completely aligned," he says.
Although Conner's life and future have dramatically improved, he's not cured. The gene therapy tamed but did not extinguish his disorder: Conner is now suffering from the equivalent of Becker's muscular dystrophy, a milder form of the disease with symptoms that appear later in life and progress more slowly. Moreover, the loss of muscle cells Conner suffered prior to the treatment is permanent.
"It will take more time and more innovations," Samulski says of finding an even more effective gene therapy for muscular dystrophy.
Conner's family is still overjoyed with the results. "Jude's grit and determination gave Conner a chance at a new life, one that was not in his cards before gene therapy," says his mother Jessica Curran. She adds that "Conner is more confident than before and enjoys life, even though he has limitations, if compared to his brothers or peers."
Conner Curran holding a football post gene therapy treatment.
Courtesy of the Curran family
For now, the use of gene therapy as a treatment for diseases and disorders remains relatively isolated. On paper at least, progress appears glacially slow. In 2018, there were four FDA-approved gene therapies (excluding those reliant on bone marrow/stem cell transplants or implants). Today, there are 10. One therapy is solely for the cosmetic purpose of reducing facial lines and folds.
Nevertheless, experts in the space believe gene therapy is poised to expand dramatically.
"Certainly in the next three to five years you will see dozens of gene therapies and cell therapies be approved," says Dr. Pavan Cheruvu, who is CEO of Sio Gene Therapies in New York. The company is developing treatments for Parkinson's disease and Tay-Sachs, among other diseases.
Cheruvu's conclusion is supported by NEWDIGS, a think tank at the Massachusetts Institute of Technology that keeps tabs on gene therapy developments. NEWDIGS predicts there will be at least 60 gene therapies approved for use in the U.S. by the end of the decade. That number could be closer to 100 if Chinese researchers and biotech ventures decide the American market is a good fit for the therapies they develop.
Dr. Carsten Brunn, a chemist by training and CEO of Selecta Biosciences outside of Boston, is developing ways to reduce the immune responses in patients who receive gene therapy. He observes that there are more than 300 therapies in development and thousands of clinical trials underway. "It's definitely an exciting time in the field," he says.
That's a far cry from the environment of little more than a decade ago. Research and investment in gene therapy had been brought low for years after the death of teenager Jesse Gelsinger in 1999 while he had been enrolled in a clinical trial to treat a liver disease. Gene therapy was a completely novel concept back then, and his death created existential questions about whether it was a proper pathway to pursue. Cheruvu, a cardiologist, calls the years after Gelsinger's death an "ice age" for gene therapy.
However, those dark years eventually yielded to a thaw. And while there have been some recent stumbles, they are considered part of the trial-and-error that has often accompanied medical research as opposed to an ominous "stop" sign.
The deaths of three patients last year receiving gene therapy for myotubular myopathy – a degenerative disease that causes severe muscle weakness – promptly ended the clinical trial in which they were enrolled. However, the incident caused few ripples beyond that. Researchers linked the deaths to dosage sizes that caused liver toxicity, as opposed to the gene therapy itself being an automatic death sentence; younger patients who received lower doses due to a less advanced disease state experienced improvements.
The gene sequencing and editing that helped create vaccines for COVID-19 in record time also bolstered the argument for more investment in research and development. Cheruvu notes that gene therapy has usually been the domain of investors with significant expertise in the field; these days, more money is flowing in from generalists.
The Challenges Ahead
What will be the next step in gene therapy's evolution? Many of Samulski's earliest innovations, for example, came in the laboratory. That work led him to form a company called AskBio in collaboration with the Muscular Dystrophy Association. AskBio sold its gene therapy to Pfizer five years ago to ensure that enough could be manufactured for stage 3 clinical trials and that it could eventually reach the market.
Cheruvu suggests that many future gene therapy innovations will be the result of what he calls "congruent innovation." That means publicly funded laboratories and privately funded companies might develop treatments separately or in collaboration. Or, university scientists may depend on private ventures to solve one of gene therapy's most vexing issues: producing enough finished material to test and treat on a large scale. "Manufacturing is a real bottleneck right now," Brunn says.
The alternative is referred to in the sector as the "valley of death": a lab has found a promising treatment, but is not far enough along in development to submit an investigational new drug application with the FDA. The promise withers away as a result. But the new abundance of venture capital for gene therapy has made this scenario less of an issue for private firms, some of which have received hundreds of millions of dollars in funding.
There are also numerous clinical challenges. Many gene therapies use what are known as adeno-associated virus vectors (AAVs) to deliver treatments. They are hollowed-out husks of viruses that can cause a variety of mostly mild maladies ranging from colds to pink eye. They are modified to deliver the genetic material used in the therapy. Most of these vectors trigger an antibody reaction that limits treatments to a single dose or a handful of smaller ones. That can limit the potential progress for patients – an issue referred to as treatment "durability."
Although vectors from animals such as horses trigger far less of an antibody reaction in patients -- and there has been significant work done on using artificial vectors -- both are likely years away from being used on a large scale. "For the foreseeable future, AAV is the delivery system of choice," Brunn says.
Also, there will likely be demand for concurrent gene therapies that can lead to a complete cure – not only halting the progress of Duchenne's in kids like Conner Curran, but regenerating their lost muscle cells, perhaps through some form of stem cell therapy or another treatment that has yet to be devised.
Nevertheless, Samulski believes demand for imperfect treatments will be high – particularly with a disease such as muscular dystrophy, where many patients are mere months from spending the remainder of their lives in wheelchairs. But Samulski believes those therapies will also inevitably evolve into something far more effective.
"We are watching something of a conditional evolution, like a dot-com, or cellphones that were sizes of shoeboxes that have now matured to the size of wafers," he says. "Our space will follow along very similarly."
Jessica Curran will remain forever grateful for what her son has received: "Jude gave us new hope. He gave us something that is priceless – a chance to watch Conner grow up and live out his own dreams."
COVID Variants Are Like “a Thief Changing Clothes” – and Our Camera System Barely Exists
Whether it's "natural selection" as Darwin called it, or it's "mutating" as the X-Men called it, living organisms change over time, developing thumbs or more efficient protein spikes, depending on the organism and the demands of its environment. The coronavirus that causes COVID-19, SARS-CoV-2, is not an exception, and now, after the virus has infected millions of people around the globe for more than a year, scientists are beginning to see those changes.
The notorious variants that have popped up include B.1.1.7, sometimes called the UK variant, as well as P.1 and B.1.351, which seem to have emerged in Brazil and South Africa respectively. As vaccinations are picking up pace, officials are warning that now is not the time to become complacent or relax restrictions because the variants aren't well understood.
Some appear to be more transmissible, and deadlier, while others can evade the immune system's defenses better than earlier versions of the virus, potentially undermining the effectiveness of vaccines to some degree. Genomic surveillance, the process of sequencing the genetic code of the virus widely to observe changes and patterns, is a critical way that scientists can keep track of its evolution and work to understand how the variants might affect humans.
"It's like a thief changing clothes"
It's important to note that viruses mutate all the time. If there were funding and personnel to sequence the genome of every sample of the virus, scientists would see thousands of mutations. Not every variant deserves our attention. The vast majority of mutations are not important at all, but recognizing those that are is a crucial tool in getting and staying ahead of the virus. The work of sequencing, analyzing, observing patterns, and using public health tools as necessary is complicated and confusing to those without years of specialized training.
Jeremy Kamil, associate professor of microbiology and immunology at LSU Health Shreveport, in Louisiana, says that the variants developing are like a thief changing clothes. The thief goes in your house, steals your stuff, then leaves and puts on a different shirt and a wig, in the hopes you won't recognize them. Genomic surveillance catches the "thief" even in those different clothes.
Understanding variants, both the uninteresting ones and the potentially concerning ones, gives public health officials and researchers at different levels a useful set of tools. Locally, knowing which variants are circulating in the community helps leaders know whether mask mandates and similar measures should be implemented or discontinued, or whether businesses and schools can open relatively safely.
There's more to it than observing new variants
Analysis is complex, particularly when it comes to understanding which variants are of concern. "So the question is always if a mutation becomes common, is that a random occurrence?" says Phoebe Lostroh, associate professor of molecular biology at Colorado College. "Or is the variant the result of some kind of selection because the mutation changes some property about the virus that makes it reproduce more quickly than variants of the virus that don't have that mutation? For a virus, [mutations can affect outcomes like] how much it replicates inside a person's body, how much somebody breathes it out, whether the particles that somebody might breathe in get smaller and can lead to greater transmission."
Along with all of those factors, accurate and useful genomic surveillance requires an understanding of where variants are occurring, how they are related, and an examination of why they might be prevalent.
For example, if a potentially worrisome variant appears in a community and begins to spread very quickly, it's not time to raise a public health alarm until several important questions have been answered, such as whether the variant is spreading due to specific events, or if it's happening because the mutation has allowed the virus to infect people more efficiently. Kamil offered a hypothetical scenario to explain: Imagine that a member of a community became infected and the virus mutated. That person went to church and three more people were infected, but one of them went to a karaoke bar and while singing infected 100 other people. Examining the conditions under which the virus has spread is, therefore, an essential part of untangling whether a mutation itself made the virus more transmissible or if an infected person's behaviors contributed to a local outbreak.
One of the tricky things about variants is recognizing the point at which they move from interesting, to concerning at a local level, to dangerous in a larger context. Genomic sequencing can help with that, but only when it's coordinated. When the same mutation occurs frequently, but is localized to one region, it's a concern, but when the same mutation happens in different places at the same time, it's much more likely that the "virus is learning that's a good mutation," explains Kamil.
The process is called convergent evolution, and it was a fascinating topic long before COVID. Just as your heritage can be traced through DNA, so can that of viruses, and when separate lineages develop similar traits it's almost like scientists can see evolution happening in real time. A mutation to SARS-CoV-2 that happens in more than one place at once is a mutation that makes it easier in some way for the virus to survive and that is when it may become alarming. The widespread, documented variants P.1 and B.1.351 are examples of convergence because they share some of the same virulent mutations despite having developed thousands of miles apart.
However, even variants that are emerging in different places at the same time don't present the kind of threat SARS-CoV-2 did in 2019. "This is nature," says Kamil. "It just means that this virus will not easily be driven to extinction or complete elimination by vaccines." Although a person who has already had COVID-19 can be reinfected with a variant, "it is almost always much milder disease" than the original infection, Kamil adds. Rather than causing full-fledged disease, variants have the potential to "penetrate herd immunity, spreading relatively quietly among people who have developed natural immunity or been vaccinated, until the virus finds someone who has no immunity yet, and that person would be at risk of hospitalization-grade severe disease or death."
Surveillance and predictions
According to Lostroh, genomic surveillance can help scientists predict what's going to happen. "With the British strain, for instance, that's more transmissible, you can measure how fast it's doubling in the population and you can sort of tell whether we should take more measures against this mutation. Should we shut things down a little longer because that mutation is present in the population? That could be really useful if you did enough sampling in the population that you knew where it was," says Lostroh. If, for example, the more transmissible strain was present in 50 percent of cases, but in another county or state it was barely present, it would allow for rolling lockdowns instead of sweeping measures.
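Measuring "how fast it's doubling" can be made concrete with a little arithmetic. The sketch below assumes steady exponential growth between two counts of a variant taken some days apart. The function name `doubling_time_days` and the sample numbers are hypothetical, and real surveillance estimates would also need to account for how samples were chosen for sequencing.

```python
from math import log


def doubling_time_days(count_start, count_end, days_elapsed):
    """Estimate doubling time in days, assuming steady exponential growth."""
    if count_start <= 0 or days_elapsed <= 0:
        raise ValueError("starting count and elapsed days must be positive")
    if count_end <= count_start:
        raise ValueError("counts are not growing, so there is no doubling time")
    # Exponential growth: count_end = count_start * exp(rate * days_elapsed)
    growth_rate = log(count_end / count_start) / days_elapsed
    return log(2) / growth_rate


# Invented example: 100 sequenced cases of a variant grow to 400 in 14 days.
estimate = doubling_time_days(100, 400, 14)
print(f"estimated doubling time: {estimate:.1f} days")
```

In this made-up case the count quadruples, which is two doublings in 14 days, so the estimate comes out to about seven days. A shorter doubling time in one county than in its neighbors is the kind of signal that could support targeted measures rather than sweeping ones.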
Variants are also extremely important when it comes to the development, manufacture, and distribution of vaccines. "You're also looking at medical countermeasures, such as whether your vaccine is still effective, or if your antiviral needs to be updated," says Lane Warmbrod, a senior analyst and research associate at Johns Hopkins Center for Health Security.
Properly funded and extensive genomic surveillance could eventually help control endemic diseases, too, like the seasonal flu, or other common respiratory infections. Kamil says he envisions a future in which genomic surveillance allows for prediction of sickness just as the weather is predicted today. "It's a 51 for infection today at the San Francisco Airport. There's been detection of some respiratory viruses," he says, offering an example. He says that if you're a vulnerable person, if you're immune-suppressed for some reason, you may want to wear a mask based on the sickness report.
The U.S. has the ability, but lacks standards
The benefits of widespread genomic surveillance are clear, and the United States certainly has the necessary technology, equipment, and personnel to carry it out. But it's not happening at the speed or scale needed for the country to reap those benefits.
"The numbers are improving," said Kamil. "We're probably still at less than half a percent of all the samples that have been taken have been sequenced since the beginning of the pandemic."
Although there's no consensus on how many sequences are ideal for a robust surveillance program, modeling performed by the company Illumina suggests about 5 percent of positive tests should be sequenced. The reasons the U.S. has lagged in implementing a sequencing program are complex and varied, but solvable.
Perhaps the most important element that is currently missing is leadership. In order to conduct an effective genomic surveillance program, there need to be standards. The Johns Hopkins Center for Health Security recently published a paper with recommendations as to what kinds of elements need to be standardized in order to make the best use of sequencing technology and analysis.
"Along with which bioinformatic pipelines you're going to use to do the analyses, which sequencing strategy protocol are you going to use, what's your sampling strategy going to be, how is the data is going to be reported, what data gets reported," says Warmbrod. Currently, there's no guidance from the CDC on any of those things. So, while scientists can collect and report information, they may be collecting and reporting different information that isn't comparable, making it less useful for public health measures and vaccine updates.
Globally, one of the most important tools in making the information from genomic surveillance useful is GISAID, a platform designed for scientists to share -- and, importantly, to be credited for -- their data regarding genetic sequences of influenza. Originally, it was launched as a database of bird flu sequences, but has evolved to become an essential tool used by the WHO to make flu vaccine virus recommendations each year. Scientists who share their credentials have free access to the database, and anyone who uses information from the database must credit the scientist who uploaded that information.
Safety, logistics, and funding matter
Scientists at university labs and other small organizations have been uploading sequences to GISAID almost from the beginning of the pandemic, but their funding is generally limited, and there are no standards regarding information collection or reporting. Private, for-profit labs haven't had motivation to set up sequencing programs, although many of them have the logistical capabilities and funding to do so. Public health departments are understaffed, underfunded, and overwhelmed.
University labs may also be limited by safety concerns. The SARS-CoV-2 virus is dangerous, and there's a question of how samples should be transported to labs for sequencing.
Larger, for-profit organizations often have the tools and distribution capabilities to safely collect and sequence samples, but there hasn't been a profit motive. Genomic sequencing is less expensive now than ever before, but even at $100 per sample, the cost adds up -- not to mention the cost of employing a scientist with the proper credentials to analyze the sequence.
The path forward
The recently passed COVID-19 relief bill does have some funding to address genomic sequencing. Specifically, the American Rescue Plan Act includes $1.75 billion in funding for the Centers for Disease Control and Prevention's Advanced Molecular Detection (AMD) program. In an interview last month, CDC Director Rochelle Walensky said that the additional funding will be "a dial. And we're going to need to dial it up." AMD has already announced a collaboration called the Sequencing for Public Health Emergency Response, Epidemiology, and Surveillance (SPHERES) Initiative that will bring together scientists from public health, academic, clinical, and non-profit laboratories across the country with the goal of accelerating sequencing.
Such a collaboration is a step toward following the recommendations in the paper Warmbrod coauthored. Building capacity now, creating a network of labs, and standardizing procedures will mean improved health in the future. "I want to be optimistic," she says. "The good news is there are a lot of passionate, smart, capable people who are continuing to work with government and work with different stakeholders." She cautions, however, that without a national strategy we won't succeed.
"If we maximize the potential and create that framework now, we can also use it for endemic diseases," she says. "It's a very helpful system for more than COVID if we're smart in how we plan it."