Scientists Are Building an “AccuWeather” for Germs to Predict Your Risk of Getting the Flu
Applied mathematician Sara del Valle works at the U.S.'s foremost nuclear weapons lab: Los Alamos. Once colloquially called Atomic City, it's a hidden place 45 minutes into the mountains northwest of Santa Fe. Here, engineers developed the first atomic bomb.
Like AccuWeather, an app for disease prediction could help people alter their behavior to live better lives.
Today, Los Alamos still a small science town, though no longer a secret, nor in the business of building new bombs. Instead, it's tasked with, among other things, keeping the stockpile of nuclear weapons safe and stable: not exploding when they're not supposed to (yes, please) and exploding if someone presses that red button (please, no).
Del Valle, though, doesn't work on any of that. Los Alamos is also interested in other kinds of booms—like the explosion of a contagious disease that could take down a city. Predicting (and, ideally, preventing) such epidemics is del Valle's passion. She hopes to develop an app that's like AccuWeather for germs: It would tell you your chance of getting the flu, or dengue or Zika, in your city on a given day. And like AccuWeather, it could help people alter their behavior to live better lives, whether that means staying home on a snowy morning or washing their hands on a sickness-heavy commute.
Sara del Valle of Los Alamos is working to predict and prevent epidemics using data and machine learning.
Since the beginning of del Valle's career, she's been driven by one thing: using data and predictions to help people behave practically around pathogens. As a kid, she'd always been good at math, but when she found out she could use it to capture the tentacular spread of disease, and not just manipulate abstractions, she was hooked.
When she made her way to Los Alamos, she started looking at what people were doing during outbreaks. Using social media like Twitter, Google search data, and Wikipedia, the team started to sift for trends. Were people talking about hygiene, like hand-washing? Or about being sick? Were they Googling information about mosquitoes? Searching Wikipedia for symptoms? And how did those things correlate with the spread of disease?
It was a new, faster way to think about how pathogens propagate in the real world. Usually, there's a 10- to 14-day lag in the U.S. between when doctors tap numbers into spreadsheets and when that information becomes public. By then, the world has moved on, and so has the disease—to other villages, other victims.
"We found there was a correlation between actual flu incidents in a community and the number of searches online and the number of tweets online," says del Valle. That was when she first let herself dream about a real-time forecast, not a 10-days-later backcast. Del Valle's group—computer scientists, mathematicians, statisticians, economists, public health professionals, epidemiologists, satellite analysis experts—has continued to work on the problem ever since their first Twitter parsing, in 2011.
They've had their share of outbreaks to track. Looking back at the 2009 swine flu pandemic, they saw people buying face masks and paying attention to the cleanliness of their hands. "People were talking about whether or not they needed to cancel their vacation," she says, and also whether pork products—which have nothing to do with swine flu—were safe to buy.
At the latest meeting with all the prediction groups, del Valle's flu models took first and second place.
They watched internet conversations during the measles outbreak in California. "There's a lot of online discussion about anti-vax sentiment, and people trying to convince people to vaccinate children and vice versa," she says.
Today, they work on predicting the spread of Zika, Chikungunya, and dengue fever, as well as the plain old flu. And according to the CDC, that latter effort is going well.
Since 2015, the CDC has run the Epidemic Prediction Initiative, a competition in which teams like de Valle's submit weekly predictions of how raging the flu will be in particular locations, along with other ailments occasionally. Michael Johannson is co-founder and leader of the program, which began with the Dengue Forecasting Project. Its goal, he says, was to predict when dengue cases would blow up, when previously an area just had a low-level baseline of sick people. "You'll get this massive epidemic where all of a sudden, instead of 3,000 to 4,000 cases, you have 20,000 cases," he says. "They kind of come out of nowhere."
But the "kind of" is key: The outbreaks surely come out of somewhere and, if scientists applied research and data the right way, they could forecast the upswing and perhaps dodge a bomb before it hit big-time. Questions about how big, when, and where are also key to the flu.
A big part of these projects is the CDC giving the right researchers access to the right information, and the structure to both forecast useful public-health outcomes and to compare how well the models are doing. The extra information has been great for the Los Alamos effort. "We don't have to call departments and beg for data," says del Valle.
When data isn't available, "proxies"—things like symptom searches, tweets about empty offices, satellite images showing a green, wet, mosquito-friendly landscape—are helpful: You don't have to rely on anyone's health department.
At the latest meeting with all the prediction groups, del Valle's flu models took first and second place. But del Valle wants more than weekly numbers on a government website; she wants that weather-app-inspired fortune-teller, incorporating the many diseases you could get today, standing right where you are. "That's our dream," she says.
This plot shows the the correlations between the online data stream, from Wikipedia, and various infectious diseases in different countries. The results of del Valle's predictive models are shown in brown, while the actual number of cases or illness rates are shown in blue.
(Courtesy del Valle)
The goal isn't to turn you into a germophobic agoraphobe. It's to make you more aware when you do go out. "If you know it's going to rain today, you're more likely to bring an umbrella," del Valle says. "When you go on vacation, you always look at the weather and make sure you bring the appropriate clothing. If you do the same thing for diseases, you think, 'There's Zika spreading in Sao Paulo, so maybe I should bring even more mosquito repellent and bring more long sleeves and pants.'"
They're not there yet (don't hold your breath, but do stop touching your mouth). She estimates it's at least a decade away, but advances in machine learning could accelerate that hypothetical timeline. "We're doing baby steps," says del Valle, starting with the flu in the U.S., dengue in Brazil, and other efforts in Colombia, Ecuador, and Canada. "Going from there to forecasting all diseases around the globe is a long way," she says.
But even AccuWeather started small: One man began predicting weather for a utility company, then helping ski resorts optimize their snowmaking. His influence snowballed, and now private forecasting apps, including AccuWeather's, populate phones across the planet. The company's progression hasn't been without controversy—privacy incursions, inaccuracy of long-term forecasts, fights with the government—but it has continued, for better and for worse.
Disease apps, perhaps spun out of a small, unlikely team at a nuclear-weapons lab, could grow and breed in a similar way. And both the controversies and public-health benefits that may someday spin out of them lie in the future, impossible to predict with certainty.
COVID Variants Are Like “a Thief Changing Clothes” – and Our Camera System Barely Exists
Whether it's "natural selection" as Darwin called it, or it's "mutating" as the X-Men called it, living organisms change over time, developing thumbs or more efficient protein spikes, depending on the organism and the demands of its environment. The coronavirus that causes COVID-19, SARS-CoV-2, is not an exception, and now, after the virus has infected millions of people around the globe for more than a year, scientists are beginning to see those changes.
The notorious variants that have popped up include B.1.1.7, sometimes called the UK variant, as well as P.1 and B.1.351, which seem to have emerged in Brazil and South Africa respectively. As vaccinations are picking up pace, officials are warning that now
is not the time to become complacent or relax restrictions because the variants aren't well understood.
Some appear to be more transmissible, and deadlier, while others can evade the immune system's defenses better than earlier versions of the virus, potentially undermining the effectiveness of vaccines to some degree. Genomic surveillance, the process of sequencing the genetic code of the virus widely to observe changes and patterns, is a critical way that scientists can keep track of its evolution and work to understand how the variants might affect humans.
"It's like a thief changing clothes"
It's important to note that viruses mutate all the time. If there were funding and personnel to sequence the genome of every sample of the virus, scientists would see thousands of mutations. Not every variant deserves our attention. The vast majority of mutations are not important at all, but recognizing those that are is a crucial tool in getting and staying ahead of the virus. The work of sequencing, analyzing, observing patterns, and using public health tools as necessary is complicated and confusing to those without years of specialized training.
Jeremy Kamil, associate professor of microbiology and immunology at LSU Health Shreveport, in Louisiana, says that the variants developing are like a thief changing clothes. The thief goes in your house, steals your stuff, then leaves and puts on a different shirt and a wig, in the hopes you won't recognize them. Genomic surveillance catches the "thief" even in those different clothes.
One of the tricky things about variants is recognizing the point at which they move from interesting, to concerning at a local level, to dangerous in a larger context.
Understanding variants, both the uninteresting ones and the potentially concerning ones, gives public health officials and researchers at different levels a useful set of tools. Locally, knowing which variants are circulating in the community helps leaders know whether mask mandates and similar measures should be implemented or discontinued, or whether businesses and schools can open relatively safely.
There's more to it than observing new variants
Analysis is complex, particularly when it comes to understanding which variants are of concern. "So the question is always if a mutation becomes common, is that a random occurrence?" says Phoebe Lostroh, associate professor of molecular biology at Colorado College. "Or is the variant the result of some kind of selection because the mutation changes some property about the virus that makes it reproduce more quickly than variants of the virus that don't have that mutation? For a virus, [mutations can affect outcomes like] how much it replicates inside a person's body, how much somebody breathes it out, whether the particles that somebody might breathe in get smaller and can lead to greater transmission."
Along with all of those factors, accurate and useful genomic surveillance requires an understanding of where variants are occurring, how they are related, and an examination of why they might be prevalent.
For example, if a potentially worrisome variant appears in a community and begins to spread very quickly, it's not time to raise a public health alarm until several important questions have been answered, such as whether the variant is spreading due to specific events, or if it's happening because the mutation has allowed the virus to infect people more efficiently. Kamil offered a hypothetical scenario to explain: Imagine that a member of a community became infected and the virus mutated. That person went to church and three more people were infected, but one of them went to a karaoke bar and while singing infected 100 other people. Examining the conditions under which the virus has spread is, therefore, an essential part of untangling whether a mutation itself made the virus more transmissible or if an infected person's behaviors contributed to a local outbreak.
One of the tricky things about variants is recognizing the point at which they move from interesting, to concerning at a local level, to dangerous in a larger context. Genomic sequencing can help with that, but only when it's coordinated. When the same mutation occurs frequently, but is localized to one region, it's a concern, but when the same mutation happens in different places at the same time, it's much more likely that the "virus is learning that's a good mutation," explains Kamil.
The process is called convergent evolution, and it was a fascinating topic long before COVID. Just as your heritage can be traced through DNA, so can that of viruses, and when separate lineages develop similar traits it's almost like scientists can see evolution happening in real time. A mutation to SARS-CoV-2 that happens in more than one place at once is a mutation that makes it easier in some way for the virus to survive and that is when it may become alarming. The widespread, documented variants P.1 and B.1.351 are examples of convergence because they share some of the same virulent mutations despite having developed thousands of miles apart.
However, even variants that are emerging in different places at the same time don't present the kind of threat SARS-CoV-2 did in 2019. "This is nature," says Kamil. "It just means that this virus will not easily be driven to extinction or complete elimination by vaccines." Although a person who has already had COVID-19 can be reinfected with a variant, "it is almost always much milder disease" than the original infection, Kamil adds. Rather than causing full-fledged disease, variants have the potiental to "penetrate herd immunity, spreading relatively quietly among people who have developed natural immunity or been vaccinated, until the virus finds someone who has no immunity yet, and that person would be at risk of hospitalization-grade severe disease or death."
Surveillance and predictions
According to Lostroh, genomic surveillance can help scientists predict what's going to happen. "With the British strain, for instance, that's more transmissible, you can measure how fast it's doubling in the population and you can sort of tell whether we should take more measures against this mutation. Should we shut things down a little longer because that mutation is present in the population? That could be really useful if you did enough sampling in the population that you knew where it was," says Lostroh. If, for example, the more transmissible strain was present in 50 percent of cases, but in another county or state it was barely present, it would allow for rolling lockdowns instead of sweeping measures.
Variants are also extremely important when it comes to the development, manufacture, and distribution of vaccines. "You're also looking at medical countermeasures, such as whether your vaccine is still effective, or if your antiviral needs to be updated," says Lane Warmbrod, a senior analyst and research associate at Johns Hopkins Center for Health Security.
Properly funded and extensive genomic surveillance could eventually help control endemic diseases, too, like the seasonal flu, or other common respiratory infections. Kamil says he envisions a future in which genomic surveillance allows for prediction of sickness just as the weather is predicted today. "It's a 51 for infection today at the San Francisco Airport. There's been detection of some respiratory viruses," he says, offering an example. He says that if you're a vulnerable person, if you're immune-suppressed for some reason, you may want to wear a mask based on the sickness report.
The U.S. has the ability, but lacks standards
The benefits of widespread genomic surveillance are clear, and the United States certainly has the necessary technology, equipment, and personnel to carry it out. But, it's not happening at the speed and extent it needs to for the country to gain the benefits.
"The numbers are improving," said Kamil. "We're probably still at less than half a percent of all the samples that have been taken have been sequenced since the beginning of the pandemic."
Although there's no consensus on how many sequences is ideal for a robust surveillance program, modeling performed by the company Illumina suggests about 5 percent of positive tests should be sequenced. The reasons the U.S. has lagged in implementing a sequencing program are complex and varied, but solvable.
Perhaps the most important element that is currently missing is leadership. In order to conduct an effective genomic surveillance program, there need to be standards. The Johns Hopkins Center for Health Security recently published a paper with recommendations as to what kinds of elements need to be standardized in order to make the best use of sequencing technology and analysis.
"Along with which bioinformatic pipelines you're going to use to do the analyses, which sequencing strategy protocol are you going to use, what's your sampling strategy going to be, how is the data is going to be reported, what data gets reported," says Warmbrod. Currently, there's no guidance from the CDC on any of those things. So, while scientists can collect and report information, they may be collecting and reporting different information that isn't comparable, making it less useful for public health measures and vaccine updates.
Globally, one of the most important tools in making the information from genomic surveillance useful is GISAID, a platform designed for scientists to share -- and, importantly, to be credited for -- their data regarding genetic sequences of influenza. Originally, it was launched as a database of bird flu sequences, but has evolved to become an essential tool used by the WHO to make flu vaccine virus recommendations each year. Scientists who share their credentials have free access to the database, and anyone who uses information from the database must credit the scientist who uploaded that information.
Safety, logistics, and funding matter
Scientists at university labs and other small organizations have been uploading sequences to GISAID almost from the beginning of the pandemic, but their funding is generally limited, and there are no standards regarding information collection or reporting. Private, for-profit labs haven't had motivation to set up sequencing programs, although many of them have the logistical capabilities and funding to do so. Public health departments are understaffed, underfunded, and overwhelmed.
University labs may also be limited by safety concerns. The SARS-CoV-2 virus is dangerous, and there's a question of how samples should be transported to labs for sequencing.
Larger, for-profit organizations often have the tools and distribution capabilities to safely collect and sequence samples, but there hasn't been a profit motive. Genomic sequencing is less expensive now than ever before, but even at $100 per sample, the cost adds up -- not to mention the cost of employing a scientist with the proper credentials to analyze the sequence.
The path forward
The recently passed COVID-19 relief bill does have some funding to address genomic sequencing. Specifically, the American Rescue Plan Act includes $1.75 billion in funding for the Centers for Disease Control and Prevention's Advanced Molecular Detection (AMD) program. In an interview last month, CDC Director Rochelle Walensky said that the additional funding will be "a dial. And we're going to need to dial it up." AMD has already announced a collaboration called the Sequencing for Public Health Emergency Response, Epidemiology, and Surveillance (SPHERES) Initiative that will bring together scientists from public health, academic, clinical, and non-profit laboratories across the country with the goal of accelerating sequencing.
Such a collaboration is a step toward following the recommendations in the paper Warmbrod coauthored. Building capacity now, creating a network of labs, and standardizing procedures will mean improved health in the future. "I want to be optimistic," she says. "The good news is there are a lot of passionate, smart, capable people who are continuing to work with government and work with different stakeholders." She cautions, however, that without a national strategy we won't succeed.
"If we maximize the potential and create that framework now, we can also use it for endemic diseases," she says. "It's a very helpful system for more than COVID if we're smart in how we plan it."
Since the beginning of life on Earth, plants have been naturally converting sunlight into energy. This photosynthesis process that's effortless for them has been anything but for scientists who have been trying to achieve artificial photosynthesis for the last half a century with the goal of creating a carbon-neutral fuel. Such a fuel could be a gamechanger — rather than putting CO2 back into the atmosphere like traditional fuels do, it would take CO2 out of the atmosphere and convert it into usable energy.
If given the option between a carbon-neutral fuel at the gas station and a fuel that produces carbon dioxide in spades -- and if costs and effectiveness were equal --who wouldn't choose the one best for the planet? That's the endgame scientists are after. A consumer switch to clean fuel could have a huge impact on our global CO2 emissions.
Up until this point, the methods used to make liquid fuel from atmospheric CO2 have been expensive, not efficient enough to really get off the ground, and often resulted in unwanted byproducts. But now, a new technology may be the key to unlocking the full potential of artificial photosynthesis. At the very least, it's a step forward and could help make a dent in atmospheric CO2 reduction.
"It's an important breakthrough in artificial photosynthesis," says Qian Wang, a researcher in the Department of Chemistry at Cambridge University and lead author on a recent study published in Nature about an innovation she calls "photosheets."
The latest version of the artificial leaf directly produces liquid fuel, which is easier to transport and use commercially.
These photosheets convert CO2, sunlight, and water into a carbon-neutral liquid fuel called formic acid without the aid of electricity. They're made of semiconductor powders that absorb sunlight. When in the presence of water and CO2, the electrons in the powders become excited and join with the CO2 and protons from the water molecules, reducing the CO2 in the process. The chemical reaction results in the production of formic acid, which can be used directly or converted to hydrogen, another clean energy fuel.
In the past, it's been difficult to reduce CO2 without creating a lot of unwanted byproducts. According to Wang, this new conversion process achieves the reduction and fuel creation with almost no byproducts.
The Cambridge team's new technology is a first and certainly momentous, but they're far from the only team to have produced fuel from CO2 using some form of artificial photosynthesis. More and more scientists are aiming to perfect the method in hopes of producing a truly sustainable, photosynthetic fuel capable of lowering carbon emissions.
Thanks to advancements in nanoscience, which has led to better control of materials, more successes are emerging. A team at the University of Illinois at Urbana-Champaign, for example, used gold nanoparticles as the photocatalysts in their process.
"My group demonstrated that you could actually use gold nanoparticles both as a light absorber and a catalyst in the process of converting carbon dioxide to hydrocarbons such as methane, ethane and propane fuels," says professor Prashant Jain, co-author of the study. Not only are gold nanoparticles great at absorbing light, they don't degrade as quickly as other metals, which makes them more sustainable.
That said, Jain's team, like every other research team working on artificial photosynthesis including the Cambridge team, is grappling with efficiency issues. Jain says that all parts of the process need to be optimized so the reaction can happen as quickly as possible.
"You can't just improve one [aspect], because that can lead to a decrease in performance in some other aspects," Jain explains.
The Cambridge team is currently experimenting with a range of catalysts to improve their device's stability and efficiency. Virgil Andrei, who is working on an artificial leaf design that was developed at Cambridge in 2019, was recently able to improve the performance and selectivity of the device. Now the leaf's solar-to-CO2 energy conversion efficiency is 0.2%, twice its previous efficiency.
The latest version also directly produces liquid fuel, which is easier to transport and use commercially.
In determining a method of fuel production's efficiency, one must consider how sustainable it is at every stage. That involves calculating whenever excess energy is needed to complete a step. According to Jain, in order to use CO2 for fuel production, you have to condense the CO2, which takes energy. And on the fuel production side, once the chemical reaction has created your byproducts, they need to be separated, which also takes energy.
To be truly sustainable, each part of the conversion system also needs to be durable. If parts need to be replaced often, or regularly maintained, that counts against it. Then you have to account for the system's reuse cycle. If you extract CO2 from the environment and convert it into fuel that's then put into a fuel cell, it's going to release CO2 at the other end. In order to create a fully green, carbon-neutral fuel source, that same amount of CO2 needs to be trapped and reintroduced back into the fuel conversion system.
"The cycle continues, and at each point, you will see a loss in efficiency, and depending on how much you [may also] see a loss in yield," says Jain. "And depending on what those efficiencies are at each one of those points will determine whether or not this process can be sustainable."
The science is at least a decade away from offering a competitive sustainable fuel option at scale. Streamlining a process to mimic what plants have perfected over billions of years is no small feat, but an ever-growing community of researchers using rapidly advancing technology is driving progress forward.