Bad Actors Getting Your Health Data Is the FBI’s Latest Worry
In February 2015, the health insurer Anthem revealed that criminal hackers had gained access to the company's servers, exposing the personal information of nearly 79 million patients. It's the largest known healthcare breach in history.
FBI agents worry that the vast amounts of healthcare data being generated for precision medicine efforts could leave the U.S. vulnerable to cyber and biological attacks.
That year, the data of millions more would be compromised in one cyberattack after another on American insurers and other healthcare organizations. In fact, for the past several years, the number of reported data breaches has increased each year, from 199 in 2010 to 344 in 2017, according to a September 2018 analysis in the Journal of the American Medical Association.
The FBI's Edward You sees this as a worrying trend. He says hackers aren't just interested in your social security or credit card number. They're increasingly interested in stealing your medical information. Hackers can currently use this information to make fake identities, file fraudulent insurance claims, and order and sell expensive drugs and medical equipment. But beyond that, a new kind of cybersecurity threat is around the corner.
Mr. You and others worry that the vast amounts of healthcare data being generated for precision medicine efforts could leave the U.S. vulnerable to cyber and biological attacks. In the wrong hands, this data could be used to exploit or extort an individual, discriminate against certain groups of people, make targeted bioweapons, or give another country an economic advantage.
Precision medicine, of course, is the idea that medical treatments can be tailored to individuals based on their genetics, environment, lifestyle or other traits. But to do that requires collecting and analyzing huge quantities of health data from diverse populations. One research effort, called All of Us, launched by the U.S. National Institutes of Health last year, aims to collect genomic and other healthcare data from one million participants with the goal of advancing personalized medical care.
Other initiatives are underway by academic institutions and healthcare organizations. Electronic medical records, genetic tests, wearable health trackers, mobile apps, and social media are all sources of valuable healthcare data that a bad actor could potentially use to learn more about an individual or group of people.
"When you aggregate all of that data together, that becomes a very powerful profile of who you are," Mr. You says.
A supervisory special agent in the biological countermeasures unit within the FBI's weapons of mass destruction directorate, it's Mr. You's job to imagine worst-case bioterror scenarios and figure out how to prevent and prepare for them.
That used to mean focusing on threats like anthrax, Ebola, and smallpox—pathogens that could be used to intentionally infect people—"basically the dangerous bugs," as he puts it. In recent years, advances in gene editing and synthetic biology have given rise to fears that rogue, or even well-intentioned, scientists could create a virulent virus that's intentionally, or unintentionally, released outside the lab.
"If a foreign source, especially a criminal one, has your biological information, then they might have some particular insights into what your future medical needs might be and exploit that."
While Mr. You is still tracking those threats, he's been traveling around the country talking to scientists, lawyers, software engineers, cyber security professionals, government officials and CEOs about new security threats—those posed by genetic and other biological data.
Emerging threats
Mr. You says one possible situation he can imagine is the potential for nefarious actors to use an individual's sensitive medical information to extort or blackmail that person.
"If a foreign source, especially a criminal one, has your biological information, then they might have some particular insights into what your future medical needs might be and exploit that," he says. For instance, "what happens if you have a singular medical condition and an outside entity says they have a treatment for your condition?" You could get talked into paying a huge sum of money for a treatment that ends up being bogus.
Or what if hackers got a hold of a politician or high-profile CEO's health records? Say that person had a disease-causing genetic mutation that could affect their ability to carry out their job in the future and hackers threatened to expose that information. These scenarios may seem far-fetched, but Mr. You thinks they're becoming increasingly plausible.
On a wider scale, Kavita Berger, a scientist at Gryphon Scientific, a Washington, D.C.-area life sciences consulting firm, worries that data from different populations could be used to discriminate against certain groups of people, like minorities and immigrants.
For instance, the advocacy group Human Rights Watch in 2017 flagged a concerning trend in China's Xinjiang territory, a region with a history of government repression. Police there had purchased 12 DNA sequencers and were collecting and cataloging DNA samples from people to build a national database.
"The concern is that this particular province has a huge population of the Muslim minority in China," Ms. Berger says. "Now they have a really huge database of genetic sequences. You have to ask, why does a police station need 12 next-generation sequencers?"
Also alarming is the potential that large amounts of data from different groups of people could lead to customized bioweapons if that data ends up in the wrong hands.
Eleonore Pauwels, a research fellow on emerging cybertechnologies at United Nations University's Centre for Policy Research, says new insights gained from genomic and other data will give scientists a better understanding of how diseases occur and why certain people are more susceptible to certain diseases.
"As you get more and more knowledge about the genomic picture and how the microbiome and the immune system of different populations function, you could get a much deeper understanding about how you could target different populations for treatment but also how you could eventually target them with different forms of bioagents," Ms. Pauwels says.
Economic competitiveness
Another reason hackers might want to gain access to large genomic and other healthcare datasets is to give their country a leg up economically. Many large cyber-attacks on U.S. healthcare organizations have been tied to Chinese hacking groups.
"This is a biological space race and we just haven't woken up to the fact that we're in this race."
"It's becoming clear that China is increasingly interested in getting access to massive data sets that come from different countries," Ms. Pauwels says.
A year after U.S. President Barack Obama conceived of the Precision Medicine Initiative in 2015—later renamed All of Us—China followed suit, announcing the launch of a 15-year, $9 billion precision health effort aimed at turning China into a global leader in genomics.
Chinese genomics companies, too, are expanding their reach outside of Asia. One company, WuXi NextCODE, which has offices in Shanghai, Reykjavik, and Cambridge, Massachusetts, has built an extensive library of genomes from the U.S., China and Iceland, and is now setting its sights on Ireland.
Another Chinese company, BGI, has partnered with Children's Hospital of Philadelphia and Sinai Health System in Toronto, and also formed a collaboration with the Smithsonian Institute to sequence all species on the planet. BGI has built its own advanced genomic sequencing machines to compete with U.S.-based Illumina.
Mr. You says having access to all this data could lead to major breakthroughs in healthcare, such as new blockbuster drugs. "Whoever has the largest, most diverse dataset is truly going to win the day and come up with something very profitable," he says.
Some direct-to-consumer genetic testing companies with offices in the U.S., like Dante Labs, also use BGI to process customers' DNA.
Experts worry that China could race ahead the U.S. in precision medicine because of Chinese laws governing data sharing. Currently, China prohibits the exportation of genetic data without explicit permission from the government. Mr. You says this creates an asymmetry in data sharing between the U.S. and China.
"This is a biological space race and we just haven't woken up to the fact that we're in this race," he said in January at an American Society for Microbiology conference in Washington, D.C. "We don't have access to their data. There is absolutely no reciprocity."
Protecting your data
While Mr. You has been stressing the importance of data security to anyone who will listen, the National Academies of Sciences, Engineering, and Medicine, which makes scientific and policy recommendations on issues of national importance, has commissioned a study on "safeguarding the bioeconomy."
In the meantime, Ms. Berger says organizations that deal with people's health data should assess their security risks and identify potential vulnerabilities in their systems.
As for what individuals can do to protect themselves, she urges people to think about the different ways they're sharing healthcare data—such as via mobile health apps and wearables.
"Ask yourself, what's the benefit of sharing this? What are the potential consequences of sharing this?" she says.
Mr. You also cautions people to think twice before taking consumer DNA tests. They may seem harmless, he says, but at the end of the day, most people don't know where their genetic information is going. "If your genetic sequence is taken, once it's gone, it's gone. There's nothing you can do about it."
Podcast: The Friday Five weekly roundup in health research
The Friday Five covers five stories in health research that you may have missed this week. There are plenty of controversies and troubling ethical issues in science – and we get into many of them in our online magazine – but this news roundup focuses on scientific creativity and progress to give you a therapeutic dose of inspiration headed into the weekend.
Covered in this week's Friday Five:
- Sex differences in cancer
- Promising research on a vaccine for Lyme disease
- Using a super material for brain-like devices
- Measuring your immunity to Covid
- Reducing dementia risk with leisure activities
One day in recent past, scientists at Columbia University’s Creative Machines Lab set up a robotic arm inside a circle of five streaming video cameras and let the robot watch itself move, turn and twist. For about three hours the robot did exactly that—it looked at itself this way and that, like toddlers exploring themselves in a room full of mirrors. By the time the robot stopped, its internal neural network finished learning the relationship between the robot’s motor actions and the volume it occupied in its environment. In other words, the robot built a spatial self-awareness, just like humans do. “We trained its deep neural network to understand how it moved in space,” says Boyuan Chen, one of the scientists who worked on it.
For decades robots have been doing helpful tasks that are too hard, too dangerous, or physically impossible for humans to carry out themselves. Robots are ultimately superior to humans in complex calculations, following rules to a tee and repeating the same steps perfectly. But even the biggest successes for human-robot collaborations—those in manufacturing and automotive industries—still require separating the two for safety reasons. Hardwired for a limited set of tasks, industrial robots don't have the intelligence to know where their robo-parts are in space, how fast they’re moving and when they can endanger a human.
Over the past decade or so, humans have begun to expect more from robots. Engineers have been building smarter versions that can avoid obstacles, follow voice commands, respond to human speech and make simple decisions. Some of them proved invaluable in many natural and man-made disasters like earthquakes, forest fires, nuclear accidents and chemical spills. These disaster recovery robots helped clean up dangerous chemicals, looked for survivors in crumbled buildings, and ventured into radioactive areas to assess damage.
Now roboticists are going a step further, training their creations to do even better: understand their own image in space and interact with humans like humans do. Today, there are already robot-teachers like KeeKo, robot-pets like Moffin, robot-babysitters like iPal, and robotic companions for the elderly like Pepper.
But even these reasonably intelligent creations still have huge limitations, some scientists think. “There are niche applications for the current generations of robots,” says professor Anthony Zador at Cold Spring Harbor Laboratory—but they are not “generalists” who can do varied tasks all on their own, as they mostly lack the abilities to improvise, make decisions based on a multitude of facts or emotions, and adjust to rapidly changing circumstances. “We don’t have general purpose robots that can interact with the world. We’re ages away from that.”
Robotic spatial self-awareness – the achievement by the team at Columbia – is an important step toward creating more intelligent machines. Hod Lipson, professor of mechanical engineering who runs the Columbia lab, says that future robots will need this ability to assist humans better. Knowing how you look and where in space your parts are, decreases the need for human oversight. It also helps the robot to detect and compensate for damage and keep up with its own wear-and-tear. And it allows robots to realize when something is wrong with them or their parts. “We want our robots to learn and continue to grow their minds and bodies on their own,” Chen says. That’s what Zador wants too—and on a much grander level. “I want a robot who can drive my car, take my dog for a walk and have a conversation with me.”
Columbia scientists have trained a robot to become aware of its own "body," so it can map the right path to touch a ball without running into an obstacle, in this case a square.
Jane Nisselson and Yinuo Qin/ Columbia Engineering
Today’s technological advances are making some of these leaps of progress possible. One of them is the so-called Deep Learning—a method that trains artificial intelligence systems to learn and use information similar to how humans do it. Described as a machine learning method based on neural network architectures with multiple layers of processing units, Deep Learning has been used to successfully teach machines to recognize images, understand speech and even write text.
Trained by Google, one of these language machine learning geniuses, BERT, can finish sentences. Another one called GPT3, designed by San Francisco-based company OpenAI, can write little stories. Yet, both of them still make funny mistakes in their linguistic exercises that even a child wouldn’t. According to a paper published by Stanford’s Center for Research on Foundational Models, BERT seems to not understand the word “not.” When asked to fill in the word after “A robin is a __” it correctly answers “bird.” But try inserting the word “not” into that sentence (“A robin is not a __”) and BERT still completes it the same way. Similarly, in one of its stories, GPT3 wrote that if you mix a spoonful of grape juice into your cranberry juice and drink the concoction, you die. It seems that robots, and artificial intelligence systems in general, are still missing some rudimentary facts of life that humans and animals grasp naturally and effortlessly.
How does one give robots a genome? Zador has an idea. We can’t really equip machines with real biological nucleotide-based genes, but we can mimic the neuronal blueprint those genes create.
It's not exactly the robots’ fault. Compared to humans, and all other organisms that have been around for thousands or millions of years, robots are very new. They are missing out on eons of evolutionary data-building. Animals and humans are born with the ability to do certain things because they are pre-wired in them. Flies know how to fly, fish knows how to swim, cats know how to meow, and babies know how to cry. Yet, flies don’t really learn to fly, fish doesn’t learn to swim, cats don’t learn to meow, and babies don’t learn to cry—they are born able to execute such behaviors because they’re preprogrammed to do so. All that happens thanks to the millions of years of evolutions wired into their respective genomes, which give rise to the brain’s neural networks responsible for these behaviors. Robots are the newbies, missing out on that trove of information, Zador argues.
A neuroscience professor who studies how brain circuitry generates various behaviors, Zador has a different approach to developing the robotic mind. Until their creators figure out a way to imbue the bots with that information, robots will remain quite limited in their abilities. Each model will only be able to do certain things it was programmed to do, but it will never go above and beyond its original code. So Zador argues that we have to start giving robots a genome.
How does one do that? Zador has an idea. We can’t really equip machines with real biological nucleotide-based genes, but we can mimic the neuronal blueprint those genes create. Genomes lay out rules for brain development. Specifically, the genome encodes blueprints for wiring up our nervous system—the details of which neurons are connected, the strength of those connections and other specs that will later hold the information learned throughout life. “Our genomes serve as blueprints for building our nervous system and these blueprints give rise to a human brain, which contains about 100 billion neurons,” Zador says.
If you think what a genome is, he explains, it is essentially a very compact and compressed form of information storage. Conceptually, genomes are similar to CliffsNotes and other study guides. When students read these short summaries, they know about what happened in a book, without actually reading that book. And that’s how we should be designing the next generation of robots if we ever want them to act like humans, Zador says. “We should give them a set of behavioral CliffsNotes, which they can then unwrap into brain-like structures.” Robots that have such brain-like structures will acquire a set of basic rules to generate basic behaviors and use them to learn more complex ones.
Currently Zador is in the process of developing algorithms that function like simple rules that generate such behaviors. “My algorithms would write these CliffsNotes, outlining how to solve a particular problem,” he explains. “And then, the neural networks will use these CliffsNotes to figure out which ones are useful and use them in their behaviors.” That’s how all living beings operate. They use the pre-programmed info from their genetics to adapt to their changing environments and learn what’s necessary to survive and thrive in these settings.
For example, a robot’s neural network could draw from CliffsNotes with “genetic” instructions for how to be aware of its own body or learn to adjust its movements. And other, different sets of CliffsNotes may imbue it with the basics of physical safety or the fundamentals of speech.
At the moment, Zador is working on algorithms that are trying to mimic neuronal blueprints for very simple organisms—such as earthworms, which have only 302 neurons and about 7000 synapses compared to the millions we have. That’s how evolution worked, too—expanding the brains from simple creatures to more complex to the Homo Sapiens. But if it took millions of years to arrive at modern humans, how long would it take scientists to forge a robot with human intelligence? That’s a billion-dollar question. Yet, Zador is optimistic. “My hypotheses is that if you can build simple organisms that can interact with the world, then the higher level functions will not be nearly as challenging as they currently are.”
Lina Zeldovich has written about science, medicine and technology for Popular Science, Smithsonian, National Geographic, Scientific American, Reader’s Digest, the New York Times and other major national and international publications. A Columbia J-School alumna, she has won several awards for her stories, including the ASJA Crisis Coverage Award for Covid reporting, and has been a contributing editor at Nautilus Magazine. In 2021, Zeldovich released her first book, The Other Dark Matter, published by the University of Chicago Press, about the science and business of turning waste into wealth and health. You can find her on http://linazeldovich.com/ and @linazeldovich.