Bad Actors Getting Your Health Data Is the FBI’s Latest Worry
In February 2015, the health insurer Anthem revealed that criminal hackers had gained access to the company's servers, exposing the personal information of nearly 79 million patients. It's the largest known healthcare breach in history.
FBI agents worry that the vast amounts of healthcare data being generated for precision medicine efforts could leave the U.S. vulnerable to cyber and biological attacks.
That year, the data of millions more would be compromised in one cyberattack after another on American insurers and other healthcare organizations. In fact, for the past several years, the number of reported data breaches has increased each year, from 199 in 2010 to 344 in 2017, according to a September 2018 analysis in the Journal of the American Medical Association.
The FBI's Edward You sees this as a worrying trend. He says hackers aren't just interested in your social security or credit card number. They're increasingly interested in stealing your medical information. Hackers can currently use this information to make fake identities, file fraudulent insurance claims, and order and sell expensive drugs and medical equipment. But beyond that, a new kind of cybersecurity threat is around the corner.
Mr. You and others worry that the vast amounts of healthcare data being generated for precision medicine efforts could leave the U.S. vulnerable to cyber and biological attacks. In the wrong hands, this data could be used to exploit or extort an individual, discriminate against certain groups of people, make targeted bioweapons, or give another country an economic advantage.
Precision medicine, of course, is the idea that medical treatments can be tailored to individuals based on their genetics, environment, lifestyle or other traits. But to do that requires collecting and analyzing huge quantities of health data from diverse populations. One research effort, called All of Us, launched by the U.S. National Institutes of Health last year, aims to collect genomic and other healthcare data from one million participants with the goal of advancing personalized medical care.
Other initiatives are underway by academic institutions and healthcare organizations. Electronic medical records, genetic tests, wearable health trackers, mobile apps, and social media are all sources of valuable healthcare data that a bad actor could potentially use to learn more about an individual or group of people.
"When you aggregate all of that data together, that becomes a very powerful profile of who you are," Mr. You says.
A supervisory special agent in the biological countermeasures unit within the FBI's weapons of mass destruction directorate, it's Mr. You's job to imagine worst-case bioterror scenarios and figure out how to prevent and prepare for them.
That used to mean focusing on threats like anthrax, Ebola, and smallpox—pathogens that could be used to intentionally infect people—"basically the dangerous bugs," as he puts it. In recent years, advances in gene editing and synthetic biology have given rise to fears that rogue, or even well-intentioned, scientists could create a virulent virus that's intentionally, or unintentionally, released outside the lab.
"If a foreign source, especially a criminal one, has your biological information, then they might have some particular insights into what your future medical needs might be and exploit that."
While Mr. You is still tracking those threats, he's been traveling around the country talking to scientists, lawyers, software engineers, cyber security professionals, government officials and CEOs about new security threats—those posed by genetic and other biological data.
Emerging threats
Mr. You says one possible situation he can imagine is the potential for nefarious actors to use an individual's sensitive medical information to extort or blackmail that person.
"If a foreign source, especially a criminal one, has your biological information, then they might have some particular insights into what your future medical needs might be and exploit that," he says. For instance, "what happens if you have a singular medical condition and an outside entity says they have a treatment for your condition?" You could get talked into paying a huge sum of money for a treatment that ends up being bogus.
Or what if hackers got a hold of a politician or high-profile CEO's health records? Say that person had a disease-causing genetic mutation that could affect their ability to carry out their job in the future and hackers threatened to expose that information. These scenarios may seem far-fetched, but Mr. You thinks they're becoming increasingly plausible.
On a wider scale, Kavita Berger, a scientist at Gryphon Scientific, a Washington, D.C.-area life sciences consulting firm, worries that data from different populations could be used to discriminate against certain groups of people, like minorities and immigrants.
For instance, the advocacy group Human Rights Watch in 2017 flagged a concerning trend in China's Xinjiang territory, a region with a history of government repression. Police there had purchased 12 DNA sequencers and were collecting and cataloging DNA samples from people to build a national database.
"The concern is that this particular province has a huge population of the Muslim minority in China," Ms. Berger says. "Now they have a really huge database of genetic sequences. You have to ask, why does a police station need 12 next-generation sequencers?"
Also alarming is the potential that large amounts of data from different groups of people could lead to customized bioweapons if that data ends up in the wrong hands.
Eleonore Pauwels, a research fellow on emerging cybertechnologies at United Nations University's Centre for Policy Research, says new insights gained from genomic and other data will give scientists a better understanding of how diseases occur and why certain people are more susceptible to certain diseases.
"As you get more and more knowledge about the genomic picture and how the microbiome and the immune system of different populations function, you could get a much deeper understanding about how you could target different populations for treatment but also how you could eventually target them with different forms of bioagents," Ms. Pauwels says.
Economic competitiveness
Another reason hackers might want to gain access to large genomic and other healthcare datasets is to give their country a leg up economically. Many large cyber-attacks on U.S. healthcare organizations have been tied to Chinese hacking groups.
"This is a biological space race and we just haven't woken up to the fact that we're in this race."
"It's becoming clear that China is increasingly interested in getting access to massive data sets that come from different countries," Ms. Pauwels says.
A year after U.S. President Barack Obama conceived of the Precision Medicine Initiative in 2015—later renamed All of Us—China followed suit, announcing the launch of a 15-year, $9 billion precision health effort aimed at turning China into a global leader in genomics.
Chinese genomics companies, too, are expanding their reach outside of Asia. One company, WuXi NextCODE, which has offices in Shanghai, Reykjavik, and Cambridge, Massachusetts, has built an extensive library of genomes from the U.S., China and Iceland, and is now setting its sights on Ireland.
Another Chinese company, BGI, has partnered with Children's Hospital of Philadelphia and Sinai Health System in Toronto, and also formed a collaboration with the Smithsonian Institute to sequence all species on the planet. BGI has built its own advanced genomic sequencing machines to compete with U.S.-based Illumina.
Mr. You says having access to all this data could lead to major breakthroughs in healthcare, such as new blockbuster drugs. "Whoever has the largest, most diverse dataset is truly going to win the day and come up with something very profitable," he says.
Some direct-to-consumer genetic testing companies with offices in the U.S., like Dante Labs, also use BGI to process customers' DNA.
Experts worry that China could race ahead the U.S. in precision medicine because of Chinese laws governing data sharing. Currently, China prohibits the exportation of genetic data without explicit permission from the government. Mr. You says this creates an asymmetry in data sharing between the U.S. and China.
"This is a biological space race and we just haven't woken up to the fact that we're in this race," he said in January at an American Society for Microbiology conference in Washington, D.C. "We don't have access to their data. There is absolutely no reciprocity."
Protecting your data
While Mr. You has been stressing the importance of data security to anyone who will listen, the National Academies of Sciences, Engineering, and Medicine, which makes scientific and policy recommendations on issues of national importance, has commissioned a study on "safeguarding the bioeconomy."
In the meantime, Ms. Berger says organizations that deal with people's health data should assess their security risks and identify potential vulnerabilities in their systems.
As for what individuals can do to protect themselves, she urges people to think about the different ways they're sharing healthcare data—such as via mobile health apps and wearables.
"Ask yourself, what's the benefit of sharing this? What are the potential consequences of sharing this?" she says.
Mr. You also cautions people to think twice before taking consumer DNA tests. They may seem harmless, he says, but at the end of the day, most people don't know where their genetic information is going. "If your genetic sequence is taken, once it's gone, it's gone. There's nothing you can do about it."
Genomics has begun its golden age. Just 20 years ago, sequencing a single genome cost nearly $3 billion and took over a decade. Today, the same feat can be achieved for a few hundred dollars and the better part of a day . Suddenly, the prospect of sequencing not just individuals, but whole populations, has become feasible.
The genetic differences between humans may seem meager, only around 0.1 percent of the genome on average, but this variation can have profound effects on an individual's risk of disease, responsiveness to medication, and even the dosage level that would work best.
Already, initiatives like the U.K.'s 100,000 Genomes Project - now expanding to 1 million genomes - and other similarly massive sequencing projects in Iceland and the U.S., have begun collecting population-scale data in order to capture and study this variation.
The resulting data sets are immensely valuable to researchers and drug developers working to design new 'precision' medicines and diagnostics, and to gain insights that may benefit patients. Yet, because the majority of this data comes from developed countries with well-established scientific and medical infrastructure, the data collected so far is heavily biased towards Western populations with largely European ancestry.
This presents a startling and fast-emerging problem: groups that are under-represented in these datasets are likely to benefit less from the new wave of therapeutics, diagnostics, and insights, simply because they were tailored for the genetic profiles of people with European ancestry.
We may indeed be approaching a golden age of genomics-enabled precision medicine. But if the data bias persists then there is a risk, as with most golden ages throughout history, that the benefits will not be equally accessible to all, and existing inequalities will only be exacerbated.
To remedy the situation, a number of initiatives have sprung up to sequence genomes of under-represented groups, adding them to the datasets and ensuring that they too will benefit from the rapidly unfolding genomic revolution.
Global Gene Corp
The idea behind Global Gene Corp was born eight years ago in Harvard when Sumit Jamuar, co-founder and CEO, met up with his two other co-founders, both experienced geneticists, for a coffee.
"They were discussing the limitless applications of understanding your genetic code," said Jamuar, a business executive from New Delhi.
"And so, being a technology enthusiast type, I was excited and I turned to them and said hey, this is incredible! Could you sequence me and give me some insights? And they actually just turned around and said no, because it's not going to be useful for you - there's not enough reference for what a good Sumit looks like."
What started as a curiosity-driven conversation on the power of genomics ended with a commitment to tackle one of the field's biggest roadblocks - its lack of global representation.
Jamuar set out to begin with India, which has about 20 percent of the world's population, including over 4000 different ethnicities, but contributes less than 2 percent of genomic data, he told Leaps.org.
Eight years later, Global Gene Corp's sequencing initiative is well underway, and is the largest in the history of the Indian subcontinent. The program is being carried out in collaboration with biotech giant Regeneron, with support from the Indian government, local communities, and the Indian healthcare ecosystem. In August 2020, Global Gene Corp's work was recognized through the $1 million 2020 Roddenberry award for organizations that advance the vision of 'Star Trek' creator Gene Roddenberry to better humanity.
This problem has already begun to manifest itself in, for example, much higher levels of genetic misdiagnosis among non-Europeans tested for their risk of certain diseases, such as hypertrophic cardiomyopathy - an inherited disease of the heart muscle.
Global Gene Corp also focuses on developing and implementing AI and machine learning tools to make sense of the deluge of genomic data. These tools are increasingly used by both industry and academia to guide future research by identifying particularly promising or clinically interesting genetic variants. But if the underlying data is skewed European, then the effectiveness of the computational analysis - along with the future advances and avenues of research that emerge from it - will be skewed towards Europeans too.
This problem has already begun to manifest itself in, for example, much higher levels of genetic misdiagnosis among non-Europeans tested for their risk of certain diseases, such as hypertrophic cardiomyopathy - an inherited disease of the heart muscle. Most of the genetic variants used in these tests were identified as being causal for the disease from studies of European genomes. However, many of these variants differ both in their distribution and clinical significance across populations, leading to many patients of non-European ancestry receiving false-positive test results - as their benign genetic variants were misclassified as pathogenic. Had even a small number of genomes from other ethnicities been included in the initial studies, these misdiagnoses could have been avoided.
"Unless we have a data set which is unbiased and representative, we're never going to achieve the success that we want," Jamuar says.
"When Siri was first launched, she could hardly recognize an accent which was not of a certain type, so if I was trying to speak to Siri, I would have to repeat myself multiple times and try to mimic an accent which wasn't my accent so that she could understand it.
"But over time the voice recognition technology improved tremendously because the training data was expanded to include people of very diverse backgrounds and their accents, so the algorithms were trained to be able to pick that up and it dramatically improved the technology. That's the way we have to think about it - without that good-quality diverse data, we will never be able to achieve the full potential of the computational tools."
While mapping India's rich genetic diversity has been the organization's primary focus so far, they plan, in time, to expand their work to other under-represented groups in Asia, the Middle East, Africa, and Latin America.
"As other like-minded people and partners join the mission, it just accelerates the achievement of what we have set out to do, which is to map out and organize the world's genomic diversity so that we can enable high-quality life and longevity benefits for everyone, everywhere," Jamuar says.
Empowering African Genomics
Africa is the birthplace of our species, and today still retains an inordinate amount of total human genetic diversity. Groups that left Africa and went on to populate the rest of the world, some 50 to 100,000 years ago, were likely small in number and only took a fraction of the total genetic diversity with them. This ancient bottleneck means that no other group in the world can match the level of genetic diversity seen in modern African populations.
Despite Africa's central importance in understanding the history and extent of human genetic diversity, the genomics of African populations remains wildly understudied. Addressing this disparity has become a central focus of the H3Africa Consortium, an initiative formally launched in 2012 with support from the African Academy of Sciences, the U.S. National Institutes of Health, and the UK's Wellcome Trust. Today, H3Africa supports over 50 projects across the continent, on an array of different research areas in genetics relevant to the health and heredity of Africans.
"Africa is the cradle of Humankind. So what that really means is that the populations that are currently living in Africa are among some of the oldest populations on the globe, and we know that the longer populations have had to go through evolutionary phases, the more variation there is in the genomes of people who live presently," says Zane Lombard, a principal investigator at H3Africa and Associate Professor of Human Genetics at the University of the Witwatersrand in Johannesburg, South Africa.
"So for that reason, African populations carry a huge amount of genetic variation and diversity, which is pretty much uncaptured. There's still a lot to learn as far as novel variation is concerned by looking at and studying African genomes."
A recent landmark H3Africa study, led by Lombard and published in Nature in October, sequenced the genomes of over 400 African individuals from 50 ethno-linguistic groups - many of which had never been sampled before.
Despite the relatively modest number of individuals sequenced in the study, over three million previously undescribed genetic variants were found, and complex patterns of ancestral migration were uncovered.
"In some of these ethno-linguistic groups they don't have a word for DNA, so we've had to really think about how to make sure that we communicate the purposes of different studies to participants so that you have true informed consent," says Lombard.
"The objective," she explained, "was to try and fill some of the gaps for many of these populations for which we didn't have any whole genome sequences or any genetic variation data...because if we're thinking about the future of precision medicine, if the patient is a member of a specific group where we don't know a lot about the genomic variation that exists in that group, it makes it really difficult to start thinking about clinical interpretation of their data."
From H3Africa's conception, the consortium's goal has not only been to better represent Africa's staggering genetic diversity in genomic data sets, but also to build Africa's domestic genomics capabilities and empower a new generation of African researchers. By doing so, the hope is that Africans will be able to set their own genomics agenda, and leapfrog to new and better ways of doing the work.
"The training that has happened on the continent and the number of new scientists, new students, and fellows that have come through the process and are now enabled to start their own research groups, to grow their own research in their countries, to be a spokesperson for genomics research in their countries, and to build that political will to do these larger types of sequencing initiatives - that is really a significant outcome from H3Africa as well. Over and above all the science that's coming out," Lombard says.
"What has been created through H3Africa is just this locus of researchers and scientists and bioethicists who have the same goal at heart - to work towards adjusting the data bias and making sure that all global populations are represented in genomics."
Jurassic Park Without the Scary Parts: How Stem Cells May Rescue the Near-Extinct Rhinoceros
I am a stem cell scientist. In my day job I work on developing ways to use stem cells to treat neurological disease – human disease. This is the story about how I became part of a group dedicated to rescuing the northern white rhinoceros from extinction.
The earth is now in an era that is called the "sixth mass extinction." The first extinction, 400 million years ago, put an end to 86 percent of the existing species, including most of the trilobites. When the earth grew hotter, dustier, or darker, it lost fish, amphibians, reptiles, plants, dinosaurs, mammals and birds. Each extinction event wiped out 80 to 90 percent of the life on the planet at the time. The first 5 mass extinctions were caused by natural disasters: volcanoes, fires, a meteor. But humans can take credit for the 6th.
Because of human activities that destroy habitats, creatures are now becoming extinct at a rate that is higher than any previously experienced. Some animals, like the giant panda and the California condor, have been pulled back from the brink of extinction by conserving their habitats, breeding in captivity, and educating the public about their plight.
But not the northern white rhino. This gentle giant is a vegetarian that can weigh up to 5,000 pounds. The rhino's weakness is its horn, which has become a valuable commodity because of the mistaken idea that it grants power and has medicinal value. Horns are not medicine; the horns are made of keratin, the same protein that is in fingernails. But as recently as 2017 more than 1,000 rhinos were slaughtered each year to harvest their horns.
All 6 rhino species are endangered. But the northern white has been devastated. Only two members of this species are alive now: Najin, age 32, and her daughter Fatu, 21, live in a protected park in Kenya. They are social animals and would prefer the company of other rhinos of their kind; but they can't know that they are the last two survivors of their entire species. No males exist anymore. The last male, Sudan, died in 2018 at age 45.
We are celebrating a huge milestone in the efforts to use stem cells to rescue the rhino.
I became involved in the rhino rescue project on a sunny day in February, 2008 at the San Diego Wild Animal Park in Escondido, about 30 miles north of my lab in La Jolla. My lab had relocated a couple of months earlier to Scripps Research Institute to start the Center for Regenerative Medicine for human stem cell research. To thank my staff for their hard work, I wanted to arrange a special treat. I contacted my friend Oliver Ryder, who is director of the Institute for Conservation Research at the zoo, to see if I could take them on a safari, a tour in a truck through the savanna habitat at the park.
This was the first of the "stem cell safaris" that the lab would enjoy over the next few years. On the safari we saw elands and cape buffalo, and fed giraffes and rhinos. And we talked about stem cells; in particular, we discussed a surprising technological breakthrough recently reported by the Japanese scientist Shinya Yamanaka that enabled conversion of ordinary skin cells into pluripotent stem cells.
Pluripotent stem cells can develop into virtually any cell type in the body. They exist when we are very young embryos; five days after we were just fertilized eggs, we became blastocysts, invisible tiny balls of a few hundred cells packed with the power to develop into an entire human being. Long before we are born, these cells of vast potential transform into highly specialized cells that generate our brains, our hearts, and everything else.
Human pluripotent stem cells from blastocysts can be cultured in the lab, and are called embryonic stem cells. But thanks to Dr. Yamanaka, anyone can have their skin cells reprogrammed into pluripotent stem cells, just like the ones we had when we were embryos. Dr. Yamanaka won the Nobel Prize for these cells, called "induced pluripotent stem cells" (iPSCs) several years later.
On our safari we realized that if we could make these reprogrammed stem cells from human skin cells, why couldn't we make them from animals' cells? How about endangered animals? Could such stem cells be made from animals whose skin cells had been being preserved since the 1970s in the San Diego Zoo's Frozen Zoo®? Our safari leader, Oliver Ryder, was the curator of the Frozen Zoo and knew what animal cells were stored in its giant liquid nitrogen tanks at −196°C (-320° F). The Frozen Zoo was established by Dr. Kurt Benirschke in 1975 in the hope that someday the collection would aid in rescue of animals that were on the brink of extinction. The frozen collection reached 10,000 cell lines this year.
We returned to the lab after the safari, and I asked my scientists if any of them would like to take on the challenge of making reprogrammed stem cells from endangered species. My new postdoctoral fellow, Inbar Friedrich Ben-Nun, raised her hand. Inbar had arrived only a few weeks earlier from Israel, and she was excited about doing something that had never been done before. Oliver picked the animals we would use. He chose his favorite animal, the critically endangered northern white rhinoceros, and the drill, which is an endangered primate related to the mandrill monkey,
When Inbar started work on reprogramming cells from the Frozen Zoo, there were 8 living northern rhinoceros around the world: Nola, Angalifu, Nesari, Nabire, Suni, Sudan, Najin, and Fatu. We chose to reprogram Fatu, the youngest of the remaining animals.
Through sheer determination and trial and error, Inbar got the reprogramming technique to work, and in 2011 we published the first report of iPSCs from endangered species in the scientific journal Nature Methods. The cover of the journal featured a drawing of an ark packed with animals that might someday be rescued through iPSC technology. By 2011, one of the 8 rhinos, Nesari, had died.
This kernel of hope for using iPSCs to rescue rhinos grew over the next 10 years. The zoo built the Rhino Rescue Center, and brought in 6 females of the closely related species, the southern white rhinoceros, from Africa. Southern white rhino populations are on the rise, and it appears that this species will survive, at least in captivity. The females are destined to be surrogate mothers for embryos made from northern white rhino cells, when eventually we hope to generate sperm and eggs from the reprogrammed stem cells, and fertilize the eggs in vitro, much the same as human IVF.
The author, Jeanne Loring, at the Rhino Rescue Center with one of the southern white rhino surrogates.
David Barker
As this project has progressed, we've been saddened by the loss of all but the last two remaining members of the species. Nola, the last northern white rhino in the U.S., who was at the San Diego Zoo, died in 2015.
But we are celebrating a huge milestone in the efforts to use stem cells to rescue the rhino. Just over a month ago, we reported that by reprogramming cells preserved in the Frozen Zoo, we produced iPSCs from stored cells of 9 northern white rhinos: Fatu, Najin, Nola, Suni, Nadi, Dinka, Nasima, Saut, and Angalifu. We also reprogrammed cells from two of the southern white females, Amani and Wallis.
We don't know when it will be possible to make a northern white rhino embryo; we have to figure out how to use methods already developed for laboratory mice to generate sperm and eggs from these cells. The male rhino Angalifu died in 2014, but ever since I saw beating heart cells derived from his very own cells in a culture dish, I've felt hope that he will one day have children who will seed a thriving new herd of northern white rhinos.