Genomic Data Has a Diversity Problem, But Global Efforts Are Underway to Fix It
Genomics has begun its golden age. Just 20 years ago, sequencing a single genome cost nearly $3 billion and took over a decade. Today, the same feat can be achieved for a few hundred dollars and the better part of a day. Suddenly, the prospect of sequencing not just individuals, but whole populations, has become feasible.
The genetic differences between humans may seem meager, only around 0.1 percent of the genome on average, but this variation can have profound effects on an individual's risk of disease, responsiveness to medication, and even the dosage level that would work best.
Already, initiatives like the U.K.'s 100,000 Genomes Project - now expanding to 1 million genomes - and other similarly massive sequencing projects in Iceland and the U.S., have begun collecting population-scale data in order to capture and study this variation.
The resulting data sets are immensely valuable to researchers and drug developers working to design new 'precision' medicines and diagnostics, and to gain insights that may benefit patients. Yet, because the majority of this data comes from developed countries with well-established scientific and medical infrastructure, the data collected so far is heavily biased towards Western populations with largely European ancestry.
This presents a startling and fast-emerging problem: groups that are under-represented in these datasets are likely to benefit less from the new wave of therapeutics, diagnostics, and insights, simply because they were tailored for the genetic profiles of people with European ancestry.
We may indeed be approaching a golden age of genomics-enabled precision medicine. But if the data bias persists then there is a risk, as with most golden ages throughout history, that the benefits will not be equally accessible to all, and existing inequalities will only be exacerbated.
To remedy the situation, a number of initiatives have sprung up to sequence genomes of under-represented groups, adding them to the datasets and ensuring that they too will benefit from the rapidly unfolding genomic revolution.
Global Gene Corp
The idea behind Global Gene Corp was born eight years ago at Harvard when Sumit Jamuar, co-founder and CEO, met up with his two other co-founders, both experienced geneticists, for a coffee.
"They were discussing the limitless applications of understanding your genetic code," said Jamuar, a business executive from New Delhi.
"And so, being a technology enthusiast type, I was excited and I turned to them and said hey, this is incredible! Could you sequence me and give me some insights? And they actually just turned around and said no, because it's not going to be useful for you - there's not enough reference for what a good Sumit looks like."
What started as a curiosity-driven conversation on the power of genomics ended with a commitment to tackle one of the field's biggest roadblocks - its lack of global representation.
Jamuar set out to begin with India, which has about 20 percent of the world's population, including over 4,000 different ethnicities, but contributes less than 2 percent of genomic data, he told Leaps.org.
Eight years later, Global Gene Corp's sequencing initiative is well underway, and is the largest in the history of the Indian subcontinent. The program is being carried out in collaboration with biotech giant Regeneron, with support from the Indian government, local communities, and the Indian healthcare ecosystem. In August 2020, Global Gene Corp's work was recognized through the $1 million 2020 Roddenberry award for organizations that advance the vision of 'Star Trek' creator Gene Roddenberry to better humanity.
Global Gene Corp also focuses on developing and implementing AI and machine learning tools to make sense of the deluge of genomic data. These tools are increasingly used by both industry and academia to guide future research by identifying particularly promising or clinically interesting genetic variants. But if the underlying data is skewed European, then the effectiveness of the computational analysis - along with the future advances and avenues of research that emerge from it - will be skewed towards Europeans too.
This problem has already begun to manifest itself in, for example, much higher levels of genetic misdiagnosis among non-Europeans tested for their risk of certain diseases, such as hypertrophic cardiomyopathy - an inherited disease of the heart muscle. Most of the genetic variants used in these tests were identified as being causal for the disease from studies of European genomes. However, many of these variants differ both in their distribution and clinical significance across populations, leading to many patients of non-European ancestry receiving false-positive test results - as their benign genetic variants were misclassified as pathogenic. Had even a small number of genomes from other ethnicities been included in the initial studies, these misdiagnoses could have been avoided.
"Unless we have a data set which is unbiased and representative, we're never going to achieve the success that we want," Jamuar says.
"When Siri was first launched, she could hardly recognize an accent which was not of a certain type, so if I was trying to speak to Siri, I would have to repeat myself multiple times and try to mimic an accent which wasn't my accent so that she could understand it.
"But over time the voice recognition technology improved tremendously because the training data was expanded to include people of very diverse backgrounds and their accents, so the algorithms were trained to be able to pick that up and it dramatically improved the technology. That's the way we have to think about it - without that good-quality diverse data, we will never be able to achieve the full potential of the computational tools."
While mapping India's rich genetic diversity has been the organization's primary focus so far, they plan, in time, to expand their work to other under-represented groups in Asia, the Middle East, Africa, and Latin America.
"As other like-minded people and partners join the mission, it just accelerates the achievement of what we have set out to do, which is to map out and organize the world's genomic diversity so that we can enable high-quality life and longevity benefits for everyone, everywhere," Jamuar says.
Empowering African Genomics
Africa is the birthplace of our species, and still retains a disproportionate share of total human genetic diversity today. Groups that left Africa and went on to populate the rest of the world, some 50,000 to 100,000 years ago, were likely small in number and only took a fraction of the total genetic diversity with them. This ancient bottleneck means that no other group in the world can match the level of genetic diversity seen in modern African populations.
Despite Africa's central importance in understanding the history and extent of human genetic diversity, the genomics of African populations remains wildly understudied. Addressing this disparity has become a central focus of the H3Africa Consortium, an initiative formally launched in 2012 with support from the African Academy of Sciences, the U.S. National Institutes of Health, and the UK's Wellcome Trust. Today, H3Africa supports over 50 projects across the continent, on an array of different research areas in genetics relevant to the health and heredity of Africans.
"Africa is the cradle of Humankind. So what that really means is that the populations that are currently living in Africa are among some of the oldest populations on the globe, and we know that the longer populations have had to go through evolutionary phases, the more variation there is in the genomes of people who live presently," says Zane Lombard, a principal investigator at H3Africa and Associate Professor of Human Genetics at the University of the Witwatersrand in Johannesburg, South Africa.
"So for that reason, African populations carry a huge amount of genetic variation and diversity, which is pretty much uncaptured. There's still a lot to learn as far as novel variation is concerned by looking at and studying African genomes."
A recent landmark H3Africa study, led by Lombard and published in Nature in October, sequenced the genomes of over 400 African individuals from 50 ethno-linguistic groups - many of which had never been sampled before.
Despite the relatively modest number of individuals sequenced in the study, over three million previously undescribed genetic variants were found, and complex patterns of ancestral migration were uncovered.
"In some of these ethno-linguistic groups they don't have a word for DNA, so we've had to really think about how to make sure that we communicate the purposes of different studies to participants so that you have true informed consent," says Lombard.
"The objective," she explained, "was to try and fill some of the gaps for many of these populations for which we didn't have any whole genome sequences or any genetic variation data...because if we're thinking about the future of precision medicine, if the patient is a member of a specific group where we don't know a lot about the genomic variation that exists in that group, it makes it really difficult to start thinking about clinical interpretation of their data."
From H3Africa's conception, the consortium's goal has not only been to better represent Africa's staggering genetic diversity in genomic data sets, but also to build Africa's domestic genomics capabilities and empower a new generation of African researchers. By doing so, the hope is that Africans will be able to set their own genomics agenda, and leapfrog to new and better ways of doing the work.
"The training that has happened on the continent and the number of new scientists, new students, and fellows that have come through the process and are now enabled to start their own research groups, to grow their own research in their countries, to be a spokesperson for genomics research in their countries, and to build that political will to do these larger types of sequencing initiatives - that is really a significant outcome from H3Africa as well. Over and above all the science that's coming out," Lombard says.
"What has been created through H3Africa is just this locus of researchers and scientists and bioethicists who have the same goal at heart - to work towards adjusting the data bias and making sure that all global populations are represented in genomics."
Breakthrough therapies are breaking patients' banks. Key changes could improve access, experts say.
CSL Behring’s new gene therapy for hemophilia, Hemgenix, costs $3.5 million for one treatment, but helps the body create substances that allow blood to clot. It appears to be a cure, eliminating the need for other treatments for many years at least.
Likewise, Novartis’s Kymriah mobilizes the body’s immune system to fight B-cell lymphoma, but at a cost of $475,000. For patients who respond, it seems to offer years of life without the cancer progressing.
These single-treatment therapies are at the forefront of a new, bold era of medicine. Unfortunately, they also come with new, bold prices that leave insurers and patients wondering whether they can afford treatment and, if they can, whether the high costs are worthwhile.
“Most pharmaceutical leaders are there to improve and save people’s lives,” says Jeremy Levin, chairman and CEO of Ovid Therapeutics, and immediate past chairman of the Biotechnology Innovation Organization. If the therapeutics they develop are too expensive for payers to authorize, patients aren’t helped.
“The right to receive care and the right of pharmaceuticals developers to profit should never be at odds,” Levin stresses. And yet, sometimes they are.
Leigh Turner, executive director of the bioethics program, University of California, Irvine, notes this same tension between drug developers that are “seeking to maximize profits by charging as much as the market will bear for cell and gene therapy products and other medical interventions, and payers trying to control costs while also attempting to provide access to medical products with promising safety and efficacy profiles.”
Why Payers Balk
Health insurers can become skittish around extremely high prices, yet these therapies are often accompanied by significant overall savings. For perspective, the estimated annual treatment cost for hemophilia exceeds $300,000. With Hemgenix, payers would break even after about 12 years.
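The break-even figure follows from simple division. Here is a minimal sketch, assuming a flat $300,000 annual cost (the article says only that the cost exceeds that amount) and ignoring discounting, inflation, and any follow-up care. The function name is illustrative.

```python
def break_even_years(one_time_price: float, annual_cost: float) -> float:
    """Years of avoided annual treatment cost needed to offset a one-time price."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return one_time_price / annual_cost


# Figures quoted in the article. The annual cost is treated as a flat
# $300,000, which is an assumption: the article says it "exceeds" that.
HEMGENIX_PRICE = 3_500_000
ANNUAL_HEMOPHILIA_COST = 300_000

years = break_even_years(HEMGENIX_PRICE, ANNUAL_HEMOPHILIA_COST)
print(f"Break-even after about {years:.1f} years")
```

Because the true annual cost is higher than $300,000, the real break-even point would arrive somewhat sooner than this rough estimate suggests.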
But, in 12 years, will the patient still have that insurer? Therein lies the rub. U.S. payers are used to a “pay-as-you-go” model, in which the lifetime costs of therapies typically are shared by multiple payers over many years, as patients change jobs. Single-treatment therapeutics eliminate that cost-sharing ability.
“There is a phenomenally complex, bureaucratic reimbursement system that has grown, layer upon layer, during several decades,” Levin says. As medicine has innovated, payment systems haven’t kept up.
Therefore, biopharma companies begin working with insurance companies and their pharmacy benefit managers (PBMs), which act on an insurer’s behalf to decide which drugs to cover and by how much, early in the drug approval process. Their goal is to make sophisticated new drugs available while still earning a return on their investment.
New Payment Models
Pay-for-performance is one increasingly popular strategy, Turner says. “These models typically link payments to evidence generation and clinically significant outcomes.”
A biotech company called bluebird bio, for example, offers value-based pricing for Zynteglo, a $2.8 million possible cure for the rare blood disorder known as beta thalassaemia. It generally eliminates patients’ need for blood transfusions. The company is so sure it works that it will refund 80 percent of the cost of the therapy if patients need blood transfusions related to that condition within five years of being treated with Zynteglo.
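What such a refund means for a payer can be sketched in a few lines. This is only an illustration of the arrangement as described here, not the company's actual contract terms. The function and the `needed_transfusions` flag are hypothetical names.

```python
def net_cost_to_payer(price: float, refund_fraction: float,
                      needed_transfusions: bool) -> float:
    """Net amount spent on the therapy once any outcomes-based refund is applied."""
    if not 0.0 <= refund_fraction <= 1.0:
        raise ValueError("refund_fraction must be between 0 and 1")
    refund = price * refund_fraction if needed_transfusions else 0.0
    return price - refund


ZYNTEGLO_PRICE = 2_800_000  # price quoted in the article
REFUND_FRACTION = 0.80      # refund share quoted in the article

cost_if_effective = net_cost_to_payer(
    ZYNTEGLO_PRICE, REFUND_FRACTION, needed_transfusions=False)
cost_if_ineffective = net_cost_to_payer(
    ZYNTEGLO_PRICE, REFUND_FRACTION, needed_transfusions=True)

print(f"Therapy works:             ${cost_if_effective:,.0f}")
print(f"Transfusions still needed: ${cost_if_ineffective:,.0f}")
```

Under these figures the payer still bears 20 percent of the price when the therapy fails, so the refund shifts most of the risk to the manufacturer but not all of it.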
In his February 2023 State of the Union speech, President Biden proposed three pilot programs to reduce drug costs. One of them, the Cell and Gene Therapy Access Model, calls on the federal Centers for Medicare & Medicaid Services to establish outcomes-based agreements with manufacturers for certain cell and gene therapies.
A mortgage-style payment system is another, albeit rare, approach. Amortized payments spread the cost of treatments over decades, and let people change employers without losing their healthcare benefits.
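To give a sense of scale, the standard level-payment (annuity) formula used for mortgages can be applied to a one-time therapy price. The 20-year term and 5 percent rate below are hypothetical, chosen only for illustration. The article does not describe the terms of any actual amortization deal.

```python
def annual_payment(principal: float, annual_rate: float, years: int) -> float:
    """Level annual payment that repays `principal` over `years` at `annual_rate`."""
    if years <= 0:
        raise ValueError("years must be positive")
    if annual_rate == 0:
        return principal / years
    # Standard annuity formula: P * r / (1 - (1 + r) ** -n)
    return principal * annual_rate / (1 - (1 + annual_rate) ** -years)


PRICE = 3_500_000     # Hemgenix price quoted earlier in the article
TERM_YEARS = 20       # hypothetical term
INTEREST_RATE = 0.05  # hypothetical annual interest rate

interest_free = annual_payment(PRICE, 0.0, TERM_YEARS)
with_interest = annual_payment(PRICE, INTEREST_RATE, TERM_YEARS)

print(f"Interest-free:  ${interest_free:,.0f} per year")
print(f"At 5% interest: ${with_interest:,.0f} per year")
```

Any positive interest rate raises the total paid above the sticker price, which is one reason amortization spreads the cost without lowering it.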
The new payment models that are being discussed aren’t solutions to high prices, says Bill Kramer, senior advisor for health policy at Purchaser Business Group on Health (PBGH), a nonprofit that seeks to lower health care costs. He points out that innovative pricing models, although well-intended, may distract from the real problem of high prices. They are attempts to “soften the blow. The best thing would be to charge a reasonable price to begin with,” he says.
Instead, he proposes making better use of research on cost and clinical effectiveness. The Institute for Clinical and Economic Review (ICER) conducts such research in the U.S., determining whether the benefits of specific drugs justify their proposed prices. ICER is an independent non-profit research institute. Its reports typically assess the degrees of improvement new therapies offer and suggest prices that would reflect that. “Publicizing that data is very important,” Kramer says. “Their results aren’t used to the extent they could and should be.” Pharmaceutical companies tend to price their therapies higher than ICER’s recommendations.
Drug Development Costs Soar
Drug developers have long pointed to the onerous costs of drug development as a reason for high prices.
A 2020 study found the average cost to bring a drug to market exceeded $1.1 billion, while other studies have estimated overall costs as high as $2.6 billion. The development timeframe is about 10 years. That’s because modern therapeutics target precise mechanisms to create better outcomes, but also have high failure rates. Only about 14 percent of all drugs that enter clinical trials are approved by the FDA. Pharma companies, therefore, have an exigent need to earn a profit.
Skewed Incentives Increase Costs
Pricing isn’t solely at the discretion of pharma companies, though. “What patients end up paying has much more to do with their PBMs than the actual price of the drug,” Patricia Goldsmith, CEO, CancerCare, says. Transparency is vital.
PBMs control patients’ access to therapies at three levels, through price negotiations, pricing tiers and pharmacy management.
When negotiating with drug manufacturers, Goldsmith says, “PBMs exchange a preferred spot on a formulary (the insurer’s or healthcare provider’s list of acceptable drugs) for cash-base rebates.” Unfortunately, 25 percent of the time, those rebates are not passed to insurers, according to the PBGH report.
Then, PBMs use pricing tiers to steer patients and physicians to certain drugs. For example, Kramer says, “Sometimes PBMs put a high-cost brand name drug in a preferred tier and a lower-cost competitor in a less preferred, higher-cost tier.” As the PBGH report elaborates, “(PBMs) are incentivized to include the highest-priced drugs…since both manufacturing rebates, as well as the administrative fees they charge…are calculated as a percentage of the drug’s price.”
Finally, by steering patients to certain pharmacies, PBMs coordinate patients’ access to treatments, control patients’ out-of-pocket costs and receive management fees from the pharmacies.
Therefore, Goldsmith says, “As long as formularies are based on profits to middlemen…Americans’ healthcare costs will continue to skyrocket.”
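The incentive the PBGH report describes is easy to see with made-up numbers. In the sketch below, every price and rate is hypothetical, and the function name is illustrative. It shows only that payments tied to a percentage of list price grow with that price. It does not model how much of the rebate a PBM keeps.

```python
def price_linked_payments(list_price: float, rebate_rate: float,
                          admin_fee_rate: float) -> float:
    """Rebate plus administrative fee, both computed as a share of list price."""
    return list_price * (rebate_rate + admin_fee_rate)


# All numbers below are hypothetical and chosen only for illustration.
REBATE_RATE = 0.20
ADMIN_FEE_RATE = 0.03
BRAND_PRICE = 10_000
COMPETITOR_PRICE = 4_000

brand_payments = price_linked_payments(BRAND_PRICE, REBATE_RATE, ADMIN_FEE_RATE)
competitor_payments = price_linked_payments(
    COMPETITOR_PRICE, REBATE_RATE, ADMIN_FEE_RATE)

print(f"High-priced brand drug: ${brand_payments:,.0f}")
print(f"Lower-cost competitor:  ${competitor_payments:,.0f}")
```

With identical percentage rates, the more expensive drug generates more rebate and fee dollars, which is the skew in incentives the report points to.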
Transparency into drug pricing will help curb costs, as will new payment strategies. What will make the most impact, however, may well be the development of a new reimbursement system designed to handle dramatic, breakthrough drugs. As Kramer says, “We need a better system to identify drugs that offer dramatic improvements in clinical care.”
Each afternoon, kids walk through my neighborhood, on their way back home from school, and almost all of them are walking alone, staring down at their phones. It's a troubling sight. This daily parade of the zombie children just can’t bode well for the future.
That’s one reason I felt like Gaia Bernstein’s new book was talking directly to me. A law professor at Seton Hall, Gaia makes a strong argument that people are so addicted to tech at this point, we need some big, system level changes to social media platforms and other addictive technologies, instead of just blaming the individual and expecting them to fix these issues.
Gaia’s book is called Unwired: Gaining Control Over Addictive Technologies. It’s fascinating and I had a chance to talk with her about it for today’s podcast. At its heart, our conversation is really about how and whether we can maintain control over our thoughts and actions, even when some powerful forces are pushing in the other direction.
We discuss the idea that, in certain situations, maybe it's not reasonable to expect that we’ll be able to enjoy personal freedom and autonomy. We also talk about how to be a good parent when it sometimes seems like our kids prefer to be raised by their iPads; so-called educational video games that actually don’t have anything to do with education; the root causes of tech addictions for people of all ages; and what kinds of changes we should be supporting.
Gaia is Seton Hall’s Technology, Privacy and Policy Professor of Law, as well as Co-Director of the Institute for Privacy Protection, and Co-Director of the Gibbons Institute of Law Science and Technology. She’s the founding director of the Institute for Privacy Protection. She created and spearheaded the Institute’s nationally recognized Outreach Program, which educated parents and students about technology overuse and privacy.
Professor Bernstein's scholarship has been published in leading law reviews including the law reviews of Vanderbilt, Boston College, Boston University, and U.C. Davis. Her work has been selected to the Stanford-Yale Junior Faculty Forum and received extensive media coverage. Gaia joined Seton Hall's faculty in 2004. Before that, she was a fellow at the Engelberg Center of Innovation Law & Policy and at the Information Law Institute of the New York University School of Law. She holds a J.S.D. from the New York University School of Law, an LL.M. from Harvard Law School, and a J.D. from Boston University.
Gaia’s work on this topic is groundbreaking. I hope you’ll listen to the conversation and then consider pre-ordering her new book. It comes out on March 28.