Is It Possible to Predict Your Face, Voice, and Skin Color from Your DNA?
Renowned genetics pioneer Dr. J Craig Venter is no stranger to controversy.
Back in 2000, he famously raced the public Human Genome Project to decode all three billion letters of the human genome for the first time. A decade later, he ignited a new debate when his team created a bacterial cell with a synthesized genome.
Most recently, he's jumped back into the fray with a study in the September issue of the Proceedings of the National Academy of Sciences about the predictive potential of genomic data to identify individual traits such as voice, facial structure and skin color.
The new study raises significant questions about the privacy of genetic data.
His study applied whole-genome sequencing and statistical modeling to predict traits in 1,061 people of diverse ancestry. His approach aimed to reconstruct a person's physical characteristics based on DNA, and 74 percent of the time, his algorithm could correctly identify the individual in a random lineup of 10 people from his company's database.
While critics have been quick to cast doubt on the plausibility of his claims, the ability to discern people's observable traits, or phenotypes, from their genomes may grow more precise as technology improves, raising significant questions about the privacy and usage of genetic information in the long term.
J. Craig Venter showing slides from his recent study on facial prediction at the Summit Conference in Los Angeles on Nov. 3, 2017.
(Courtesy of Kira Peikoff)
Critics: Study Was Incomplete, Problematic
Before even addressing these potential legal and ethical considerations, some scientists simply said the study's main result was invalid. They pointed out that the methodology worked much better in distinguishing between people of different ethnicities than those of the same ethnicity. One of the most outspoken critics, Yaniv Erlich, a geneticist at Columbia University, said, "The method doesn't work. The results were like, 'If you have a lineup of ten people, you can predict eight.'"
Erlich, who reviewed Venter's paper for Science, where it was rejected, said that he came up with the same results—correctly predicting eight of ten people—by just looking at demographic factors such as age, gender and ethnicity. He added that Venter's recent rebuttal to his criticism was that 'Once we have thousands of phenotypes, it might work better.' But that, Erlich argued, would be "a major breach of privacy. Nobody has thousands of phenotypes for people."
Other critics suggested that the study's results discourage the sharing of genetic data, which is becoming increasingly important for medical research. They go one step further and imply that people's possible hesitation to share their genetic information in public databases may actually play into Venter's hands.
Venter's own company, Human Longevity Inc., aims to build the world's most comprehensive private database on human genotypes and phenotypes. The vastness of this information stands to improve the accuracy of whole genome and microbiome sequencing for individuals, analyses that come with a hefty price tag. Today, Human Longevity Inc. will sequence your genome and perform a battery of other health-related tests at an entry cost of $4,900, going up to $25,000. Venter initially agreed to comment for this article, but then could not be reached.
Opens Up Pandora's Box of Ethical Issues
Whether Venter's study is valid may not be as important as the Pandora's box of potential ethical and legal issues that it raises for future consideration. "I think this story is one along a continuum of stories we've had on the issue of identifiability based on genomic information in the past decade," said Amy McGuire, a biomedical ethics professor at Baylor College of Medicine. "It does raise really interesting and important questions about privacy, and socially, how we respond to these types of scientific advancements. A lot of our focus from a policy and ethics perspective is to protect privacy."
McGuire, who is also the Director of the Center for Medical Ethics and Health Policy at Baylor, added that while protecting privacy is very important, "the bigger issue is how do we understand and use genetic information and avoid harming people." While we've taken "baby steps," she said, towards enacting laws in the U.S. that fight genetic determinism (such as the Genetic Information Nondiscrimination Act, which prohibits discrimination based on genetic information in health insurance and employment), some areas remain unprotected, such as life and disability insurance.
Physical reconstructions like those in Venter's study could also be inappropriately used by law enforcement, said Leslie Francis, a law and philosophy professor at the University of Utah, who has written about the ethical and legal issues related to sharing genomic data.
"If [Venter's] findings, or findings like them, hold up, the implications would be significant," Francis said. Law enforcement is increasingly using DNA identification from genetic material left at crime scenes to separate innocent suspects from guilty ones, she explained. Predicting appearance from DNA adds another potentially complicating layer.
"There is a shift here, from using DNA sequencing techniques to match other DNA samples—as when semen obtained from a rape victim is then matched (or not) with a cheek swab from a suspect—to using DNA sequencing results to predict observable characteristics," Francis said. She added that while the former necessitates having an actual DNA sample for a match, the latter can use DNA to pre-emptively (and perhaps inaccurately) narrow down suspects.
"My worry is that if this [the study's methodology] turns out to be sort-of accurate, people will think it is better than what it is," said Francis. "If law enforcement comes to rely on it, there will be a host of false positives and false negatives. And we'll face new questions, [such as] 'Which is worse? Picking an innocent as guilty, or failing to identify someone who is guilty?'"
Risking Privacy Involves a Tradeoff
When people voluntarily risk their own privacy, that involves a tradeoff, McGuire said. A 2014 study that she conducted among people who were very sick, or whose children were very sick, found that more than half were willing to share their health information, despite concerns about privacy, because they saw a big benefit in advancing research on their conditions.
"To make leaps and bounds in medicine and genomics, we need to create a database of millions of people signing on to share their genetic and health information in order to improve research and clinical care," McGuire said. "They are going to risk their privacy, and we have a social obligation to protect them."
That also means "punishing bad actors," she continued. "We've focused a lot of our policy attention on restricting access, but we don't have a system of accountability when there's a breach."
Even though most people using genetic information have good intentions, the consequences of misuse are troubling. "All you need is one bad actor who decimates the trust in the system, and it has catastrophic consequences," she warned. That hasn't happened on a massive scale yet, and even if it did, some experts argue that obtaining the data is not the real risk; what is more concerning is individuals' genetic information being hacked and used against them, such as to prove someone is unfit for a particular job because of a genetic condition like Alzheimer's, or that a parent is unfit for custody because of a genetic predisposition to mental illness.
Venter, in fact, told an audience at the recent Summit conference in Los Angeles that his new study's approach could not only predict someone's physical appearance from their DNA, but also some of their psychological traits, such as the propensity for an addictive personality. In the future, he said, it will be possible to predict even more about mental health from the genome.
What is most at risk on a massive scale, however, is not so much genetic information as demographic identifiers included in medical records, such as birth dates and social security numbers, said Francis, the law and philosophy professor. "The much more interesting and lucrative security breaches typically involve not people interested in genetic information per se, but people interested in the information in health records that you can't change."
Hospitals have been hacked for this kind of information, including an incident at the Veterans Administration in 2006, in which the laptop and external hard drive of an agency employee that contained unencrypted information on 26.5 million patients were stolen from the employee's house.
So, what can people do to protect themselves? "Don't share anything you wouldn't want the world to see," Francis said. "And don't click 'I agree' without actually reading privacy policies or terms and conditions. They may surprise you."
DNA- and RNA-based electronic implants may revolutionize healthcare
Implantable electronic devices can significantly improve patients’ quality of life. A pacemaker can encourage the heart to beat more regularly. A neural implant, usually placed at the back of the skull, can help brain function and encourage higher neural activity. Current research on neural implants finds them helpful to patients with Parkinson’s disease, vision loss, hearing loss, and other nerve damage problems. Several of these implants, such as Elon Musk’s Neuralink, have already been approved by the FDA for human use.
Yet pacemakers, neural implants, and other such electronic devices are not without problems. They require a constant supply of electricity, which is limited by batteries that need replacing. They also cause scarring. “The problem with doing this with electronics is that scar tissue forms,” explains Kate Adamala, an assistant professor of cell biology at the University of Minnesota Twin Cities. “Anytime you have something hard interacting with something soft [like muscle, skin, or tissue], the soft thing will scar. That's why there are no long-term neural implants right now.” To overcome these challenges, scientists are turning to biocomputing processes that use organic materials like DNA and RNA. Other promised benefits include “diagnostics and possibly therapeutic action, operating as nanorobots in living organisms,” writes Evgeny Katz, a professor of bioelectronics at Clarkson University, in his book DNA- And RNA-Based Computing Systems.
Adamala’s research focuses on developing such biocomputing systems using DNA, RNA, proteins, and lipids. Because these molecules are biocompatible with the human body, systems built from them can take part in a natural healing process. In a recent Nature Communications study, Adamala and her team created a new biocomputing platform called TRUMPET (Transcriptional RNA Universal Multi-Purpose GatE PlaTform) which acts like a DNA-powered computer chip. “These biological systems can heal if you design them correctly,” adds Adamala. “So you can imagine a computer that will eventually heal itself.”
The basics of biocomputing
Biocomputing and regular computing have many similarities. Like regular computing, biocomputing works by running information through a series of gates, usually logic gates. A logic gate works as a fork in the road for an electronic circuit: depending on its inputs, the signal goes one way or the other, giving one of two outputs. One example is the AND gate, which has two inputs (A and B) and two possible outputs. If both A and B are 1, the output is 1. If only one of them is 1, the output is 0. If both A and B are 0, the output is also 0. While a computer gives these inputs in binary code or "bits," such as a 0 or 1, biocomputing uses DNA strands as inputs, whether double or single-stranded, and often uses fluorescent RNA as an output.
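The AND gate described above can be written out as a small truth table. This is only an illustrative sketch of the electronic version of the gate; the function names are made up for this example and do not come from the study.

```python
from itertools import product


def and_gate(a: int, b: int) -> int:
    """Return 1 only when both binary inputs are 1, otherwise 0."""
    return a & b


def truth_table() -> list[tuple[int, int, int]]:
    """List every (A, B, output) combination for the AND gate."""
    return [(a, b, and_gate(a, b)) for a, b in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for a, b, out in truth_table():
        print(f"A={a} B={b} -> output {out}")
```

Only the last row, where both inputs are 1, produces a 1.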
If the DNA is double-stranded, the system “digests” the DNA or destroys it, which results in non-fluorescence or “0” output. Conversely, if the DNA is single-stranded, it won’t be digested and instead will be copied by several enzymes in the biocomputing system, resulting in fluorescent RNA or a “1” output. The output of this type of binary system is not limited to fluorescence. For example, a “1” output might be the production of the hormone insulin, while a “0” may be that no insulin is produced. “This kind of synergy between biology and computation is the essence of biocomputing,” says Stephanie Forrest, a professor and the director of the Biodesign Center for Biocomputing, Security and Society at Arizona State University.
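The strand-based rule above maps onto a very small piece of code. The sketch below is a software analogy only, assuming the simplified rule as the article states it (double-stranded input is digested and gives 0, single-stranded input is copied into RNA and gives 1). The names Strand, strand_gate, and readout are hypothetical and chosen for illustration; the real system is chemistry, not software.

```python
from enum import Enum


class Strand(Enum):
    """Form in which the DNA input reaches the gate (hypothetical model)."""
    SINGLE = "single-stranded"
    DOUBLE = "double-stranded"


def strand_gate(dna: Strand) -> int:
    """Double-stranded DNA is digested (0); single-stranded DNA is copied into RNA (1)."""
    return 1 if dna is Strand.SINGLE else 0


def readout(bit: int, on: str = "fluorescent RNA", off: str = "no fluorescence") -> str:
    """Translate the binary output into whatever the system is built to produce."""
    return on if bit == 1 else off


if __name__ == "__main__":
    for dna in Strand:
        print(dna.value, "->", readout(strand_gate(dna)))
    # The same bit could drive a different output, such as insulin production.
    print(readout(strand_gate(Strand.SINGLE), on="insulin produced", off="no insulin"))
```

Keeping the gate and the readout separate mirrors the point in the text: the 0 or 1 stays the same, while what a 1 triggers can be swapped.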
Biocomputing circuits are made of DNA, RNA, proteins and even bacteria.
Evgeny Katz
The TRUMPET’s promise
Depending on whether the biocomputing system is placed directly inside a cell within the human body, or run in a test tube, different environmental factors play a role. When an output is produced inside a cell, the cell's natural processes can amplify this output (for example, a specific protein or DNA strand), creating a strong signal. However, these cells can also be very leaky. “You want the cells to do the thing you ask them to do before they finish whatever their business is, which is to grow, replicate, metabolize,” Adamala explains. “However, often the gate may be triggered without the right inputs, creating a false positive signal. So that's why natural logic gates are often leaky.” While biocomputing outside a cell in a test tube can allow for tighter control over the logic gates, the outputs or signals cannot be amplified by a cell and are less potent.
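To make the idea of a "leaky" gate concrete, here is a toy simulation. It assumes, purely for illustration, that a gate without the right inputs still fires with some fixed probability. The leak_prob value and both function names are hypothetical and are not measurements or terms from the study.

```python
import random


def leaky_and_gate(a: int, b: int, leak_prob: float, rng: random.Random) -> int:
    """AND gate that occasionally fires without the right inputs (a false positive)."""
    if a == 1 and b == 1:
        return 1
    # Wrong inputs: the gate should stay off, but leaks with probability leak_prob.
    return 1 if rng.random() < leak_prob else 0


def false_positive_rate(leak_prob: float, trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of trials in which the gate fires even though only one input is present."""
    rng = random.Random(seed)
    fired = sum(leaky_and_gate(1, 0, leak_prob, rng) for _ in range(trials))
    return fired / trials


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.2):
        print(f"assumed leak probability {p:.2f} -> observed rate {false_positive_rate(p):.3f}")
```

A tightly controlled test-tube gate corresponds to a leak probability near zero, while a gate inside a busy cell would sit higher.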
TRUMPET, which is smaller than a cell, taps into both cellular and non-cellular biocomputing benefits. “At its core, it is a nonliving logic gate system,” Adamala states. “It's a DNA-based logic gate system. But because we use enzymes, and the readout is enzymatic [where an enzyme replicates the fluorescent RNA], we end up with signal amplification.” This readout means that the output from the TRUMPET system, a fluorescent RNA strand, can be replicated by nearby enzymes in the platform, making the light signal stronger. “So it combines the best of both worlds,” Adamala adds.
The TRUMPET biocomputing process is relatively straightforward. “If the DNA [input] shows up as single-stranded, it will not be digested [by the logic gate], and you get this nice fluorescent output as the RNA is made from the single-stranded DNA, and that's a 1,” Adamala explains. “And if the DNA input is double-stranded, it gets digested by the enzymes in the logic gate, and there is no RNA created from the DNA, so there is no fluorescence, and the output is 0.” In the story's lead image, a tube that is “lit” with a purple color represents a binary 1 signal for computing. A tube that is “off” represents a 0.
While still in the research stage, TRUMPET and other biocomputing systems promise significant benefits for personalized healthcare and medicine. These organic-based systems could detect cancer cells or low insulin levels inside a patient’s body. The study’s lead author, graduate student Judee Sharon, is already beginning to research TRUMPET's potential for earlier cancer diagnosis. Because the inputs for TRUMPET are single- or double-stranded DNA, any mutated or cancerous DNA could theoretically be detected by the platform through the biocomputing process, allowing cancer and other diseases to be caught earlier.
Adamala sees TRUMPET not only as a detection system but also as a potential cancer drug delivery system. “Ideally, you would like the drug only to turn on when it senses the presence of a cancer cell. And that's how we use the logic gates, which work in response to inputs like cancerous DNA. Then the output can be the production of a small molecule or the release of a small molecule that can then go and kill what needs killing, in this case, a cancer cell. So we would like to develop applications that use this technology to control the logic gate response of a drug’s delivery to a cell.”
Although platforms like TRUMPET are making progress, a lot more work must be done before they can be used commercially. “The process of translating mechanisms and architecture from biology to computing and vice versa is still an art rather than a science,” says Forrest. “It requires deep computer science and biology knowledge,” she adds. “Some people have compared interdisciplinary science to fusion restaurants—not all combinations are successful, but when they are, the results are remarkable.”
In today’s podcast episode, Leaps.org Deputy Editor Lina Zeldovich speaks about the health and ecological benefits of farming crickets for human consumption with Bicky Nguyen, who joins Lina from Vietnam. Bicky and her business partner Nam Dang operate an insect farm named CricketOne. Motivated by the idea of sustainable and healthy protein production, they started their unconventional endeavor a few years ago, despite numerous naysayers who didn’t believe that humans would ever consider munching on bugs.
Yet, making creepy crawlers part of our diet offers many health and planetary advantages. Food production needs to match the rise in global population, estimated to reach 10 billion by 2050. One challenge is that some of our current practices are inefficient, polluting and wasteful. According to nonprofit EarthSave.org, it takes 2,500 gallons of water, 12 pounds of grain, 35 pounds of topsoil and the energy equivalent of one gallon of gasoline to produce one pound of feedlot beef, although exact statistics vary between sources.
Meanwhile, insects are easy to grow, high in protein and low in fat. When roasted with salt, they make crunchy snacks. When chopped up, they transform into delicious pâtés, says Bicky, who invents her own cricket recipes and serves them at industry and public events. Maybe that’s why some research predicts that the edible insect market may grow to almost $10 billion by 2030. Tune in for a delectable chat on this alternative and sustainable protein.
Further reading:
More info on Bicky Nguyen
https://yseali.fulbright.edu.vn/en/faculty/bicky-n...
The environmental footprint of beef production
https://www.earthsave.org/environment.htm
https://www.watercalculator.org/news/articles/beef-king-big-water-footprints/
https://www.frontiersin.org/articles/10.3389/fsufs.2019.00005/full
https://ourworldindata.org/carbon-footprint-food-methane
Insect farming as a source of sustainable protein
https://www.insectgourmet.com/insect-farming-growing-bugs-for-protein/
https://www.sciencedirect.com/topics/agricultural-and-biological-sciences/insect-farming
Cricket flour is taking the world by storm
https://www.cricketflours.com/
https://talk-commerce.com/blog/what-brands-use-cricket-flour-and-why/
Lina Zeldovich has written about science, medicine and technology for Popular Science, Smithsonian, National Geographic, Scientific American, Reader’s Digest, the New York Times and other major national and international publications. A Columbia J-School alumna, she has won several awards for her stories, including the ASJA Crisis Coverage Award for Covid reporting, and has been a contributing editor at Nautilus Magazine. In 2021, Zeldovich released her first book, The Other Dark Matter, published by the University of Chicago Press, about the science and business of turning waste into wealth and health. You can find her on http://linazeldovich.com/ and @linazeldovich.