Big Data Probably Knows More About You Than Your Friends Do
Data is the new oil. It is highly valuable, and it is everywhere, even if you're not aware of it. For example, it's there when you use social media. Sharing pictures on Facebook lets its facial recognition software peg you and your friends. Thanks to that software, now anywhere you visit that has installed cameras, your face can be identified and your actions recorded.
It's there when you log into Twitter, posting one of the 230 million tweets per day, which up until last month were all archived by the Library of Congress and will be made public for research. These social media data can be used to predict your political affiliations, ethnicity, race, age, how close you are with your family and friends, your mental health, even when you are most likely to be grumpy or go to the gym. These data can also predict when you are apt to get sick and track how diseases are spreading.
In fact, tracking is no longer limited to what you decide to share or to public spaces. Lab experiments show Comcast and other cable companies may soon be able to record and monitor movements in your house. They may also be able to read your lips and identify your visitors simply by assessing how Wi-Fi waves bounce off bodies and other objects in houses. In one study, MIT researchers used routers and sensors to monitor breathing and heart rates with 99% accuracy. Routers could soon be used for seemingly good things, like monitoring infant breathing and detecting whether an older adult is about to take a big tumble. However, they may also enable unwanted and unparalleled levels of surveillance.
Big data is there every time you pick up your smartphone, which can track your daily steps, where you go via geolocation, what time you wake up and go to bed, your punctuality, and even your overall health, depending on which features you have enabled. Are you close with your mom? Are you a sedentary couch potato? Did you commit a murder? (iPhone data was recently used in a German murder trial.) Smartphone-generated data can be used to label you, and not just you but your future and past generations too.
Smartphones are not the only "things" gathering data on you. Anything with an on and off switch can be connected to the internet and generate data. The new rule seems to be: if it can be connected, it will be. Washing machines, coffee makers, medical appliances, cars, and even your luggage (yes, someone created a self-driving suitcase) can generate data, and often do. "Smart" refrigerators can monitor your food levels, automatically create shopping lists and order food for you, all while recording your alcohol consumption and whether you tend to be a healthy or junk food eater.
Even medicines can monitor behaviors. The first digital pill was approved by the FDA last November to track whether patients take their medicines. It has a sensor that sends signals to the patient's smartphone, and to others, when it encounters stomach acid. Some call it a snitch pill, medication with a tattletale, and big brother in your belly. Others see it as a major breakthrough to help patients remember to take their medications and to save payers millions of dollars.
Big data is there when you go shopping. Credit card and retail data can show whether you pay for a gym, whether you are pregnant or have children, and how creditworthy you are. Uber and Lyft transactional data reveal what time you usually go to and leave work and whom you regularly visit (Uber data has been used to catch cheating spouses).
Amazon now sells a bedroom camera to see your fashion choices and offer advice. It is marketing a more fashionable you, but it probably also wants the video feed showing your body measurements, which are "a newly prized currency," according to the Washington Post. They help retailers create more customized and better fitting clothes. Amazon also just partnered with Berkshire Hathaway and JPMorgan Chase, the largest bank in the United States by assets, to create an independent health-care company for their employees. The move raises privacy concerns because Amazon already owns so much data about us, from drones, devices, the AI of Alexa, and our viewing, eating, and other purchasing habits on Amazon Prime.
Big data is arguably a new phenomenon; almost all the world's data (90%) were produced within the last 2 years or so. It is a result of the fusion of physical, digital, and biological technologies that together constitute the fourth industrial revolution, according to the World Economic Forum. The last three revolutions were driven by the discoveries of steam power, electrical energy, and computers. This one is advancing much faster than those before it, and it carries both promises and perils for humanity.
Some people may want to opt out of all this tracking, reduce their digital footprint and stay "off the grid." However, it is worth noting that data generation and storage can be used for great things: things that make the world better, safer and fairer. For example, sharing electronic health records and social media data can help scientists better track and understand diseases, develop new cures and therapies, and understand the safety and efficacy profiles of medicines and vaccines.
While full of promise, big data is not without its pitfalls. Data are often not interoperable or easily integrated. You can use your credit card practically anywhere in the world, but you cannot easily port your electronic health record to the doctor or hospital across the street, for example.
Data quality can also be poor. It is dependent on the person entering it. My electronic health record at one point said I was male, and I was pregnant at the time. No doctors or nurses seemed to notice. The problem is worse on a global level. For example, causes of death can be coded differently by country and village. Take HIV patients: they often develop secondary infections, like TB. Do you record the cause of death as TB or HIV? There isn't global consistency, and political pressure from patient groups can exert itself on death records. Often, each group wants to say they have the most deaths so they can fundraise more money.
Data can be biased. More than 80 percent of genomic data comes from Caucasians. Only 14 percent is from Asians and 3.5 percent is from African and Hispanic populations. Thus, when scientists use genomic data to develop drugs or lab tests, they may create biased products that work for only some demographics. Take type 2 diabetes blood tests; some do not work well for African Americans. One study estimates that 650,000 African Americans may have undiagnosed diabetes, because a common blood test doesn't work for them. Using biased data in medicine can be a matter of life and death. Moreover, if genomic medicine benefits only "a privileged few," the practice raises concerns about unequal access.
We need to think carefully and be transparent about the values embedded in our data, data analytics (algorithms), and data applications. Numbers are never neutral. Algorithms are always embedded with subjective normative values, sometimes purposely, sometimes not. To address this problem, we need ethicists who can audit databanks and algorithms to identify embedded norms, values and biases and help ensure they are addressed or at least transparently disclosed. Additionally, we need to determine how to let people opt out of certain types of data collection and uses, not just at the beginning of a system but at any point in their lifetimes. There is a right to be forgotten, which hasn't been adequately operationalized in today's data sphere.
What do you think happens to all of these data collected about us? The short answer is that the public doesn't really know. A lot of it looks like what is in a medical record (e.g., height, weight, pregnancy status, age, mental health, pulse, blood pressure, and illness symptoms), yet it isn't protected by HIPAA the way your medical record information is.
And it is being consolidated into the hands of fewer and fewer big players. Large companies are selling data that originated from you and you are not sharing in the wealth.
A possible solution is to create an app, managed by a nonprofit or public benefit corporation, through which you could download and manage all the data collected about you. For example, you could download your credit card statements with all your purchasing habits, your Uber rides showing transit patterns, medical records, electric bills, and every other digital record you have and would like to download, all into one application. You would then have the power to license pieces of your data, or the whole collection, to users for a small fee for one year at a time. Uses and users could be monitored and audited using blockchain capabilities. After the year is up, you could withdraw access.
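To make the licensing idea concrete, here is a minimal sketch in Python of how such an app might track one-year licenses and log every access request. No such app is described in detail here, so every name in it (`DataLicense`, `DataWallet`, the category labels, the fee) is hypothetical, and the in-memory access log stands in for whatever audit mechanism a real system would use.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DataLicense:
    """A hypothetical one-year license on one category of a person's data."""
    licensee: str
    category: str      # illustrative label, e.g. "uber_rides"
    start: date
    fee_usd: float
    revoked: bool = False

    @property
    def expires(self) -> date:
        # Assumption: "one year" is modeled as 365 days from the start date.
        return self.start + timedelta(days=365)

    def is_active(self, today: date) -> bool:
        return not self.revoked and self.start <= today < self.expires


@dataclass
class DataWallet:
    """Holds the licenses one person has granted, plus an access log."""
    owner: str
    licenses: list = field(default_factory=list)
    access_log: list = field(default_factory=list)

    def grant(self, licensee: str, category: str, start: date, fee_usd: float) -> DataLicense:
        lic = DataLicense(licensee, category, start, fee_usd)
        self.licenses.append(lic)
        return lic

    def request_access(self, licensee: str, category: str, today: date) -> bool:
        # Access is allowed only under a matching license that is still active.
        allowed = any(
            lic.licensee == licensee
            and lic.category == category
            and lic.is_active(today)
            for lic in self.licenses
        )
        # Every request is logged, whether or not it was allowed.
        self.access_log.append((today, licensee, category, allowed))
        return allowed
```

In this sketch, a request made during the license year succeeds, one made after the 365 days have elapsed is refused, and setting `revoked = True` withdraws access early. Both outcomes land in `access_log`, which is the part an auditor would review.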
You could be your own data landlord. We could democratize big data and empower people to better control and manage the wealth of information collected about us. Why should only the big companies like Amazon and Apple profit off the new oil? Let's create an app so we can all manage our data wealth and maybe even become data barons—an app created by the people for the people.
Have You Heard of the Best Sport for Brain Health?
The Friday Five covers five stories in research that you may have missed this week. There are plenty of controversies and troubling ethical issues in science – and we get into many of them in our online magazine – but this news roundup focuses on scientific creativity and progress to give you a therapeutic dose of inspiration headed into the weekend.
Here are the promising studies covered in this week's Friday Five:
- Reprogram cells to a younger state
- Pick up this sport for brain health
- Do all mental illnesses have the same underlying cause?
- New test could diagnose autism in newborns
- Scientists 3D print an ear and attach it to woman
Can blockchain help solve the Henrietta Lacks problem?
Science has come a long way since Henrietta Lacks, a Black woman from Baltimore, succumbed to cervical cancer at age 31 in 1951 -- only eight months after her diagnosis. Since then, research involving her cancer cells has advanced scientific understanding of the human papilloma virus, polio vaccines, medications for HIV/AIDS and in vitro fertilization.
Today, the World Health Organization reports that those cells are essential in mounting a COVID-19 response. But they were commercialized without the awareness or permission of Lacks or her family, who have filed a lawsuit against a biotech company for profiting from these “HeLa” cells.
While obtaining an individual's informed consent has become standard procedure before the use of tissues in medical research, many patients still don’t know what happens to their samples. Now, a new phone-based app is aiming to change that.
Tissue donors can track what scientists do with their samples while safeguarding privacy, through a pilot program initiated in October by researchers at the Johns Hopkins Berman Institute of Bioethics and the University of Pittsburgh’s Institute for Precision Medicine. The program uses blockchain technology to offer patients this opportunity through the University of Pittsburgh's Breast Disease Research Repository, while assuring that their identities remain anonymous to investigators.
A blockchain is a digital, tamper-proof ledger of transactions duplicated and distributed across a computer system network. Whenever a transaction occurs with a patient’s sample, multiple stakeholders can track it while the owner’s identity remains encrypted. Special certificates called “nonfungible tokens,” or NFTs, represent patients’ unique samples on a trusted and widely used blockchain that reinforces transparency.
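For readers curious how a ledger can resist tampering, the core idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the pilot program's actual system: it runs on one machine instead of a distributed network, the `SampleLedger` class and the token IDs are made up for illustration, and it only detects tampering after the fact. Each entry stores a hash of the entry before it, so altering any past record breaks the chain.

```python
import hashlib
import json


def _hash_entry(body: dict) -> str:
    """Return a SHA-256 hex digest of an entry's fields, in a stable key order."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class SampleLedger:
    """A toy append-only ledger of events for pseudonymous sample tokens."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def record(self, token_id: str, event: str) -> dict:
        # Link the new entry to the hash of the one before it.
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"token_id": token_id, "event": event, "prev_hash": prev_hash}
        entry = dict(body, hash=_hash_entry(body))
        self.entries.append(entry)
        return entry

    def is_valid(self) -> bool:
        # Recompute every hash and check each link back to the previous entry.
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = {k: entry[k] for k in ("token_id", "event", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _hash_entry(body):
                return False
            prev_hash = entry["hash"]
        return True

    def history(self, token_id: str) -> list:
        # What a donor would see: the events recorded for their token only.
        return [e["event"] for e in self.entries if e["token_id"] == token_id]
```

A donor holding the hypothetical ID `token-001` could call `history("token-001")` to see what has happened to the sample, while the ledger itself never stores a name. If anyone edits an earlier entry, `is_valid()` returns `False`.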
“Healthcare is very data rich, but control of that data often does not lie with the patient,” said Julius Bogdan, vice president of analytics for North America at the Healthcare Information and Management Systems Society (HIMSS), a Chicago-based global technology nonprofit. “NFTs allow for the encapsulation of a patient’s data in a digital asset controlled by the patient.” He added that this technology enables a more secure and informed method of participating in clinical and research trials.
Without this technology, de-identification of patients’ samples during biomedical research had the unintended consequence of preventing them from discovering what researchers find -- even if that data could benefit their health. A solution was urgently needed, said Marielle Gross, assistant professor of obstetrics, gynecology and reproductive science and bioethics at the University of Pittsburgh School of Medicine.
“A researcher can learn something from your bio samples or medical records that could be life-saving information for you, and they have no way to let you or your doctor know,” said Gross, who is also an affiliate assistant professor at the Berman Institute. “There’s no good reason for that to stay the way that it is.”
For instance, blockchain could be used to notify people if cancer researchers discover that they have certain risk factors. Gross estimated that less than half of breast cancer patients are tested for mutations in BRCA1 and BRCA2 — tumor suppressor genes that are important in combating cancer. With normal function, these genes help prevent breast, ovarian and other cells from proliferating in an uncontrolled manner. If researchers find mutations, it’s relevant for a patient’s and family’s follow-up care — and that’s a prime example of how this newly designed app could play a life-saving role, she said.
Liz Burton was one of the first patients at the University of Pittsburgh to opt for the app -- called de-bi, which is short for decentralized biobank -- before undergoing a mastectomy for early-stage breast cancer in November, after it was diagnosed on a routine mammogram. She often takes part in medical research and looks forward to tracking her tissues.
“Anytime there’s a scientific experiment or study, I’m quick to participate -- to advance my own wellness as well as knowledge in general,” said Burton, 49, a life insurance service representative who lives in Carnegie, Pa. “It’s my way of contributing.”
The pilot program raises the issue of what investigators may owe study participants, especially since certain populations, such as Black and indigenous peoples, historically were not treated in an ethical manner for scientific purposes. “It’s a truly laudable effort,” Tamar Schiff, a postdoctoral fellow in medical ethics at New York University’s Grossman School of Medicine, said of the endeavor. “Research participants are beautifully altruistic.”
Lauren Sankary, a bioethicist and associate director of the neuroethics program at Cleveland Clinic, agrees that the pilot program provides increased transparency for study participants regarding how scientists use their tissues while acknowledging individuals’ contributions to research.
However, she added, “it may require researchers to develop a process for ongoing communication to be responsive to additional input from research participants.”
Peter H. Schwartz, professor of medicine and director of Indiana University’s Center for Bioethics in Indianapolis, said the program is promising, but he wonders what will happen if a patient has concerns about a particular research project involving their tissues.
“I can imagine a situation where a patient objects to their sample being used for some disease they’ve never heard about, or which carries some kind of stigma like a mental illness,” Schwartz said, noting that researchers would have to evaluate how to react. “There’s no simple answer to those questions, but the technology has to be assessed with an eye to the problems it could raise.”
As a result, researchers may need to factor in how much information to share with patients and how to explain it, Schiff said. There are also concerns that in tracking their samples, patients could tell others what they learned before researchers are ready to publicly release this information. However, Bogdan, the vice president of the HIMSS nonprofit, believes only a minimal study identifier would be stored in an NFT, not patient data, research results or any type of proprietary trial information.
Some patients may be confused by blockchain and reluctant to embrace it. “The complexity of NFTs may prevent the average citizen from capitalizing on their potential or vendors willing to participate in the blockchain network,” Bogdan said. “Blockchain technology is also quite costly in terms of computational power and energy consumption, contributing to greenhouse gas emissions and climate change.”
In addition, this nascent, groundbreaking technology is immature and vulnerable to data security flaws, disputes over intellectual property rights and privacy issues, though it does offer baseline protections to maintain confidentiality. To truly make a difference, blockchain must enable broad consent from patients, not just de-identification, said Robyn Shapiro, a bioethicist and founding attorney at Health Sciences Law Group near Milwaukee.
The Henrietta Lacks story is a prime example, Shapiro noted. During her treatment for cervical cancer at Johns Hopkins, Lacks’s tissue was de-identified (albeit not entirely, because her cell line, HeLa, bore her initials). After her death, those cells were replicated and distributed for important and lucrative research and product development purposes without her knowledge or consent.
Nonetheless, Shapiro thinks that the initiative by the University of Pittsburgh and Johns Hopkins has the potential to solve some ethical challenges involved in research use of biospecimens. “Compared to the system that allowed Lacks’s cells to be used without her permission,” Shapiro said, “blockchain technology using nonfungible tokens that allow patients to follow their samples may enhance transparency, accountability and respect for persons who contribute their tissue and clinical data for research.”