NIST report identifies significant privacy gaps in genomic data handling
A new National Institute of Standards and Technology (NIST) report on the cybersecurity of genomic data found major privacy gaps in how the data is generated, stored and shared.
The paper argues that a NIST privacy framework focused on the unique sensitivity of genomic data should be established to help organizations that aggregate the data identify regulatory gaps in privacy protection and build safer systems.
NIST found significant “gaps” in the genomic data generation system, including weaknesses in safe sharing of the data; inadequate monitoring; processing vulnerabilities; a lack of guidance for organizations handling sensitive genomic data; and scant regulation addressing national security and privacy threats in how the data is collected, retained and aggregated.
The report’s authors recommend using a federated type of encryption to solve the problem, arguing it would “virtually eliminate the risk of confidentiality or integrity loss of sharing genomic data between organizations and solve the confinement problem.”
Such a system would aggregate encrypted data across multiple datasets and prevent raw data from being exfiltrated by ensuring that even authorized users can obtain only results, without ever accessing the raw data in “plain text.”
The authors concede that current technology would not support such a system across the board (the technique is currently used in oncology and precision medicine research), but they recommend that the U.S. government undertake a “demonstration project” to gauge whether it could be used more widely.
The paper comes on the heels of an October hack of the genetic testing company 23andMe, which affected 6.9 million people, including more than one million users of Ashkenazi Jewish descent. The hacker reportedly offered the stolen data for sale for as little as $1 per individual genetic profile.
A major difficulty in addressing the privacy threats embedded in the genomic data system stems from the need to share the data within a broad research community. Yet the consequences of breaches are significant, the report says.
Cyberattacks aimed at exfiltrating genomic data can harm individuals by “enabling intimidation for financial gain, discrimination based on disease risk, and privacy loss from revealing hidden consanguinity or phenotypes including health, emotional stability, mental capacity, appearance, and physical abilities,” the report states.
At the same time, the sharing of genomic data is vital to the U.S. research community, government, and private industry as they seek to develop drugs and generally maintain America’s biotechnological “competitive advantage,” the report says.
The scale of the genomic data sharing needed to support research is massive, the report says, pointing to the fact that in 2021 the National Institutes of Health fielded nearly 40,000 requests for data access to its three million genotype “microarray datasets” and 500,000-plus whole genome sequences.
The report notes that breaches targeting genomic data threaten not only individuals but also their entire families.
Suzanne Smalley is a reporter covering privacy, disinformation and cybersecurity policy for The Record. She was previously a cybersecurity reporter at CyberScoop and Reuters. Earlier in her career Suzanne covered the Boston Police Department for the Boston Globe and two presidential campaign cycles for Newsweek. She lives in Washington with her husband and three children.