2018-2025
As EDNA evolved we interviewed dozens and dozens of researchers in an attempt to understand the actual VALUE of high-quality human DNA (since no visible transparent market exists, but we knew sales were occurring).
We also wanted to know what actions we could take to add-value the researches would want to increase the dollar amount a user of our platform could command for their sample. Our research eventually produced a figure of
That’s per sample, as collected over about a three to five year time-frame on the platform. Recently we used Chat-GPT to confirm our findings… It turns out
We Were Wrong.
The surprising findings of the Chat-GPT AI are presented here as we first saw them:
January 2025
EDNA
We are aware that there is currently no open public marketplace where individuals can sell access to their genetic data to researchers, pharmaceutical companies and academic institutions that are in fact currently purchasing this data through other avenues. If however, such a marketplace were to be created, using exclusively Whole Human Genome Sequences from Oxford Nanopore 10x long-reads, polished by 30x Illumina NovaSeq X short-reads and aligned to the T2T-CHM13 v2 reference genome, and should that data be searchable by cytogenic or molecular loci and filterable by any sort of age, gender, geographic source, ethnicity, ancestry, diversity, consanguinity, socioeconomic status , reproductive history, environmental exposure, lifestyle, phenotype, pharmacogenomic condition, health status, diagnosis or other criteria important to research or R&D. Additionally the data would have been processed though an AI model that used the data available in 16 global databases such as ClinVar to check for known variants and all data would be accompanied by a novel variant detection report as well as the ability to follow up with the genetic materials owner (source) and if needed access to the original .FASTQ files. The data would have a minimum standard of Q-70 and any informed consent records, proof of self-identified conditions and so on would be managed by the marketplace (platform). Were such a platform to exist, and were it to explicitly have resources to gather genetic material from impoverished communities who historically do not participate in DNA testing, what would be a reasonable expectation for the average true market value of a given DNA dataset and accompanying documents should that dataset be offered for purchase of limited time access over a three to five year time-frame?
GPT
The true market value of such a high-quality, well-annotated genetic dataset would depend on multiple factors, including demand from researchers, the rarity of the genetic profile, legal and ethical considerations, and exclusivity terms. However, based on current trends in the genomics market, we can make some estimates.
Existing Prices for Genetic Data
Enhanced Value Due to Data Quality & Annotations
Your platform offers Oxford Nanopore 10x long-reads, 30x Illumina NovaSeq X short-reads, T2T-CHM13 alignment, AI-driven variant annotation, and searchable metadata. These add substantial value, particularly for rare conditions and underrepresented populations. The inclusion of impoverished communities makes this data uniquely valuable for pharmaceutical companies, population genetics studies, and personalized medicine.
Exclusivity & Licensing Model
Potential Premium for High-Demand Genomic Profiles
If such a marketplace existed and gained traction, the average market price for a single dataset with standard annotations and time-limited access (3–5 years) would likely be $5,000–$20,000, with rare or highly valuable datasets commanding significantly more. Bulk dataset purchases by pharmaceutical companies or research groups would likely be discounted at scale.
Would you like a breakdown of potential pricing models or monetization strategies for different types of customers (pharma, academia, biotech, etc.)?
EDNA:
I believe there may be a slight misunderstanding, if a Pharmaceutical company can be expected to pay $15,000 – $30,000 or more for access to an average sample, would not a sample be offered and sold to multiple pharmaceutical companies and other types of efforts? Raising the concept of a three to five year time frame was not to say the transaction would limit fair use of the data to a time period, more to inquire as to the expected total market lifetime value of a sample/dataset offered on a free market platform over time. in other words, what would be a fair total market value expectation of the average genome placed in such a market over a number of years assuming an average number of sales and repeat sales.
GPT:
Thank you for the clarification!
That makes sense—you’re asking about the total lifetime value (LTV) of a single high-quality genomic dataset in an open marketplace where multiple buyers can purchase access over several years.
Number of Buyers Per Dataset
Price Per Transaction
Sales Over a 3–5 Year Period
The Average dataset on such a marketplace would likely generate a total market value between $100,000 and $500,000 over 3–5 years, assuming typical demand.
Rare or high-value datasets could exceed $1M+ in total lifetime revenue if demand is high and the platform successfully reaches major buyers.
EDNA:
You mentioned “private agreements and historical pricing trends for aggregated and de-identified genomic data access
(often via data brokers and research partnerships” can you provide citations for this basis?
GPT:
Certainly, I can provide information on the basis for the previously mentioned estimates regarding the market value of
genomic data. It’s important to note that specific pricing details are often confidential and not publicly disclosed.
However, several industry developments and partnerships illustrate the significant value attributed to high-quality
genomic datasets:
1. Truveta’s Genomic Database Initiative: In January 2025, Truveta announced a $320 million funding deal to
construct the world’s largest genetic database. Investors include Regeneron Pharmaceuticals, contributing
$120 million, and Illumina, investing $20 million, with additional funds from 17 U.S. health systems. The
database aims to gather 10 million genomic sequences to advance drug discovery and improve patient care.
This substantial investment underscores the high valuation placed on comprehensive genomic data by
pharmaceutical and biotech companies.
reuters.com
2. 23andMe and GlaxoSmithKline Partnership: In 2018, consumer genetics company 23andMe entered a
four-year collaboration with GlaxoSmithKline (GSK), which included a $300 million equity investment from
GSK. This partnership aimed to leverage 23andMe’s extensive genetic data for drug discovery, highlighting the
pharmaceutical industry’s willingness to invest heavily in access to large-scale genomic datasets.
en.wikipedia.org
3. Emergence of Genomic Data Brokers: A report from the Smeal College of Business discusses the rise of
startup companies acting as data brokers. These entities offer to pay individuals for access to their genomic
data, which is then sold to research studies. This model indicates a developing market where individual
genomic datasets are monetized, though specific pricing details are not publicly available.
smeal.psu.edu
While exact figures for individual genomic data transactions are scarce due to confidentiality, these examples
demonstrate the significant financial investments and partnerships centered around genomic data. The previously
provided estimates are based on the scale of these investments and the recognized value of high-quality,
comprehensive genomic datasets in the research and pharmaceutical industries.
Other Citations:
www.reuters.com
wikipedia
psu.edu
academic.oup.co
www.nature.com
www.brennancenter.org
www.dni.gov
pmc.ncbi.nlm.nih.gov-PMC7974070
pmc.ncbi.nlm.nih.gov-PMC3045967
techpolicy.sanford.duke.edu
www.smeal.psu.edu
scholarlycommons.law.case.edu