Diagnostics Isn’t Broken. Our Model for Evaluating It Is

by David Berry, MD PhD

The diagnostics sector has had a rough few years in investor sentiment. Covid lateral flow tests handed out false confidence. The GRAIL Galleri trial disappointed. The narrative seems to have calcified: diagnostics is hard, outcomes are elusive, the TAM never quite materializes. That narrative is wrong. And it’s creating a meaningful opportunity for those willing to read the data carefully.

The sensitivity problem nobody is talking about

Let’s start with the Galleri numbers. In the NHS trial, the test’s sensitivity was 17% at Stage I, 40% at Stage II, 77% at Stage III, and 90% at Stage IV. The test catches fewer than one in five cancers at the stage where catching them most materially changes outcomes.

So here is the question that should have been asked before the trial read out: with 17% sensitivity at Stage I, what mortality benefit are you expecting to see across a diverse patient population in a three-to-five-year follow-up window?

Not much. Not because liquid biopsy doesn’t work. Because the arithmetic doesn’t work at that sensitivity level. To move population mortality numbers, you need to catch enough cancers, early enough, consistently enough.

A test that misses 83% of Stage I cancers has a hard ceiling on what any trial can prove, regardless of how it’s designed. The NHS-Galleri trial did show a greater than 20% reduction in Stage IV diagnoses by years two and three, and a four-fold improvement in overall cancer detection rate versus standard of care alone. That’s clinically meaningful — but the market focused on the missed primary endpoint and concluded the category was broken, rather than that this generation of the technology has a known, improvable sensitivity constraint. Those are very different conclusions.
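The arithmetic behind that ceiling can be sketched in a few lines. The stage-specific sensitivities below are the ones quoted above; the stage mix is a purely hypothetical assumption (an even split across stages, not trial data), and the function names are illustrative.

```python
# Stage-specific sensitivities as quoted in the text above.
SENSITIVITY = {"I": 0.17, "II": 0.40, "III": 0.77, "IV": 0.90}

# HYPOTHETICAL stage mix among cancers present in the screened population.
# An even split is an assumption for illustration only, not trial data.
STAGE_MIX = {"I": 0.25, "II": 0.25, "III": 0.25, "IV": 0.25}


def detected_by_stage(sensitivity, stage_mix, n_cancers=1000):
    """Expected number of cancers flagged at each stage, out of n_cancers."""
    return {s: n_cancers * stage_mix[s] * sensitivity[s] for s in sensitivity}


def early_stage_detection_rate(sensitivity, stage_mix, early=("I", "II")):
    """Share of early-stage cancers the test flags, weighted by stage mix."""
    caught = sum(stage_mix[s] * sensitivity[s] for s in early)
    present = sum(stage_mix[s] for s in early)
    return caught / present


if __name__ == "__main__":
    for stage, n in detected_by_stage(SENSITIVITY, STAGE_MIX).items():
        print(f"Stage {stage}: about {n:.0f} of 250 detected")
    rate = early_stage_detection_rate(SENSITIVITY, STAGE_MIX)
    print(f"Early-stage (I/II) cancers detected: {rate:.1%}")
```

Under this assumed mix, most detections land in Stage III and IV, which is where a mortality benefit is hardest to show. Changing STAGE_MIX shifts the totals but not the underlying constraint.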

The broader diagnostic crisis hiding in plain sight

Covid LFTs are perhaps the most visible instance of a much older problem: tools deployed without adequate understanding of what they can and cannot tell you. A lateral flow test with moderate sensitivity is not a bad test. It is a specific kind of instrument, useful for specific purposes, and catastrophically misused when a negative result becomes permission to ignore symptoms. Recall how many of us, when the line didn’t show up, deemed ourselves Covid-free. That’s not what the data says. The same category of error underlies stroke misdiagnosis and sepsis overdiagnosis: confident conclusions drawn from information that didn’t warrant them.
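What a negative result does and does not tell you can be made concrete with Bayes’ rule. The sketch below is illustrative only: the sensitivity, specificity, and pre-test probabilities are hypothetical placeholders, not measured values for any particular test.

```python
def prob_disease_given_negative(pretest, sensitivity, specificity):
    """Post-test probability of disease after a negative result (Bayes' rule)."""
    false_negative = pretest * (1.0 - sensitivity)   # diseased, test negative
    true_negative = (1.0 - pretest) * specificity    # healthy, test negative
    return false_negative / (false_negative + true_negative)


if __name__ == "__main__":
    # HYPOTHETICAL test characteristics, for illustration only.
    sens, spec = 0.70, 0.99

    # HYPOTHETICAL pre-test probabilities for two kinds of user.
    scenarios = {
        "no symptoms, no known exposure": 0.05,
        "symptomatic after a known exposure": 0.50,
    }
    for label, pretest in scenarios.items():
        post = prob_disease_given_negative(pretest, sens, spec)
        print(f"{label}: {post:.1%} chance of infection despite a negative")
```

With these assumed inputs, the same negative line leaves a small residual risk for the first person and a substantial one for the second. The instrument is identical; the interpretation is what differs.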

An estimated 795,000 Americans suffer serious harm from misdiagnosis every year. Stroke, sepsis, pulmonary embolism, and lung cancer together account for nearly 40% of those cases. Stroke alone causes 17.5% of all serious diagnostic harm, frequently because patients present with atypical symptoms, get discharged without imaging, and aren’t caught until it’s too late.

Sepsis is even more paradoxical: one retrospective study found that 43% of patients admitted to an ICU with a sepsis diagnosis likely didn’t have an infection, while simultaneously, true sepsis cases are routinely missed until the patient is critically ill. The test is wrong, or the interpretation is wrong, or both — and the consequences are severe in either direction.

Where the real opportunity is

We need to collectively start asking: “Where does better information, faster, change the clinical pathway in ways that are economically legible today?”

The answer is: almost everywhere you look, as long as we use diagnostics for what they are intended to do.

Chemeleon is building for the most acute version of this problem. Their lead product is a colorimetric diagnostic for acute myocardial infarction (AMI): a heart attack rule-out test that delivers a visible yes/no readout in under two minutes, with no instruments, no training, and no lab. If you are sitting at home with chest pain at 2 a.m., that test changes everything about what happens next. The economics are straightforward: avoiding a single unnecessary emergency admission more than pays for the test. At scale, across a health system genuinely motivated to reduce ED congestion, that value is enormous.
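The break-even logic can be written down directly. This is a rough sketch, and every figure in it is a hypothetical placeholder: the per-test cost, the admission cost, and the share of tests that avert an admission are assumptions, not company or health-system data.

```python
def break_even_tests_per_avoided_admission(test_cost, admission_cost):
    """How many tests one avoided admission pays for before savings run out."""
    if test_cost <= 0:
        raise ValueError("test_cost must be positive")
    return admission_cost / test_cost


def net_saving(n_tests, avoid_rate, test_cost, admission_cost):
    """Savings from avoided admissions minus total spend on tests."""
    avoided_admissions = n_tests * avoid_rate
    return avoided_admissions * admission_cost - n_tests * test_cost


if __name__ == "__main__":
    # HYPOTHETICAL inputs, for illustration only.
    test_cost = 25.0         # cost per rule-out test
    admission_cost = 2000.0  # cost of one unnecessary emergency admission
    avoid_rate = 0.05        # share of tests that avert an admission

    ratio = break_even_tests_per_avoided_admission(test_cost, admission_cost)
    print(f"One avoided admission pays for {ratio:.0f} tests")
    saving = net_saving(1000, avoid_rate, test_cost, admission_cost)
    print(f"Net saving per 1,000 tests: {saving:,.0f}")
```

The useful output is the ratio: as long as fewer tests than that are used per admission averted, the test pays for itself under the stated assumptions.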

Abbott’s i‑STAT TBI cartridge addresses the same structural problem from a different angle — a whole-blood test for traumatic brain injury usable at the bedside or point-of-care, enabling faster triage decisions for one of the most time-sensitive and consistently misdiagnosed conditions in emergency medicine. The pattern is the same: put accurate, actionable information at the point of decision, before the costly and often incorrect default kicks in.

Biolinq represents a different dimension of the same thesis. Their Shine platform — the first needle-free wearable to win FDA De Novo classification — continuously tracks glucose, lactate, and other metabolic markers without breaking the skin. The initial clearance covers glucose monitoring for Type 2 diabetes, but the platform architecture points toward something more significant: continuous, multi-analyte monitoring that surfaces metabolic deterioration before it becomes an emergency. Biolinq is a long-term infrastructure bet on the idea that the most valuable diagnostic is the one you’re already wearing.

On the liquid biopsy side, Freenome is doing something the GRAIL narrative obscured: proving the concept works when sensitivity is actually high enough. Their PREEMPT CRC blood test — published in JAMA in 2025 and now licensed to Exact Sciences ahead of a 2026 commercial launch — detects 85% of colorectal cancers at 90% specificity. That’s a meaningfully different number from where multi-cancer tests currently sit, and it’s the result of a deliberate, focused bet: go deep on one cancer, get the sensitivity right, and build the clinical evidence properly. Freenome’s trajectory is a proof point for the whole category, not just a single company story.

Looking further ahead, the convergence of rapid molecular testing and point-of-care deployment is erasing another assumed constraint — that PCR-grade accuracy requires a lab. Companies like Sensible Diagnostics are building instruments that perform PCR in under ten minutes at the point of care. When that becomes commodity infrastructure, the clinical pathways for infectious disease, sepsis triage, and antibiotic stewardship all change materially.

The opportunity

The market is pricing diagnostics as a category that doesn’t work. What the data actually shows is that poorly calibrated tools, deployed without clinical judgment, in studies underpowered relative to the test’s known sensitivity profile, produce predictable underperformance. That is not the same thing as the category being broken.

Freenome shows what happens when you do this right. Chemeleon and Biolinq show what happens when you ask a different question entirely. As sensitivity curves improve and health systems get serious about decongesting emergency care, the narrative will shift. The companies doing the work aren’t priced for where this goes. The window won’t stay open.

Diagnostics isn’t broken. Our model for evaluating it is.