Clinical AI Is Failing—Not on Accuracy, But on Who Pays

June 24, 2026 · 3 min

The Central Paradox: Accuracy Is Not the Problem

Clinical artificial intelligence has arrived in the examination room, the documentation workflow, and the diagnostic suite with considerable fanfare—and, increasingly, with considerable technical credibility. Yet despite a growing body of evidence supporting AI performance across radiology, psychiatry, primary care documentation, and clinical decision support, the transition from pilot program to routine care remains elusive for the vast majority of deployed tools.

A June 2026 Viewpoint published in JAMA Health Forum by Shunsuke Takagi, PhD, and Genichi Sugihara, PhD, of the Institute of Science Tokyo, cuts directly to the heart of this paradox. Their central contention is both straightforward and structurally significant: "Clinical AI increasingly fails not on accuracy but on economics: no one can name and commit to the payer required to keep it safe and reliable in routine care." This financing gap, they argue, is not a peripheral administrative concern—it is the primary mechanism by which technically sound AI tools fail to achieve durable adoption.

The implications for private practice physicians, hospital administrators, and clinical leaders are immediate. Every AI-enabled documentation system, diagnostic aid, or workflow optimization tool deployed in a care setting carries recurring costs (integration support, cybersecurity governance, performance monitoring, model updating, and clinician training) that extend well beyond initial acquisition. When responsibility for these costs is undefined at the outset, the technology's long-term viability is compromised regardless of its clinical merit.

The Structural Misalignment Driving Adoption Failure

Takagi and Sugihara identify a structural misalignment that operates at multiple levels of the healthcare system. Organizations asked to fund AI acquisition and ongoing maintenance are not always those positioned to capture the downstream savings the technology generates. A care delivery organization may bear the full cost of implementation, workflow redesign, and regulatory compliance—while a payer or insurer captures the actuarial benefit of reduced utilization or improved population-level outcomes.
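The misalignment is easiest to see with numbers. The sketch below is an illustration only: the dollar figures are hypothetical, not drawn from the Viewpoint, and the `Stakeholder` class is a name invented for this example. It shows how a tool can be net-positive for the system as a whole while being net-negative for the organization asked to fund it.

```python
from dataclasses import dataclass


@dataclass
class Stakeholder:
    """One party in the deployment, with what it pays and what it captures."""
    name: str
    annual_cost: float     # recurring cost this party bears to keep the tool running
    annual_benefit: float  # savings or revenue this party actually captures

    @property
    def net(self) -> float:
        return self.annual_benefit - self.annual_cost


# Hypothetical figures, chosen only to illustrate the pattern.
provider = Stakeholder("care delivery organization",
                       annual_cost=500_000, annual_benefit=150_000)
payer = Stakeholder("payer", annual_cost=0, annual_benefit=600_000)

system_net = provider.net + payer.net

for s in (provider, payer):
    print(f"{s.name}: net {s.net:+,.0f}")
print(f"system-wide: net {system_net:+,.0f}")
```

Under these assumed figures the tool creates value overall, yet the party writing the checks loses money every year, which is the condition under which a rational organization stops paying for maintenance.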

This misalignment becomes most consequential at the point of evidence generation. As the authors observe, "Insurers demand proof that requires the very adoption they refuse to fund." The condition is circular, and it traps promising tools in pilot-stage limbo: the evidence thresholds required for reimbursement coverage cannot be met without the scale of adoption that reimbursement itself would enable.

The problem is compounded when multiple stakeholders hold competing definitions of value. A physician may define AI value as reduced documentation burden and reclaimed time with patients. A hospital administrator may prioritize throughput and revenue cycle efficiency. An insurer may measure value exclusively through cost-effectiveness ratios. When the same system is expected to satisfy all of these constituencies simultaneously, the authors note, "the choice of success metric is itself contested, and that contest is often settled implicitly by the paying entity." In practice, this means that whoever controls the budget defines the purpose of the AI—a dynamic with significant implications for patient-centered care.

Medicare Data Illustrates the Scope of the Problem

The financial stakes of this misalignment are not theoretical. Takagi and Sugihara draw on the June 2024 report of the Medicare Payment Advisory Commission (MedPAC) to illustrate how narrow and fragile existing payment pathways truly are. Among all software technologies separately payable as service items under the outpatient prospective payment system in 2022, only a single diagnostic AI tool recorded meaningful volume and spending: 8,665 uses generating $8.4 million in reimbursement. The overwhelming majority of other covered software items recorded little or no utilization during the same period.
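A quick back-of-the-envelope calculation puts the two MedPAC figures in proportion. The sketch below uses only the numbers quoted above; because the spending total is rounded to the nearest $0.1 million, the result is an approximation rather than an official payment rate.

```python
# Figures as quoted from the MedPAC June 2024 report (2022 OPPS data).
uses = 8_665
spending_usd = 8_400_000  # "$8.4 million" is a rounded figure

# Average payment per use of the one diagnostic AI tool with meaningful volume.
average_payment = spending_usd / uses

print(f"Average payment per use: ${average_payment:,.0f}")
```

That works out to roughly $970 per use, a per-service payment that covers a single analysis and says nothing about who funds integration, monitoring, or updates between uses.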

This finding is striking for what it reveals about the gap between the existence of a payment pathway and its practical utility. A coverage mechanism is not, by itself, sufficient to drive adoption. The portion that is actually reimbursed, typically limited to the computational analysis step, may be far thinner than the full cost of safe, ongoing operation once integration, governance, auditability, and continuous monitoring are properly accounted for.

International Policy Contrast: The German DiGA Model

The authors situate the US experience within a broader international policy landscape, drawing a pointed contrast with Germany's Digital Health Applications (DiGA) framework. Under DiGA, eligible digital health applications may be prescribed and reimbursed, including through time-limited or conditional coverage designed to enable evidence generation during active reimbursed use. This model explicitly trades some upfront evidentiary certainty for a funded mechanism that supports learning while technology is deployed in real clinical environments.

The contrast with US norms is instructive: "Different regimes thus choose different answers to the same question: who finances the period of uncertainty between a plausible prototype and mature evidence?" The US currently defaults to evidence-first norms in which reimbursement follows evidence generation—a sequencing that may be appropriate for traditional therapeutics but creates structural barriers for software-based tools that are designed to improve iteratively through real-world use.

A Framework for Adoption: Four Declarations

Rather than offering aspirational recommendations, Takagi and Sugihara propose a concrete operational framework—four explicit declarations that should be established by developers and adopting organizations before any clinical AI system enters routine deployment:

1. Payer Clarity. Who will finance not only acquisition but the full recurring costs of safe operation—computation, integration support, cybersecurity, governance, monitoring, and model updates? If costs are shared, what contractual mechanisms govern each party's obligations?

2. Beneficiary Primacy. Who is the primary beneficiary of the system—patients, clinicians, health systems, or payers? When interests conflict, which takes precedence, and what structural safeguards prevent optimization of surrogate metrics at the expense of therapeutic quality?

3. Metric Alignment. What outcomes will define value for the payer and for the primary beneficiary? Plausible metrics include clinician time reclaimed, patient access and care continuity, validated outcome measures, safety signals, and equity impacts—each accompanied by prespecified thresholds for scaling, revision, or discontinuation.

4. Life Cycle Accountability. Who holds the authority and institutional responsibility to monitor performance degradation over time, communicate limitations to clinical users, and oversee model updates—and who funds these obligations when real-world operations produce trade-offs not anticipated at deployment?

The authors are careful to position these declarations as complements to, not replacements for, clinical study evidence.
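For teams that want to operationalize the four declarations at contracting time, they can be captured as a simple pre-deployment checklist. The following is a minimal sketch, not something the authors specify: the class name, field names, and example values are all hypothetical, and a real procurement record would carry far more detail than a single string per declaration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DeploymentDeclarations:
    """Hypothetical record of the four declarations; field names are illustrative."""
    payer: Optional[str] = None                # 1. who funds recurring costs of safe operation
    primary_beneficiary: Optional[str] = None  # 2. whose interests take precedence
    success_metrics: Optional[str] = None      # 3. outcomes and prespecified thresholds
    life_cycle_owner: Optional[str] = None     # 4. who monitors, communicates, and updates


def missing_declarations(d: DeploymentDeclarations) -> List[str]:
    """Return the names of declarations that are absent or blank."""
    return [f.name for f in fields(d) if not (getattr(d, f.name) or "").strip()]


# Example: a draft contract that names a payer and a beneficiary but nothing else.
draft = DeploymentDeclarations(
    payer="health system IT budget, renewed annually",
    primary_beneficiary="clinicians",
)

gaps = missing_declarations(draft)
print(gaps)  # ['success_metrics', 'life_cycle_owner']
```

Treating an empty field as a blocking condition, rather than a detail to settle later, mirrors the framework's intent that these questions be answered before routine deployment.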

Implications for Clinical Leadership

For physicians and healthcare administrators navigating AI procurement decisions, the Takagi and Sugihara framework offers a disciplined evaluative lens. The absence of explicit answers to these four questions at the point of contracting should be treated not as an administrative gap but as a clinical risk—one that can compromise patient safety, institutional accountability, and the long-term financial sustainability of AI deployment.

The authors' concluding formulation deserves to stand as a guiding principle for any clinical leader evaluating AI adoption: "The key question is not whether clinical AI works, but for whom it creates measurable value." Until that question is answered with specificity and underwritten with contractual commitment, the gap between AI promise and durable adoption will persist, regardless of how capable the underlying technology becomes.
