BOLD STATEMENTS· LAB
tech, health

GPT-4 successor models will be shown to have fundamental reliability limitations preventing deployment in high-stakes medical settings without heavy human oversight.

Gary Marcus·June 2026
⚖️ Partially True (resolved June 10, 2026)
How we verified this claim
LLMs have shown significant hallucination and reliability problems that limit autonomous deployment in high-stakes domains. Medical and legal deployments exist but require substantial human oversight, which supports Marcus's core concern while falling short of a blanket prohibition.