The AI bubble is not a capability bubble. It is an expectation bubble. National security leaders are treating AI as a replacement for analysts, engineers, and tradecraft when it is really a volatile acceleration layer that still requires human judgment, security controls, and cost discipline.
The current state of AI is defined by inflated assumptions. Vendors overstate capability, users over-delegate judgment, and policymakers react to controlled demos as if they represent real-world operational power. The Mythos/Fable incident shows how quickly that confusion can become policy: the U.S. government treated access to a commercial model as a national-security transfer, forcing Anthropic to restrict access to its premier systems.
The problem is not that Mythos is too powerful. The problem is that institutions are starting to make decisions as if the marketing copy is reality. These systems are powerful, but they are not independent thinkers.
AI can surface information at extraordinary speed. It can summarize documents, generate code, translate foreign-language material, identify patterns, and automate repetitive tasks — but it cannot create new ground truth. It cannot determine whether a piece of intelligence is reliable, whether a cyber operation is lawful, or whether an analytic conclusion is strategically sound.
This is where the national-security conversation is going wrong. The debate keeps treating model capability as operational capability. They are not the same. A model that can describe a vulnerability is not the same as an operator who can exploit it. A model that can summarize a document is not the same as an analyst who can assess it. The more powerful these systems become, the more dangerous it becomes to blur that distinction.
AI does not exercise judgment, understand mission context, or carry accountability. It is an acceleration layer, and in the hands of trained users, it compresses time and expands reach. In the hands of institutions that mistake output for truth, it will accelerate error, overconfidence, and bad policy.
The bubble is bursting, but not because AI failed
The AI bubble is bursting because organizations bought the wrong story. They thought they were buying replacement labor. What they actually bought was an expensive, overconfident junior assistant: impressive in the interview and with first drafts, but unreliable when placed inside workflows that require judgment, context, and accountability.
Despite the rhetoric of AI replacing jobs, companies are starting to confront a harder reality: these systems can accelerate work, but they do not eliminate the need for people who understand the work. The danger is not simply that AI will produce bad output; the danger is that institutions will mistake that output for finished analysis.
AI is not cheap labor
AI is often sold as cheap replacement labor. The reality is more nuanced: in practice it is an expensive acceleration layer that still requires human judgment, review, and correction. At Shadow Nexus, AI is integrated as one part of our solution, but it is not the capability itself. Using AI this way helps us unlock information hidden in data that would be difficult to reach manually. But this has only worked because our tools require a human to be involved at every step, providing course correction and validation.
That's what makes the "fully autonomous" pitch so misleading. The "autonomy" is really a system that, left unchecked, is prone to making mistakes and inflating costs.
Microsoft researchers recently tested how major frontier models perform in delegated workflows. They found that even frontier models corrupted an average of 25 percent of document content after 20 back-and-forth interactions, while the average across all tested models was about 50 percent degradation. Degradation worsened with larger documents, longer interactions, and distractor files.
The test was simple: give the model a document, ask it to make an edit, then ask it to restore the document to its original state. A reliable delegate will return the document close to its original form. Instead, the errors compounded, like making a photocopy of a photocopy until the original slowly disappears.
The problem is further compounded by constantly changing pricing. Anthropic's Opus 4.7 tokenizer increased token usage by up to 35 percent (meaning the same text put into Opus 4.6 would require up to about 26 percent fewer tokens). Then, with the introduction of Fable 5 only a few months later, Anthropic doubled the published token price.
This rapid increase represents a serious procurement problem for corporations and government customers alike. Agencies can budget for seats, licenses, and fixed contracts. It is much harder to budget for agentic workflows that expand unpredictably through context growth, tool calls, retries, failed tasks, and human rework. That is not just sticker shock. It is meter opacity.
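The pricing arithmetic above can be made concrete with a rough sketch. It assumes, purely for illustration, that the full 35 percent token increase and the doubled per-token price both apply to the same workload; the function names are hypothetical, and real bills depend on details the published figures do not cover.

```python
def fewer_tokens_fraction(token_increase: float) -> float:
    """Share of tokens the older tokenizer saves relative to the newer one."""
    return 1 - 1 / (1 + token_increase)


def effective_cost_multiplier(token_increase: float, price_multiplier: float) -> float:
    """Combined cost change when token count and per-token price both rise."""
    return (1 + token_increase) * price_multiplier


# Figures taken from the text; treating them as stacking is an assumption.
TOKEN_INCREASE = 0.35    # "up to 35 percent" more tokens for the same text
PRICE_MULTIPLIER = 2.0   # published token price doubled

saved = fewer_tokens_fraction(TOKEN_INCREASE)
multiplier = effective_cost_multiplier(TOKEN_INCREASE, PRICE_MULTIPLIER)

print(f"Older tokenizer needs {saved:.1%} fewer tokens")
print(f"Worst-case cost for the same text: {multiplier:.2f}x")
```

Two points follow from the numbers. A 35 percent increase in one direction is only about a 26 percent reduction in the other. And under these assumptions the same text could cost up to 2.7 times as much, before counting retries, tool calls, or human rework.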
The Tradecraft Problem
Cost is only half the problem. Even at a price you can predict, AI introduces a subtler risk: it produces polished mistakes at scale — and in analytic environments, a polished mistake is far more dangerous than an obvious one.
AI hallucination is not just a chatbot problem. It becomes an institutional risk when generated text enters official documents, legal analysis, or intelligence reporting without source-level verification. Recently, Deloitte Australia agreed to partially refund the Australian government after a report it produced was found to contain AI-generated errors, including nonexistent references and fabricated quotes from a federal court judge.
For intelligence work, the analogy is obvious. A hallucinated citation is not a formatting error; it's a provenance failure, and a hallucinated provenance chain can contaminate judgment, mislead decision-makers, and jeopardize missions. Don't misunderstand me: this does not mean AI should be kept out of intelligence work. It means the tradecraft needs to evolve.
AI can be a force multiplier when used to accelerate research, translation, link analysis, and other repetitive analytic tasks, but it should not be treated as a replacement analyst. It has no concept of a larger context, which means it can't understand legal authorities, operational risk, or true mission context. Those responsibilities still belong to people, and always should. The right model is not "AI instead of analysts"; it is analysts using AI inside workflows. This requires changing the tradecraft to include a completely new way of thinking.
Which lands a government customer in an impossible spot: how do you adopt and rely on a tool that you can neither fully trust nor accurately budget for?
Government Adoption and the Rising China Problem
For both government and commercial users, the obvious response to rising AI costs is to move towards publicly available "open-weight" models. Systems like GLM-5.2 and Qwen-3.7 now rival the most advanced commercial models, improving cost predictability while keeping sensitive workflows inside government-controlled infrastructure. The catch: they're all designed and shipped from China.
That's what makes the recent Anthropic fight so revealing. Earlier this year, the Pentagon reportedly designated U.S.-based Anthropic a supply-chain risk after a dispute over its safeguards and military use of its models — even as China's GLM-5.2 ranks among the top systems on the market, just behind Anthropic's own Fable 5, with Alibaba's Qwen not far behind.
This is the irony of the policy debate: government is trying to regulate a technology it doesn't fully understand, and much of that fear is driven by marketing. Fable 5 is powerful, but so are Opus and GPT-5.5. In the hands of a seasoned user, GPT-5.5 does just as much. As with every new technology, the danger isn't the tool. It's the user.
Meanwhile, the drift is already underway. Microsoft recently signaled it may leverage China's DeepSeek model, even as the U.S. weighs blacklisting DeepSeek as a supply-chain risk. Designating U.S. companies as supply-chain risks feels like an overstep when the trends show organizations moving toward models developed and controlled by adversarial nations.
AI is not going away, and no branding fight or access restriction will change that. The United States should treat AI as the new standard tool for analytic and operational work. But that is all it is: a tool. At its best, it's a starting point — a way to draft, accelerate research, and move faster through large volumes of information. That is also where the handoff to a human has to happen.
The Microsoft research and the Deloitte case are the warning. Left alone, generative AI does exactly what it is built to do: generate plausible output, regardless of accuracy. That risk only compounds as agencies look past closed U.S. models toward open-weight systems built by adversaries.
What happens when the model itself has been trained to nudge its answers — quietly, in a direction someone else chooses? Left uncaught, that kind of slow and deliberate data poisoning can corrupt the very work it's meant to support. That is the real supply-chain risk.
The real work should not be choosing which models we're allowed to use — it should be building the judgment to use them, and not mistaking model names for national-security strategy.
The Cipher Brief is committed to publishing a range of perspectives on national security issues submitted by deeply experienced national security professionals. Opinions expressed are those of the author and do not represent the views or opinions of The Cipher Brief.