Anthropic
said on Friday it will "abruptly disable" its most advanced artificial intelligence (AI) models, Claude Fable 5 and Mythos 5, for all users after the U.S. government ordered it to suspend access to the models for foreign nationals, whether inside or outside the U.S., citing national security concerns.
The AI company said it received an order at 5:21 p.m. ET, instructing it to suspend all access to the models by foreign nationals. It said that it believed there was a "misunderstanding" and that it is working to restore access to the models as soon as possible. Access to other models will not be affected by the export control directive.
"Our understanding is that the government believes it has become aware of a method of bypassing, or 'jailbreaking' Fable 5," the company said.
"We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass."
The unexpected move comes days after the
launch of Claude Fable 5 and its counterpart Mythos 5, which uses the same underlying model but with the safeguards lifted in some areas, like cybersecurity. The latter, described as having the "strongest cybersecurity capabilities of any model in the world," remains accessible to a vetted group of cyber defenders and critical infrastructure operators.
Anthropic emphasized that it has implemented "strong" guardrails to prevent the misuse of its models for
cybersecurity-related tasks. Specifically, this is undergirded by a set of safety classifiers that are used to detect potential misuse, including jailbreak attempts, and prohibit the main model from responding.
The cybersecurity classifier is designed to block harmful single-turn requests relating to planning a cyber attack, exploit development, or defense evasion, with the company noting that Mythos-class models are skilled at finding and exploiting software vulnerabilities, thereby giving attackers a strategic advantage.
Last week, Anthropic revealed its Mythos-class model can turn newly disclosed software vulnerabilities into working exploits in hours, or even minutes in some cases, instead of weeks, converting N-days into N-hours. The findings suggest that frontier models may be just as good at rapidly weaponizing flaws that have been publicly disclosed.
"A lone operator can now turn a month's worth of patches into working exploits in a single afternoon - for a few thousand dollars and with no specialized expertise," Anthropic's Red Team
said. "This means that the typical patching playbook that software developers use today - with monthly release cadences, multi-week staged rollouts, and a lag between pre-release and stable channels - no longer holds."
Fable 5's protections mean that queries on cybersecurity topics will instead receive a response from Claude Opus 4.8, the company's next capable model.
In its latest statement, the company argued that no universal jailbreak methods have been developed against the latest models to date, adding that third-party and internal red-teaming exercises have found its safeguards to be "substantially more effective than those of any previously deployed model."
Furthermore, Anthropic claimed that "the perfect jailbreak resistance" is not possible for any model provider, as every safeguard used by the industry is susceptible to non-universal jailbreaks that are "effective in very limited contexts or require additional effort to be adapted to each new situation."
"To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws," it said.
"Our understanding is that one potential jailbreak was shared with the government. We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI's GPT-5.5), and is used every day by the defenders who keep systems safe."
Anthropic also pointed out that while it's all for the government to block unsafe AI deployments, it said the discovery of a "narrow potential jailbreak" shouldn't be the reason for recalling a commercial model that's deployed widely. The statutory process should be "transparent, fair, clear, and grounded in technical facts," it added.
Earlier this year, the U.S. Department of Defense
labeled Anthropic a "supply chain risk" after the Claude-maker sought to draw red lines over the military use of its technology. The company has filed two lawsuits to block the designation.
<small>Source: The Hacker News</small>