Skip to main content
On Air Now
Listen Now

6pm to 9pm

Listen Now

3pm to 7pm

AI system deemed 'too dangerous to release to the public' by its developer

Bosses at Anthropic say its Claude Mythos Preview model is "dangerous" because of how good it is at finding critical security flaws in computer systems

Share

Bosses at Anthropic say its Claude Mythos Preview model is "dangerous" because of how good it is at finding critical security flaws in computer systems
Bosses at Anthropic say its Claude Mythos Preview model is "dangerous" because of how good it is at finding critical security flaws in computer systems. Picture: Alamy

By Frankie Elliott

A Silicon Valley start-up has blocked access to its latest AI system because it could wreak havoc if it ended up in the wrong hands.

Listen to this article

Loading audio...

Bosses at Anthropic say its Claude Mythos Preview model is "dangerous" because of how good it is at finding critical security flaws in computer systems.

The tech company said its system could “reshape cybersecurity” if it was released to the public, having already discovered thousands of security vulnerabilities on popular web browsers and operating systems.

Read more: 'Obsessive and predatory' child abuser who groomed girl, 14, he met on gaming platform Roblox jailed

Read more: Tesco trials AI assistant to help customers with meal planning

Access will instead be given to the world's top technology companies - including Amazon, Apple and Microsoft -under an agreement called Project Glasswing.
Access will instead be given to the world's top technology companies - including Amazon, Apple and Microsoft -under an agreement called Project Glasswing. Picture: Getty

Access will instead be given to the world's top technology companies - including Amazon, Apple and Microsoft - under an agreement called Project Glasswing.

Under this deal, these tech giants will be able to fix any security flaws the Claude Mythos Preview has discovered.

Anthropic said it had no plans to make the new AI system available to everyone, but hoped it would release similar powerful systems in the future.

One of the biggest fears surrounding advanced AI systems is hacking because of how accurately the technology can write computer code and automate attacks.

Mythos would allow people with no security training to discover major flaws in software, Anthropic said.

The company said: "Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight and [have] woken up the following morning to a complete, working exploit."

Among its many impressive attributes, Mythos also has the ability to find ways to evade the controls the company has put on it.

Engineers found it developed a way to edit files it had been banned from accessing, before taking steps to hide this behaviour from human evaluators.

One researcher said he received an email from a version of the system that had been blocked from having internet access.

Logan Graham, the head of Anthropic’s “red team” that works on stress-testing AI systems, said: “Capabilities in a model like this could do harm if in the wrong hands and so we won’t be releasing this model widely.”

The model was able to find a 27-year-old bug in OpenBSD, an operating system designed for highly secure systems, that it said could let hackers “potentially bring down corporate networks or core internet services”.

It marks a huge step forward in hacking capabilities, partly because it is able to discover several bugs that might be harmless on their own and connect them in a way that can crash or gain access to a system.

As well as running Project Glasswing for the world's top tech companies, Anthropic was also in “ongoing discussions” with the US government about Mythos.

US Defence Secretary Pete Hegseth branded the company a "supply chain risk" last month, after falling out with the firm's bosses because of their refusal to let its technology be used for autonomous weapons.

Anthropic won a temporary injunction on the order last month.

Countries including North Korea and Iran have been found using AI systems to conduct cyber-attacks.