The Shift
Anthropic Claims Its New A.I. Model, Mythos, Is a Cybersecurity ‘Reckoning’
The company said on Tuesday that it was holding back on releasing the new technology but was working with 40 companies to explore how it could prevent cyberattacks.
By Kevin Roose
Reporting from San Francisco
April 7, 2026
Anthropic, the artificial intelligence company that recently fought the Pentagon over the use of its technology, has built a new A.I. model that it claims is too powerful to be released to the public.
Instead, Anthropic said on Tuesday, it will make the new model — known as Claude Mythos Preview — available to a consortium of more than 40 technology companies, including Apple, Amazon and Microsoft, which will use the model to find and patch security vulnerabilities in critical software programs.
Anthropic said it had no plans to release its new technology more widely, but was announcing the new model’s capabilities in one area in particular — identifying security vulnerabilities in software — in an effort to sound the alarm over what the company believes will be a new, scarier era of A.I. threats.
“The goal is both to raise awareness and to give good actors a head start on the process of securing open-source and private infrastructure and code,” Jared Kaplan, Anthropic’s chief science officer, said in an interview.
The coalition, known as Project Glasswing, will include some of Anthropic’s competitors in A.I., such as Google, as well as hardware providers like Cisco and Broadcom, and organizations that maintain critical open-source software, such as the Linux Foundation. Anthropic is committing up to $100 million in Claude usage credits to the effort.
Logan Graham, the head of an Anthropic team that tests new models for dangerous capabilities, called the new model “the starting point for what we think will be an industry change point, or reckoning, with what needs to happen now.”
Anthropic occupies an unusual position in today’s A.I. landscape. It is racing to build increasingly powerful A.I. systems, and making billions of dollars selling access to those systems, while also drawing attention to the risks its technology poses. The Pentagon deemed the company a supply-chain risk this year for demanding certain limits on how its technology could be used. A federal judge later stopped the designation from going into effect.
Anthropic has not released much new information about the model, which was code-named “Capybara” during development. But after some details were inadvertently leaked last month, the company acknowledged that it considered it a “step change” in A.I. capabilities, with improved performance in areas like coding and cybersecurity research.
The company’s decision to hold back Claude Mythos Preview and give access only to partners, out of concern for how it might be misused, has some precedent. In 2019, OpenAI announced it had built a new model, GPT-2, but was not releasing the full version right away. The company claimed that its text-generation capabilities could be used to automate the mass production of propaganda or misinformation. (It later released the model, after conducting additional safety testing on it.) Many of the leaders of the GPT-2 project later left OpenAI to start Anthropic.
This time, Anthropic is making a different, more urgent claim. The company’s executives say Claude Mythos Preview is already capable of carrying out autonomous security research, including scanning for and exploiting so-called zero-day vulnerabilities in critical software programs — flaws that are unknown even to the software’s developer. Such attacks, the company says, can often be triggered by amateurs using simple prompts. The company claims that the new model has already identified “thousands” of bugs and vulnerabilities in popular software programs, including every major operating system and browser.
One of the vulnerabilities Claude found, the company said, was a 27-year-old bug in OpenBSD, an open-source operating system that was designed to be difficult to hack. Many internet routers and secure firewalls incorporate OpenBSD’s technology. Another was a longstanding issue in a piece of popular video software that automated testing tools had scanned five million times without finding any problems.
“This model is good at finding vulnerabilities that would be well understood and findable by security researchers,” Mr. Graham said. “At the same time, it has found vulnerabilities, and in some cases crafted exploits, sophisticated enough that they were both missed by literally decades of security researchers, as well as all the automated tools designed to find them.”
Anthropic announced on Monday that its projected annual revenue had more than tripled in 2026, to more than $30 billion from $9 billion. The growth has come largely because of the popularity of Anthropic’s Claude as a tool for programming.
Anthropic has focused on making Claude good at completing lengthy coding tasks, in hopes of making it more useful to professional programmers and amateur “vibecoders.” But an A.I. system designed to be good at coding is also good at spotting the flaws in code — running automated scans for bugs and vulnerabilities that can allow hackers to take control of users’ machines, expose sensitive user information or wreak other havoc.
The cybersecurity industry has been bracing for years for what more capable A.I. models could do to critical tech infrastructure. Until recently, only expert human researchers with access to specialized tools were capable of finding the most severe security vulnerabilities. Now, the fear is that a powerful A.I. model could discover them on its own.
“Imagine a horde of agents methodically cataloging every weakness in your technology infrastructure, constantly,” Nikesh Arora, the chief executive of Palo Alto Networks, wrote in a blog post last week.
Mr. Graham said one of the unanswered questions about Claude Mythos Preview, and about future models capable of doing similar things, was whether most or all of the world’s critical software would need to be patched or rewritten as a result.
“There are a lot of really critical systems around the world, whether it’s physical infrastructure or things that protect your personal data, that are running on old versions of code,” Mr. Graham said. “If these previously were mostly secure because it took a lot of human effort to attack them, does that paradigm of security even work anymore?”
It is wise to take claims about unreleased model capabilities from A.I. companies with a grain of salt. In this case, though, cybersecurity researchers who have been given access to Claude Mythos Preview have characterized the model as a significant cybersecurity risk.
Elia Zaitsev, the chief technology officer of CrowdStrike, a cybersecurity firm with access to the new model through Project Glasswing, said in a statement accompanying Anthropic’s announcement that the model “demonstrates what is now possible for defenders at scale, and adversaries will inevitably look to exploit the same capabilities.”
“What once took months now happens in minutes with A.I.,” Mr. Zaitsev said.
Project Glasswing takes its name from the glasswing butterfly, which uses its transparent wings to hide in plain sight, Mr. Kaplan said. Similarly, he said, many of today’s most critical software programs contain bugs and vulnerabilities that have existed in the open for years but were buried in such complex technical systems that no human ever found them.
According to Mr. Kaplan, the cybersecurity capabilities of Claude Mythos Preview are not the result of special training. Rather, they are just one of many areas in which the model outperforms its predecessors. He predicted that similar cybersecurity capabilities would soon appear in other models. As that happens, he said, the arms race between hackers and the companies racing to defend their systems will only escalate.
“As the slogan goes, this is the least capable model we’ll have access to in the future,” he said.
Kevin Roose is a Times technology columnist and a host of the podcast “Hard Fork.”