When Anthropic unveiled its new Mythos model in April, it also delivered a stern warning to anyone developing software. The model was so powerful at sniffing out software vulnerabilities, the lab claimed, that it had discovered thousands of high-severity bugs that would need to be fixed before it could be made public.
Now, security researchers for Mozilla’s Firefox browser are providing a closer look at what that process has looked like in practice, and what Mythos’ powers mean for software security at large.
In a post published on Thursday, Mozilla said Mythos has unearthed a wealth of high-severity bugs, including some that had lain dormant in the code for more than a decade.
That’s a significant improvement from what AI security tools were capable of even six months ago. Until now, AI bug-finding tools have come with severe drawbacks, often inundating security teams with low-quality reports and false positives. But Mozilla’s researchers say the latest generation of tools have turned a corner, particularly now that agentic systems can assess their own work and filter out bad results.
“It is difficult to overstate how much this dynamic changed for us over a few short months,” the researchers wrote. “First, the models got a lot more capable. Second, we dramatically improved our techniques for harnessing these models.”

The results are striking: In April 2026, Firefox shipped 423 bug fixes, compared to just 31 exactly a year earlier. The researchers have also published details on 12 of the bugs, which range from a pair of unusual sandbox vulnerabilities, to a 15-year-old error in how the browser parses an HTML element.
“These things are actually just suddenly very good,” Brian Grinstead, a distinguished engineer at Mozilla, told TechCrunch. “We see that on our own internal scanning, we see that on external bug reports, and we see that in all sorts of signals across the industry.”
The fact that the system helped reveal vulnerabilities in Firefox’s “sandbox” system is particularly impressive, given how intricate an attack that exploits it needs to be. To find sandbox vulnerabilities, the model must write a compromised patch for the browser, then attack the most secure part of the software with the new code implemented. Finding and demonstrating the bug is a delicate, multi-step process, requiring both creativity and close attention.
To put this into context, Mozilla’s bug bounty program pays researchers who can find a bug in Firefox’s sandbox up to $20,000 – the highest reward available. Despite the top-dollar bounty, however, Grinstead says Mythos is finding more sandbox issues than human researchers ever did. “We do get them,” he told TechCrunch, “but not at the volume that we are able to find with this technique.”
Notably, the Firefox team still isn’t using AI to fix the bugs, despite well-documented progress in AI coding tools. The team does ask AI to code up patches for each bug, but the resulting code usually can’t be deployed directly, and instead serves as a model for a human engineer.
“For the bugs we’re talking about in this post, every single one is one engineer writing a patch and one engineer reviewing it,” Grinstead says. “We have not found it to be automatable.”
It’s still not clear how AI’s emerging capabilities will change the broader balance of power in cybersecurity. One month since Mythos was previewed, most of the bugs discovered likely haven’t been patched, which makes it hard to capture the full scope of their impact. Anthropic has been scrupulous about following responsible disclosure norms, but it’s likely bad actors are using similar techniques behind the scenes, even if the models they’re using aren’t quite as good.
Speaking at a recent event, Anthropic CEO Dario Amodei was optimistic that the new tools would ultimately favor defenders. “If we handle this right, we could be in a better position than we started, because we fixed all these bugs. There are only so many bugs to find,” Amodei said. “So I think there’s a better world on the other side of this.”
Having dealt with the gritty details, Grinstead has a more measured view: “It’s useful for both attackers and defenders, but having the tool available shifts the advantage a little bit to defense. Realistically, nobody knows the answer to this yet.”
Author: Russell Brandom
Source: TechCrunch
Reviewed By: Editorial Team