Quick Brief
- Claude Opus 4.6 discovered 22 confirmed vulnerabilities in Firefox over two weeks, with 14 rated high-severity
- Those 14 high-severity flaws represent nearly one-fifth of all high-severity Firefox vulnerabilities remediated in 2025
- Mozilla shipped patches to hundreds of millions of users in Firefox 148.0
- Claude found its first Firefox vulnerability inside the JavaScript engine in just 20 minutes
Your Firefox browser just became measurably safer because an AI model surfaced bugs at a pace that is hard for human security teams to match. Anthropic partnered with Mozilla and used Claude Opus 4.6 to uncover 22 confirmed vulnerabilities over two weeks, a volume and speed that resets expectations for software security. These are not research projections. The patches are live in Firefox 148, protecting your browser right now.
Claude Found Firefox’s First Bug in 20 Minutes
The test started with a focused assignment: task Claude Opus 4.6 with finding novel vulnerabilities in the current version of Firefox, beginning with its JavaScript engine. After just 20 minutes of autonomous exploration, Claude reported identifying a use-after-free vulnerability, a class of memory flaw in which a program keeps using memory after it has been freed, and which can allow attackers to overwrite data with arbitrary malicious content.
One Anthropic researcher independently validated the bug in a separate virtual machine running the latest Firefox release, then forwarded it to two additional Anthropic researchers, who also confirmed it. The team filed a bug report in Bugzilla, Mozilla’s issue tracker, along with a description of the vulnerability and a proposed patch written by Claude and validated by the reporting team.
By the time that first report was submitted, Claude had already identified 50 more unique crashing inputs.
What 22 Confirmed Vulnerabilities Actually Looks Like
Over the course of the collaboration, Anthropic scanned nearly 6,000 C++ files in Firefox and submitted a total of 112 unique reports. Mozilla triaged those reports and issued 22 CVEs as a direct result of the work. Of those, 14 were classified as high-severity.
Beyond the 22 security-sensitive bugs, Anthropic’s analysis also uncovered 90 additional bugs in Firefox, most of which Mozilla has since fixed. Some lower-severity findings overlapped with issues typically found through traditional fuzzing, but the model also identified distinct classes of logic errors that fuzzing had not previously uncovered.
Mozilla confirmed that all 22 CVEs are now fixed in the latest version of Firefox.
Why Firefox Was Chosen as the Test Target
Anthropic selected Firefox deliberately. It is one of the most scrutinized and security-hardened open-source codebases in the world, making it a genuinely hard test for an AI vulnerability scanner. Hundreds of millions of users rely on it daily, and browser vulnerabilities carry particular danger because users routinely encounter untrusted content and depend on the browser to keep them safe.
The choice proved the point. Despite decades of extensive fuzzing, static analysis, and regular security reviews, Claude still surfaced vulnerabilities that had not been caught. Mozilla’s own engineers described this as analogous to the early days of fuzzing, suggesting a substantial backlog of now-discoverable bugs likely exists across widely deployed software.
How Claude’s Approach Differs From Traditional Security Scanning
Standard fuzzing tools work by feeding software large volumes of unexpected inputs to trigger crashes. They are powerful but limited to the patterns they can generate. Claude operated differently: it reasoned through Firefox’s code structure, traced data flows, and identified logic errors that fuzzers had not previously uncovered.
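To make the contrast concrete, here is a minimal sketch of the fuzzing idea described above: generate many unexpected inputs, run them through a target, and record which ones crash it. Everything in it is an illustrative assumption. `parse_record` is a hypothetical toy parser with a deliberate bug, and `fuzz` is a deliberately naive random fuzzer, not any tool Mozilla or Anthropic uses.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser (hypothetical): first byte is a length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")  # expected, handled rejection
    length = data[0]
    payload = data[1:]
    # Deliberate bug: trusts the declared length without checking the payload size.
    return bytes(payload[i] for i in range(length))


def fuzz(target, iterations: int = 1000, seed: int = 0) -> list[bytes]:
    """Feed random byte strings to `target` and collect inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # the parser rejected the input cleanly
        except Exception:
            crashes.append(data)  # unexpected failure: a "crashing input"
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs found")
```

A fuzzer like this only finds what its input generator happens to produce. It has no model of why `parse_record` fails, which is the gap the article says code-level reasoning helps close.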
Anthropic’s team also implemented what they call a “task verifier,” a trusted tool that gives the AI agent real-time feedback as it explores a codebase. The verifier allows Claude to check its own work continuously, iterating until each finding is confirmed before submission. Mozilla independently patched the task verifiers Anthropic used during the discovery process.
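The verify-then-submit loop can be sketched in a few lines. This is an illustration under stated assumptions, not Anthropic's actual verifier: `verify_crash`, `first_confirmed`, and the toy target `decode_header` are hypothetical names, and a fixed list of proposals stands in for the model's candidate findings.

```python
from typing import Callable, Iterable, Optional


def verify_crash(target: Callable[[bytes], object], candidate: bytes) -> bool:
    """Hypothetical verifier: confirm the candidate input really makes the target fail."""
    try:
        target(candidate)
    except Exception:
        return True
    return False


def first_confirmed(
    target: Callable[[bytes], object],
    candidates: Iterable[bytes],
    max_attempts: int = 10,
) -> Optional[bytes]:
    """Check candidates in order and return the first one the verifier confirms."""
    for attempt, candidate in enumerate(candidates):
        if attempt >= max_attempts:
            break
        if verify_crash(target, candidate):
            return candidate  # only confirmed findings move on to a report
    return None


def decode_header(data: bytes) -> int:
    """Toy target (hypothetical): fails on inputs shorter than two bytes."""
    return data[0] + data[1]


proposals = [b"ab", b"abc", b"a"]  # stand-ins for model-proposed inputs
confirmed = first_confirmed(decode_header, proposals)
print(confirmed)
```

The design point is that the feedback comes from a tool the agent cannot argue with: a proposal either reproduces the failure or it is discarded before anyone has to triage it.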
When submitting reports, Anthropic included three elements that Mozilla’s team identified as critical to trusting the results:
- Minimal test cases that allowed engineers to quickly verify and reproduce each issue
- Detailed proofs-of-concept
- Candidate patches written by Claude and validated by human researchers
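One way a team might enforce those three elements before filing is a small completeness check. The sketch below is an assumption for illustration only: the `VulnerabilityReport` class and its field names are hypothetical and do not reflect Bugzilla's schema or Anthropic's tooling.

```python
from dataclasses import dataclass


@dataclass
class VulnerabilityReport:
    """Hypothetical report record holding the three elements listed above."""
    title: str
    minimal_test_case: str
    proof_of_concept: str
    candidate_patch: str

    def missing_elements(self) -> list[str]:
        """Return the names of required elements that are empty or whitespace."""
        required = {
            "minimal_test_case": self.minimal_test_case,
            "proof_of_concept": self.proof_of_concept,
            "candidate_patch": self.candidate_patch,
        }
        return [name for name, value in required.items() if not value.strip()]


draft = VulnerabilityReport(
    title="Example crash in a toy parser",
    minimal_test_case="input: 0x05 0x01",
    proof_of_concept="Steps to reproduce the crash with the input above.",
    candidate_patch="",  # not written yet
)
print(draft.missing_elements())
```

A check like this does not judge the quality of a report, only that nothing the triage team relies on has been left out.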
The Cost Equation That Changes the Security Landscape
Anthropic spent approximately $4,000 in API credits running exploit-development tests. Claude was given access to the vulnerabilities already submitted to Mozilla and asked to turn each one into a working exploit. Despite running this test several hundred times with different starting points, Claude successfully converted a vulnerability into an actual exploit in only two cases.
This result carries two direct implications. First, Claude is significantly better at finding vulnerabilities than at exploiting them. Second, the cost of identifying a vulnerability is an order of magnitude lower than the cost of creating a functional exploit from it. For now, that gap strongly favors defenders.
What the “Crude” Exploit Label Actually Means
The two successful exploit cases require important context. The exploits Claude wrote only functioned in a test environment where Firefox’s sandbox was intentionally disabled. The sandbox is a core Firefox security feature designed specifically to reduce the impact of these types of vulnerabilities. Firefox’s defense-in-depth architecture would have been effective against these particular exploits under real-world conditions.
Anthropic acknowledged this caveat directly, noting that vulnerabilities that escape the sandbox are not unheard of and that Claude’s attack represents one necessary component of an end-to-end exploit chain. The Frontier Red Team blog contains a separate technical write-up on how one of these exploits was developed.
Firefox 148 and the Collaboration That Made It Happen
The Mozilla partnership worked because both sides adjusted their processes. Mozilla’s engineers reached out to Anthropic during the early phase of the project after receiving the first few validated reports. They encouraged Anthropic to submit all findings in bulk without individually validating each crashing input, allowing the collaboration to scale quickly.
Within hours of receiving reports, Mozilla’s platform engineers began landing fixes. All 22 CVEs and the majority of the 90 additional bugs were patched in Firefox 148.0, with remaining issues scheduled for upcoming releases. Mozilla’s engineers Brian Grinstead and Christian Holler confirmed the collaboration in the official Mozilla blog post and noted that their team has already begun integrating AI-assisted analysis into internal security workflows.
| Metric | Verified Result |
|---|---|
| Time to first vulnerability | 20 minutes |
| C++ files scanned | Nearly 6,000 |
| Total reports submitted to Mozilla | 112 |
| Confirmed CVEs issued | 22 |
| High-severity CVEs | 14 |
| Share of Firefox’s 2025 high-severity fixes | Nearly 1 in 5 |
| Additional non-security bugs found | 90 |
| API cost for exploit testing | ~$4,000 |
| Successful exploit conversions | 2 out of several hundred attempts |
| Firefox version patched | 148.0 |
Claude Code Security: Now Available to Developers
Anthropic has released Claude Code Security as a limited research preview, bringing vulnerability-discovery and patching capabilities directly to customers and open-source maintainers. The Firefox collaboration reflects a broader pattern: before this project, Claude Opus 4.6 had already been used to find vulnerabilities in other important software projects, including the Linux kernel.
Anthropic has also published its Coordinated Vulnerability Disclosure operating principles, describing the procedures it will follow when working with software maintainers. The company has committed to significantly expanding its cybersecurity efforts, including working directly with developers to search for vulnerabilities, developing tools to help maintainers triage bug reports, and proposing patches.
What You Should Do Right Now
If you have not updated Firefox since February 2026, you are running a browser that still contains the vulnerabilities Claude discovered. Update to Firefox 148 or later immediately through Help > About Firefox in the browser menu or through your system’s package manager. All 22 CVEs from this collaboration are confirmed patched in that release.
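If you manage several machines, a short script can flag installs older than the patched release. The sketch below assumes the version string has the form `Mozilla Firefox 148.0`, which is what `firefox --version` typically prints on Linux. The helper names are hypothetical, and the check ignores Extended Support Release builds, which use different version numbers.

```python
import re
from typing import Optional

PATCHED_MAJOR = 148  # release named in this article


def firefox_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from a string such as 'Mozilla Firefox 148.0'."""
    match = re.search(r"Firefox\s+(\d+)", version_output)
    return int(match.group(1)) if match else None


def is_patched(version_output: str) -> bool:
    """True if the reported major version is at least the patched release."""
    major = firefox_major_version(version_output)
    return major is not None and major >= PATCHED_MAJOR


for sample in ("Mozilla Firefox 147.0.1", "Mozilla Firefox 148.0"):
    status = "patched" if is_patched(sample) else "update needed"
    print(f"{sample}: {status}")
```

An unparseable string is treated as not patched, so a machine that reports nothing useful gets flagged for a manual check rather than silently passed.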
For developers and open-source maintainers: Anthropic’s published guidance recommends including minimal test cases, detailed proofs-of-concept, and candidate patches when submitting AI-assisted vulnerability reports. The same best practices apply whether you are using Claude or any other LLM-powered security tool.
Limitations and Considerations
Claude’s exploit-development success rate was extremely low: two conversions out of several hundred test attempts, and only in a sandbox-disabled environment. Firefox’s real-world defense-in-depth architecture would have blocked these specific attacks. The model is currently a far stronger vulnerability finder than an exploit developer, which means defenders retain the advantage today. Anthropic has stated this gap is unlikely to last indefinitely as models continue to improve.
Frequently Asked Questions (FAQs)
What did Claude AI find in Firefox?
Claude Opus 4.6 discovered 22 confirmed vulnerabilities in Firefox over two weeks, each of which received a CVE. Mozilla classified 14 of them as high-severity. The model also found 90 additional non-security bugs. All 22 CVEs and most of the additional bugs are fixed in Firefox 148, with the remaining issues scheduled for upcoming releases.
Is Firefox 148 safe to use after the Anthropic discovery?
Yes. Mozilla confirmed all 22 CVEs from this collaboration are fixed in Firefox 148.0. The fixes were shipped before any active exploitation was detected. Mozilla also confirmed that Firefox’s sandbox and defense-in-depth architecture would have blocked the crude exploits Claude developed during testing.
How does Claude find security vulnerabilities in software?
Claude reasons through code structure and traces data flows rather than simply matching patterns against known vulnerability databases. It uses task verifiers, tools that provide real-time feedback as the agent explores a codebase, allowing it to check and iterate on its own findings before submitting a report.
What is Claude Code Security and who can access it?
Claude Code Security is a limited research preview released by Anthropic that brings vulnerability-discovery and patching capabilities to customers and open-source maintainers. It is available through Claude Code on the web. The primary sources for this article do not specify which plans or tiers have access.
Can Claude AI create working browser exploits?
Only in very limited conditions. Claude successfully turned a discovered vulnerability into a working exploit in two out of several hundred test attempts. Those exploits only functioned with Firefox’s sandbox disabled, a configuration that does not reflect real-world browser use. Firefox’s built-in defenses would have stopped these specific attacks.
Why did Anthropic specifically choose Firefox for this test?
Firefox was selected because it is one of the most well-tested and security-hardened open-source projects in the world, making it a significantly harder test than typical open-source software. Hundreds of millions of users rely on it daily, and its codebase has undergone decades of fuzzing, static analysis, and manual security review.
What should developers learn from this collaboration?
Anthropic recommends that teams using AI-powered security tools include minimal reproducible test cases, detailed proofs-of-concept, and candidate patches when filing bug reports. This approach allowed Mozilla to quickly verify and act on 112 submitted reports. Anthropic has published its Coordinated Vulnerability Disclosure principles to guide future collaborations.
What happens next with Anthropic’s security research?
Anthropic plans to significantly expand its cybersecurity efforts, including continued vulnerability searches across open-source software, tools to help maintainers triage reports faster, and direct patch proposals. The company has also warned that the current gap between AI’s vulnerability-finding and exploit-development abilities is unlikely to last as models continue to improve.

