OpenAI Codex Security Launch: AI Agent That Catches Missed Vulnerabilities

Quick Brief

Codex Security launched on March 6, 2026, in research preview for ChatGPT Enterprise, Business, and Edu customers, with free usage for the first month
False positive rates on detections fell by more than 50% and over-reported severity findings dropped by more than 90% during beta testing
The agent scanned over 1.2 million commits in the last 30 days, identifying 792 critical findings and 10,561 high-severity issues
Codex Security discovered and helped report 14 CVEs across major open-source projects including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium

OpenAI released Codex Security on March 6, 2026, targeting one of the most persistent pain points in software development: security tools that generate more noise than signal. The agent combines agentic reasoning from OpenAI’s frontier models with automated validation to surface high-confidence vulnerabilities and propose fixes, reducing the hours security teams spend triaging false positives. What separates it from existing tools is its project-level threat model, which it builds before searching for a single vulnerability.

How Codex Security Works Differently From Standard Scanners

Most AI security tools operate without understanding what a system actually does, who it trusts, or where it is most exposed. Codex Security builds a project-specific threat model first, analyzing repository structure to understand security-relevant architecture before beginning any vulnerability search.

This context-first approach changes both what the agent finds and how it reports findings. When Codex Security identifies a potential issue, it pressure-tests that finding in a sandboxed validation environment to confirm real-world impact before surfacing it to the developer. The result is fewer speculative alerts and more actionable findings with clear remediation paths.

The agent operates in three distinct phases:

Build system context: Analyzes the repository and generates a threat model capturing what the system does, what it trusts, and where it is most exposed. Teams can edit the threat model to keep the agent aligned with their own understanding of the system.
Prioritize and validate issues: Uses the threat model as context to search for vulnerabilities and categorizes findings by expected real-world impact. Where possible, it pressure-tests findings in sandboxed environments to distinguish signal from noise.
Patch with full context: Proposes fixes aligned with system intent and surrounding behavior, so patches improve security while minimizing regressions and are safer to review and land.
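The three phases above can be pictured as a small data flow. The following is a minimal sketch only, not OpenAI's implementation or API: the names ThreatModel, Finding, Severity, and prioritize, and all of their fields, are hypothetical and chosen for illustration. It assumes that findings confirmed in a sandbox rank ahead of unconfirmed ones, then findings on surfaces the threat model marks as exposed, then higher severity.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class ThreatModel:
    # Phase 1 (hypothetical shape): an editable description of the system.
    purpose: str
    trusted_inputs: List[str] = field(default_factory=list)
    exposed_surfaces: List[str] = field(default_factory=list)


@dataclass
class Finding:
    title: str
    surface: str                          # part of the system the issue affects
    severity: Severity
    validated: bool = False               # True once reproduced in a sandbox
    proposed_patch: Optional[str] = None  # Phase 3 would attach a fix here


def prioritize(findings: List[Finding], model: ThreatModel) -> List[Finding]:
    """Phase 2 (illustrative): rank by validation, then exposure, then severity."""
    return sorted(
        findings,
        key=lambda f: (f.validated, f.surface in model.exposed_surfaces, f.severity),
        reverse=True,
    )


model = ThreatModel(
    purpose="multi-tenant web API",
    trusted_inputs=["internal job queue"],
    exposed_surfaces=["public HTTP handlers"],
)
findings = [
    Finding("Verbose error message", "admin CLI", Severity.LOW, validated=True),
    Finding("SSRF in URL fetcher", "public HTTP handlers", Severity.HIGH, validated=True),
    Finding("Possible race in cache", "internal job queue", Severity.CRITICAL),
]
ranked = prioritize(findings, model)
for f in ranked:
    print(f.severity.name, f.title, "(validated)" if f.validated else "(unvalidated)")
```

In this toy ordering the unvalidated critical finding ranks last, which mirrors the article's point that confirmed impact, not raw severity labels, is what drives prioritization.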

When Codex Security is configured with a project-specific environment, it can validate potential issues directly in the context of the running system, enabling working proof-of-concepts and giving security teams stronger evidence with a clearer path to remediation.

What the Beta Results Actually Showed

OpenAI did not launch Codex Security with theoretical benchmarks alone. The tool was formerly called Aardvark and ran as a private beta before the broader rollout on March 6, 2026.

During early internal deployments, Codex Security surfaced a real Server-Side Request Forgery (SSRF) vulnerability and a critical cross-tenant authentication bypass, both patched within hours by OpenAI’s own security team. Scans on the same repositories over time showed increasing precision, with noise cut by 84% in one case since initial rollout.

The scale of the beta indicates the agent can run against production-sized workloads. Over 1.2 million commits were scanned across external repositories in the past 30 days, with critical issues appearing in under 0.1% of scanned commits. That ratio suggests the system can process large volumes of code while reserving its highest severity rating for a small fraction of commits, which matters because alert volume largely determines whether security teams adopt a tool or ignore it.
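The "under 0.1%" figure can be checked against the counts in the Quick Brief. The short calculation below is a back-of-the-envelope sketch that assumes exactly 1.2 million commits and at most one finding per commit; because the article says "over 1.2 million", the resulting percentages are upper bounds.

```python
# Figures quoted in the article; the commit count is a lower bound ("over 1.2 million").
commits_scanned = 1_200_000
critical_findings = 792
high_findings = 10_561

# Share of commits with a finding, assuming at most one finding per commit.
critical_pct = 100 * critical_findings / commits_scanned
high_pct = 100 * high_findings / commits_scanned

print(f"critical: {critical_pct:.3f}% of commits")  # critical: 0.066% of commits
print(f"high:     {high_pct:.3f}% of commits")      # high:     0.880% of commits
```

So critical findings work out to roughly 0.066% of commits, consistent with the stated threshold, while high-severity findings sit just under 1%.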

NETGEAR’s Head of Product Security, Chandan Nandakumaraiah, who is also a member of the CVE Board, described the findings as “impressively clear and comprehensive, often giving the sense that an experienced product security researcher was working alongside us.”

Open-Source Security: A Structural Commitment

OpenAI is using Codex Security to scan the open-source repositories it relies on most and sharing high-impact findings directly with maintainers. In conversations with maintainers during the beta, a consistent theme emerged: the challenge is not a lack of vulnerability reports but too many low-quality ones. This feedback directly shaped how OpenAI built Codex Security’s prioritization model.

The agent has already been credited with discovering 14 CVEs across widely used projects. The CVEs disclosed in the official appendix are listed below:

GnuTLS certtool Heap-Buffer Overflow (Off-by-One) – CVE-2025-32990
GnuTLS Heap Buffer Overread in SCT Extension Parsing – CVE-2025-32989
GnuTLS Double-Free in otherName SAN Export – CVE-2025-32988
2FA Bypass in GOGS – CVE-2025-64175
Unauthenticated Bypass in GOGS – CVE-2026-25242
Path Traversal (arbitrary write) in Thorium – CVE-2025-35430
LDAP Injection – CVE-2025-35431
Unauthenticated DoS and mail abuse – CVE-2025-35432 and CVE-2025-35436
Session not rotated on password change – CVE-2025-35433
Disabled TLS verification (Elasticsearch client) – CVE-2025-35434
DoS: division by zero – CVE-2025-35435
gpg-agent stack buffer overflow via PKDECRYPT – CVE-2026-24881
Stack-based buffer overflow in TPM2 PKDECRYPT – CVE-2026-24882
CMS/PKCS7 AES-GCM ASN.1 params stack buffer overflow – CVE-2025-15467
PKCS#12 PBMAC1 PBKDF2 keyLength overflow and MAC bypass – CVE-2025-11187

OpenAI launched Codex for OSS alongside this rollout, offering free ChatGPT Pro and Plus accounts, code review support, and Codex Security access to open-source maintainers. Projects like vLLM have already integrated Codex Security into their normal workflow to find and patch issues. OpenAI plans to expand the program in the coming weeks and invites interested maintainers to apply directly.

Who Gets Access and How to Start

Codex Security is rolling out now to ChatGPT Enterprise, Business, and Edu customers via the Codex web interface, with free usage for the first month. Teams configure a repository scan, allow Codex Security to analyze and generate the initial threat model, then review and edit the model to align with their specific architecture.

The agent also learns from feedback over time. When a team adjusts the criticality of a finding, Codex Security uses that input to refine the threat model and improve precision on subsequent runs as it learns what matters in that specific architecture and risk posture.
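The article does not describe how this feedback is applied internally. Purely as an illustration of the idea, the sketch below records how a team re-rated findings and shifts future ratings in the same category by the average adjustment. FeedbackLog, its methods, the category labels, and the 1 to 4 severity scale are all hypothetical and not part of any OpenAI interface.

```python
from collections import defaultdict


class FeedbackLog:
    """Toy model: remember severity adjustments per finding category."""

    MIN_SEVERITY, MAX_SEVERITY = 1, 4  # hypothetical scale: 1 = low, 4 = critical

    def __init__(self):
        self._deltas = defaultdict(list)

    def record(self, category, reported, adjusted):
        # Store how far the team moved the rating (negative = downgraded).
        self._deltas[category].append(adjusted - reported)

    def calibrate(self, category, reported):
        # Apply the mean past adjustment for this category, clamped to the scale.
        deltas = self._deltas.get(category)
        if not deltas:
            return reported
        shift = round(sum(deltas) / len(deltas))
        return max(self.MIN_SEVERITY, min(self.MAX_SEVERITY, reported + shift))


log = FeedbackLog()
log.record("verbose-error", reported=3, adjusted=1)
log.record("verbose-error", reported=3, adjusted=1)

print(log.calibrate("verbose-error", 3))  # 1: the team consistently downgraded this category
print(log.calibrate("auth-bypass", 4))    # 4: no feedback yet, rating unchanged
```

A real system would presumably feed such signals back into the threat model rather than a lookup table; the point here is only that repeated adjustments can compound into different rankings on later runs.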

For teams that configure project-specific environments, the agent can validate vulnerabilities directly in the context of the running system, enabling working proof-of-concept exploits that give security teams stronger evidence and a faster path to remediation.

Limitations to Consider

Codex Security is still in research preview. Threat model quality at initial onboarding depends on how well the repository is structured and how much context teams provide. The agent’s deeper validation capabilities require a project-specific environment, which adds setup overhead, particularly for smaller teams without dedicated security staff. Open-source maintainer access through Codex for OSS is currently limited to an initial onboarding cohort, with expansion planned in the coming weeks.

Frequently Asked Questions (FAQs)

What is OpenAI Codex Security?

Codex Security is OpenAI’s AI-powered application security agent. It builds a project-specific threat model, searches for vulnerabilities using agentic reasoning from OpenAI’s frontier models, validates findings in sandboxed environments, and proposes fixes aligned with system architecture and intent.

Who can access Codex Security right now?

As of March 6, 2026, Codex Security is rolling out in research preview to ChatGPT Enterprise, Business, and Edu customers via the Codex web interface. Usage is free for the first month. Open-source maintainers can apply separately through the Codex for OSS program.

How does Codex Security reduce false positives?

During beta testing, false positive rates on detections fell by more than 50% across all repositories, and over-reported severity findings dropped by more than 90%. The agent achieves this by pressure-testing potential findings in sandboxed validation environments before surfacing them rather than flagging every pattern match.

What open-source vulnerabilities has Codex Security discovered?

Codex Security helped discover and report 14 CVEs across widely used projects including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium. Examples include three GnuTLS heap and double-free vulnerabilities, two GOGS authentication bypasses, and two gpg-agent stack buffer overflows.

Can Codex Security improve its accuracy over time?

Yes. When a team adjusts the criticality of a finding, Codex Security uses that feedback to refine its threat model and improve precision on subsequent scans. It learns what matters in a specific architecture and risk posture, compounding accuracy across repeated runs.

Can Codex Security write its own patches?

Yes. After identifying and validating a vulnerability, the agent proposes code fixes that account for system intent and surrounding behavior. This reduces regression risk and makes patches safer to review and merge. Teams with project-specific sandbox environments unlock deeper validation and stronger fix quality.

What was Codex Security called before its public launch?

Codex Security was formerly known as Aardvark. It began as a private beta with a small group of customers before its public research preview launch on March 6, 2026.

How does Codex for OSS support open-source maintainers?

Codex for OSS provides open-source maintainers with free ChatGPT Pro and Plus accounts, code review support, and access to Codex Security. OpenAI is currently onboarding an initial cohort and plans to expand the program in the coming weeks. Maintainers can apply via the official form at openai.com/form/codex-for-oss.
