Quick Brief
- Codex Security launched March 6, 2026 in research preview for ChatGPT Enterprise, Business, and Edu customers with free usage for the first month
- False positive rates on detections fell by more than 50% and over-reported severity findings dropped by more than 90% during beta testing
- The agent scanned over 1.2 million commits in the last 30 days, identifying 792 critical findings and 10,561 high-severity issues
- Codex Security discovered and helped report 14 CVEs across major open-source projects including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium
OpenAI released Codex Security on March 6, 2026, targeting one of the most persistent pain points in software development: security tools that generate more noise than signal. The agent combines agentic reasoning from OpenAI’s frontier models with automated validation to surface high-confidence vulnerabilities and propose fixes directly, reducing the hours security teams spend triaging false positives. What separates it from existing tools is its project-level threat model, which it builds before searching for a single vulnerability.
How Codex Security Works Differently From Standard Scanners
Most AI security tools operate without understanding what a system actually does, who it trusts, or where it is most exposed. Codex Security builds a project-specific threat model first, analyzing repository structure to understand security-relevant architecture before beginning any vulnerability search.
This context-first approach changes both what the agent finds and how it reports findings. When Codex Security identifies a potential issue, it pressure-tests that finding in a sandboxed validation environment to confirm real-world impact before surfacing it to the developer. The result is fewer speculative alerts and more actionable findings with clear remediation paths.
The agent operates in three distinct phases:
- Build system context: Analyzes the repository and generates an editable threat model capturing what the system does, what it trusts, and where it is most exposed. Threat models can be edited to keep the agent aligned with the team.
- Prioritize and validate issues: Uses the threat model as context, searches for vulnerabilities, and categorizes findings based on expected real-world impact. Where possible, it pressure-tests findings in sandboxed environments to distinguish signal from noise.
- Patch with full context: Proposes fixes aligned with system intent and surrounding behavior, so patches improve security while minimizing regressions and are safer to review and land.
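The first two phases above can be sketched as a small pipeline. This is an illustrative model only, not OpenAI’s implementation or API: the names `ThreatModel`, `Finding`, `prioritize`, and `validate`, the two-level severity scale, and the toy `reproduce` check standing in for a sandbox run are all assumptions made for the example. Patch generation (the third phase) is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from Codex Security.

@dataclass
class ThreatModel:
    """Phase 1 output: an editable summary of the system's security context."""
    purpose: str
    trusted_inputs: List[str] = field(default_factory=list)
    exposed_surfaces: List[str] = field(default_factory=list)

@dataclass
class Finding:
    title: str
    surface: str            # component where the candidate issue was found
    severity: str = "low"   # assigned during prioritization
    validated: bool = False

def prioritize(model: ThreatModel, findings: List[Finding]) -> List[Finding]:
    """Phase 2a: rate findings on exposed surfaces as high and list them first."""
    for f in findings:
        f.severity = "high" if f.surface in model.exposed_surfaces else "low"
    # False sorts before True, so "high" findings come first; the sort is stable.
    return sorted(findings, key=lambda f: f.severity != "high")

def validate(findings: List[Finding],
             reproduce: Callable[[Finding], bool]) -> List[Finding]:
    """Phase 2b: keep only findings that the (assumed) sandbox check reproduces."""
    for f in findings:
        f.validated = reproduce(f)
    return [f for f in findings if f.validated]

# Usage with made-up data.
model = ThreatModel(
    purpose="multi-tenant API",
    trusted_inputs=["internal queue"],
    exposed_surfaces=["public_api"],
)
candidates = [
    Finding("verbose log message", "batch_job"),
    Finding("SSRF in URL fetcher", "public_api"),
]
ranked = prioritize(model, candidates)
# Toy stand-in for a sandbox run: pretend only high-severity issues reproduce.
confirmed = validate(ranked, reproduce=lambda f: f.severity == "high")
```

Because the threat model is plain data in this sketch, editing `exposed_surfaces` changes how later findings are ranked, which mirrors the idea of keeping the model editable by the team.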
When Codex Security is configured with a project-specific environment, it can validate potential issues directly in the context of the running system. That makes working proofs of concept possible, giving security teams stronger evidence and a clearer path to remediation.
What the Beta Results Actually Showed
OpenAI did not launch Codex Security with theoretical benchmarks alone. The tool was formerly called Aardvark and ran as a private beta before the broader rollout on March 6, 2026.
During early internal deployments, Codex Security surfaced a real Server-Side Request Forgery (SSRF) vulnerability and a critical cross-tenant authentication bypass, both patched within hours by OpenAI’s own security team. Scans on the same repositories over time showed increasing precision, with noise cut by 84% in one case since initial rollout.
The scale of the beta underscores the agent’s production readiness. Over 1.2 million commits were scanned across external repositories in the past 30 days, with critical issues appearing in under 0.1% of scanned commits. That ratio suggests the agent reserves its critical rating for a small fraction of the code it processes at volume. This matters because alert fatigue largely determines whether security teams adopt a tool or ignore it.
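The "under 0.1%" figure can be checked against the counts given in the Quick Brief. The short calculation below is a rough sketch: it assumes each finding corresponds to a distinct commit, and it treats "over 1.2 million" as exactly 1.2 million, so the resulting rates are upper bounds.

```python
# Figures reported in the article; the one-finding-per-commit mapping is an assumption.
commits_scanned = 1_200_000   # "over 1.2 million", so the rates below are upper bounds
critical_findings = 792
high_findings = 10_561

critical_rate = critical_findings / commits_scanned
high_rate = high_findings / commits_scanned

print(f"critical: {critical_rate:.3%}")   # critical: 0.066%
print(f"high:     {high_rate:.3%}")       # high:     0.880%
```

Both values measure how often findings of a given severity appear per scanned commit. They are separate from the false positive and severity over-reporting reductions reported for the beta.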
NETGEAR’s Head of Product Security, Chandan Nandakumaraiah, who is also a member of the CVE Board, described the findings as “impressively clear and comprehensive, often giving the sense that an experienced product security researcher was working alongside us.”
Open-Source Security: A Structural Commitment
OpenAI is using Codex Security to scan the open-source repositories it relies on most and sharing high-impact findings directly with maintainers. In conversations with maintainers during the beta, a consistent theme emerged: the challenge is not a lack of vulnerability reports but too many low-quality ones. This feedback directly shaped how OpenAI built Codex Security’s prioritization model.
The agent has already been credited with discovering 14 CVEs across widely used projects. The full list of CVEs disclosed in the official appendix includes:
- GnuTLS certtool Heap-Buffer Overflow (Off-by-One) – CVE-2025-32990
- GnuTLS Heap Buffer Overread in SCT Extension Parsing – CVE-2025-32989
- GnuTLS Double-Free in otherName SAN Export – CVE-2025-32988
- 2FA Bypass in GOGS – CVE-2025-64175
- Unauthenticated Bypass in GOGS – CVE-2026-25242
- Path Traversal (arbitrary write) in Thorium – CVE-2025-35430
- LDAP Injection – CVE-2025-35431
- Unauthenticated DoS and mail abuse – CVE-2025-35432 and CVE-2025-35436
- Session not rotated on password change – CVE-2025-35433
- Disabled TLS verification (Elasticsearch client) – CVE-2025-35434
- DoS: division by zero – CVE-2025-35435
- gpg-agent stack buffer overflow via PKDECRYPT – CVE-2026-24881
- Stack-based buffer overflow in TPM2 PKDECRYPT – CVE-2026-24882
- CMS/PKCS7 AES-GCM ASN.1 params stack buffer overflow – CVE-2025-15467
- PKCS#12 PBMAC1 PBKDF2 keyLength overflow and MAC bypass – CVE-2025-11187
OpenAI launched Codex for OSS alongside this rollout, offering free ChatGPT Pro and Plus accounts, code review support, and Codex Security access to open-source maintainers. Projects like vLLM have already integrated Codex Security into their normal workflow to find and patch issues. OpenAI plans to expand the program in the coming weeks and invites interested maintainers to apply directly.
Who Gets Access and How to Start
Codex Security is rolling out now to ChatGPT Enterprise, Business, and Edu customers via the Codex web interface, with free usage for the first month. Teams configure a repository scan, allow Codex Security to analyze and generate the initial threat model, then review and edit the model to align with their specific architecture.
The agent also learns from feedback over time. When a team adjusts the criticality of a finding, Codex Security uses that input to refine the threat model and improve precision on subsequent runs as it learns what matters in that specific architecture and risk posture.
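One way to picture this feedback loop is a store of team overrides that is consulted on later scans. The sketch below is purely hypothetical: `FeedbackStore`, its methods, the `(category, component)` key, and the four-level severity scale are assumptions for illustration, and the article does not describe how Codex Security represents feedback internally.

```python
from typing import Dict, Tuple

# Assumed severity scale; the article does not specify the actual levels.
SEVERITIES = ("low", "medium", "high", "critical")

class FeedbackStore:
    """Hypothetical record of team severity adjustments, keyed by (category, component)."""

    def __init__(self) -> None:
        self._overrides: Dict[Tuple[str, str], str] = {}

    def record(self, category: str, component: str, team_severity: str) -> None:
        """Remember that the team re-rated this kind of finding on this component."""
        if team_severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {team_severity}")
        self._overrides[(category, component)] = team_severity

    def apply(self, category: str, component: str, proposed: str) -> str:
        """Return the team's earlier rating if one exists, else the proposed rating."""
        return self._overrides.get((category, component), proposed)

# Usage with made-up data: the team downgrades SSRF on an internal-only admin tool.
store = FeedbackStore()
store.record("ssrf", "internal_admin", "low")

rerated = store.apply("ssrf", "internal_admin", proposed="high")
untouched = store.apply("ssrf", "public_api", proposed="high")
```

In this sketch the override affects only the matching component, so the same vulnerability class on a public-facing surface keeps its proposed rating.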
For teams that configure project-specific environments, the agent can validate vulnerabilities directly in the context of the running system, enabling working proof-of-concept exploits that give security teams stronger evidence and a faster path to remediation.
Limitations to Consider
Codex Security is in research preview, meaning threat model quality at initial onboarding depends on how well the repository is structured and how much context teams provide. The agent’s deeper validation capabilities require project-specific environment configuration to unlock, adding setup overhead for smaller teams without dedicated security staff. Open-source maintainer access through Codex for OSS is currently limited to an initial onboarding cohort, with expansion planned in the coming weeks.
Frequently Asked Questions (FAQs)
What is OpenAI Codex Security?
Codex Security is OpenAI’s AI-powered application security agent. It builds a project-specific threat model, searches for vulnerabilities using agentic reasoning from OpenAI’s frontier models, validates findings in sandboxed environments, and proposes fixes aligned with system architecture and intent.
Who can access Codex Security right now?
As of March 6, 2026, Codex Security is rolling out in research preview to ChatGPT Enterprise, Business, and Edu customers via the Codex web interface. Usage is free for the first month. Open-source maintainers can apply separately through the Codex for OSS program.
How does Codex Security reduce false positives?
During beta testing, false positive rates on detections fell by more than 50% across all repositories, and over-reported severity findings dropped by more than 90%. The agent achieves this by pressure-testing potential findings in sandboxed validation environments before surfacing them rather than flagging every pattern match.
What open-source vulnerabilities has Codex Security discovered?
Codex Security helped discover and report 14 CVEs across widely used projects including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium. Examples include three GnuTLS memory-safety vulnerabilities (a heap buffer overflow, a heap buffer overread, and a double-free), two GOGS authentication bypasses, and two gpg-agent stack buffer overflows.
Can Codex Security improve its accuracy over time?
Yes. When a team adjusts the criticality of a finding, Codex Security uses that feedback to refine its threat model and improve precision on subsequent scans. It learns what matters in a specific architecture and risk posture, compounding accuracy across repeated runs.
Can Codex Security write its own patches?
Yes. After identifying and validating a vulnerability, the agent proposes code fixes that account for system intent and surrounding behavior. This reduces regression risk and makes patches safer to review and merge. Teams with project-specific sandbox environments unlock deeper validation and stronger fix quality.
What was Codex Security called before its public launch?
Codex Security was formerly known as Aardvark. It began as a private beta with a small group of customers before its public research preview launch on March 6, 2026.
How does Codex for OSS support open-source maintainers?
Codex for OSS provides open-source maintainers with free ChatGPT Pro and Plus accounts, code review support, and access to Codex Security. OpenAI is currently onboarding an initial cohort and plans to expand the program in the coming weeks. Maintainers can apply via the official form at openai.com/form/codex-for-oss.

