Codex 5.3 Released: 77.3% Terminal-Bench, 56.8% SWE-Bench Pro

OpenAI Launches Most Capable Coding Model

On February 5, 2026, OpenAI released GPT-5.3-Codex, describing it as "the most capable agentic coding model to date." The model advances both frontier coding performance and general reasoning capabilities while being 25% faster than its predecessor.

Benchmark Performance

Terminal-Bench 2.0: 77.3% - Leading all models in terminal-driven tasks SWE-Bench Pro (Public): 56.8% accuracy across four programming languages OSWorld-Verified: 64.7% - Strong computer-use capabilities Speed: 25% faster than GPT-5.2-Codex with improved token efficiency

Technical Innovations

Self-Bootstrapping Development

Remarkably, GPT-5.3-Codex was instrumental in creating itself. The Codex team used early versions to:

Debug its own training process

Manage deployment infrastructure

Diagnose and fix test results

Optimize inference performance

Enhanced Capabilities

Agentic Coding: Autonomous multi-step task execution with minimal human intervention Terminal Mastery: Native-level command line proficiency surpassing previous models Multi-Language Support: Production-grade code generation in Python, JavaScript, TypeScript, Java, C++, Go, and Rust Token Efficiency: Uses fewer output tokens while maintaining quality - reducing API costs

Security & Safety

GPT-5.3-Codex is the first OpenAI model treated as "High" under the Preparedness Framework, particularly for Cybersecurity capabilities. Enhanced safeguards prevent malicious code generation while preserving legitimate security research functionality.

Availability & Pricing

ChatGPT Users: Available now with ChatGPT Plus, Team, and Enterprise plans API Access: $10/$30 per million tokens (input/output) Platform Integration: ChatGPT app, CLI, IDE extensions, and web interface Cloud Providers: AWS Bedrock and Azure OpenAI Service (Q1 2026)

Performance Comparison

Model

Terminal-Bench

SWE-Bench Pro

Speed

Price (Input)

Codex 5.3

77.3%

56.8%

1.8s

$10/M

Claude Opus 4.6

68.4%

54.2%

3.2s

$15/M

Gemini 3 Pro

64.1%

48.3%

2.4s

$7/M

Developer Reception

Early adopters report Codex 5.3 excels at:

Backend service development

Terminal automation and DevOps tasks

High-volume code generation

Bug fixing with rapid iteration

Some developers note Claude Code still leads in:

Deep architectural reasoning

Long-context codebase understanding

UI/UX design suggestions

Use Codex 5.3 If...

Speed is critical for your workflow

Working primarily with terminal/CLI tools

Need cost-effective high-volume generation

Building backend services and APIs

Require reliable, bug-free code on first attempt

Conclusion

GPT-5.3-Codex represents a significant leap in AI coding capability, particularly for terminal-driven and autonomous agent workflows. Its combination of performance, speed, and competitive pricing makes it a compelling choice for development teams.

The model's ability to help build itself demonstrates we're entering an era where AI systems actively participate in their own development - a paradigm shift with profound implications.