
Codex vs Claude Code

Nate Herk | AI Automation · 47 Claims

Neutral

OpenAI went from being the biggest AI company to becoming something kind of mid.

Author states this as a general perception, not backed by specific metrics.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

People who used AI to code basically forgot OpenAI existed, thanks to tools like Claude Code.

Author describes user behavior shift away from OpenAI for coding due to competition.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Over the past few weeks, many videos claim OpenAI Codex is actually better than Claude Code.

Author reports seeing this trend in content, setting up the comparison.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

I've been trying Codex for the past month, and honestly, the results have been really impressive.

Personal endorsement of Codex's performance after a month of use.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

The comparison will focus on features, price, and three specific use cases to determine which is better.

Author outlines the evaluation framework for the video.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

The real question is which tool gives a better workflow for how you want to work, not just feature checklists.

Author sets the philosophy for the comparison.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code is Anthropic's coding agent, available in terminal, VS Code extension, and a full desktop app for Mac and Windows.

Factual description of Claude Code's interfaces.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code can run Opus (currently the smartest model), Sonnet, or Haiku; Opus and Sonnet are top-tier for coding.

Factual statement about the underlying models and their capability level.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude Code is highly customizable, feeling more like a workflow system than a tool, with skills, hooks, and sub-agents.

Author expresses personal liking for its customizability and positions it as a flexible system.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code has around 30 hook events, while Codex has about six, giving Claude roughly 5x the granularity for automated behavior triggers.

Quantitative comparison based on documentation.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude Code can automatically spawn sub-agents when a task requires it, whereas Codex only spawns them if explicitly asked.

Author notes this as a powerful default behavior for complex tasks.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude Code offers /ultra plan and /ultra review (both in research preview), enabling cloud-based planning and multi-agent code reviews with inline comments.

Describes advanced features available to power users.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude Code's /loop command can run recurring prompts or maintenance mode to keep projects tidy on a schedule.

Highlights /loop as a useful automation feature.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code's channels feature allows receiving external events from Telegram, Discord, or iMessage into a running session.

Factual description of a unique capability.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

The Claude agent SDK exposes the engine as a Python and TypeScript SDK for building custom agents.

Factual statement about developer tooling.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code supports enterprise deployment through Bedrock, Vertex AI, and Microsoft Foundry; Codex lacks this level of deployment flexibility.

Factual comparison of enterprise hosting options.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Codex is OpenAI's new full agentic coding system (not the retired 2021 model), available in terminal, desktop app for Mac/Windows, VS Code extension (works with Cursor), and cloud.

Factual description of Codex availability and clarification of its identity.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Codex runs on GPT-family models, including GPT-Codex and GPT-Codex-Spark (the latter in research preview for Pro users).

Factual statement about model architecture.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex feels like an opinionated machine designed to take you all the way to code shipped to production, with built-in Git worktrees for parallel tasks.

Author highlights Codex's end-to-end shipping focus and native worktree support as a standout.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex's built-in Git worktrees make parallel task management feel native, enabling a full shipping pipeline within the desktop app.

Author praises the seamless integration of worktrees for end-to-end shipping.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)
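
For readers unfamiliar with Git worktrees, the parallel-task idea can be sketched outside either app. This is a minimal illustration, not how Codex implements it: the function name, the sibling-directory layout, and the `task/` branch prefix are all assumptions made for the example. Only `git worktree add -b <branch> <path>` and `git -C <dir>` are standard Git.

```python
from pathlib import PurePosixPath


def worktree_command(repo_dir: str, task_name: str) -> list[str]:
    """Build a `git worktree add` command for one parallel task.

    The new working directory is placed next to the main checkout and
    gets its own branch, so each task can be edited independently.
    """
    repo = PurePosixPath(repo_dir)
    # Hypothetical naming scheme: <repo>-<task> beside the main checkout.
    target = repo.parent / f"{repo.name}-{task_name}"
    # Hypothetical branch prefix; any valid branch name would work.
    branch = f"task/{task_name}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)]


if __name__ == "__main__":
    # Print one command per task; pass each list to subprocess.run to execute.
    for task in ["fix-login", "add-dashboard"]:
        print(" ".join(worktree_command("/work/myapp", task)))
```

Each printed command creates a separate directory and branch, which is what lets two agents (or two tasks) work on the same repository without overwriting each other's files.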

Agree

Codex desktop app has an in-app browser to preview shipped work, allowing visual comments directly on the page; this is cleaner than switching to Chrome (though Claude has a browser extension called Claude in Chrome).

Author finds the integrated browser slightly better for the desktop app experience.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex’s computer use feature for product QA can open the app, click around, find bugs, and log them with severity ratings, expected vs actual behavior, steps to reproduce, and a triage summary.

Describes a polished first-party QA flow not yet built out in Claude Code.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex offers a GitHub integration with @Codex mention that spins up a cloud sandbox to handle PR comments or issues with zero setup.

Highlights a seamless, low-configuration feature not found as a first-party flow in Claude Code.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex has an experimental /goal command (behind a feature flag) that grinds on a long-running objective until a verifiable stopping condition is met.

Notes that while similar functionality exists in Claude via /loop, Codex packages it as a clean native command.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex includes GPT Image 2, one of the strongest image generation models, directly in the app, while Anthropic has no image generation model.

Points out a unique built-in capability for projects needing image generation.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Codex is included in every paid and free ChatGPT plan (Free, Plus, Pro, Business, Enterprise), whereas Claude Code cannot be used for free.

Factual comparison of access plans.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code is part of Claude Pro ($20/mo), Claude Max 5X ($100/mo), and Claude Max 20X ($200/mo) subscription plans.

Factual pricing information.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Codex is included with ChatGPT Free, ChatGPT Plus ($20/mo), and ChatGPT Pro ($200/mo) for unlimited use; as a promotion, the $100 tier currently gets 2X Codex usage through May 31st.

Factual pricing and promotional details.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Sam Altman publicly tweeted on May 2 that users can sign into OpenClaw with a ChatGPT subscription, endorsing third-party use of Codex subscriptions.

Reports a specific factual endorsement and its date.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Anthropic's Claude agent SDK documentation states that using a Claude subscription inside third-party tools like OpenClaw or Hermes is not allowed unless specifically approved.

Quotes the policy, which affects the economics for users of those tools.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code can run Opus and Sonnet with a 1‑million‑token context window, while the latest GPT model in Codex runs at about 256,000 tokens.

Quantitative comparison of context window sizes.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)
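
To make the context-window gap concrete, here is a small sketch that checks whether a prompt of a given size would fit in each window. The window sizes are the figures quoted in the claim above; the function names and the `reserve_for_output` parameter are illustrative assumptions, not part of either tool.

```python
# Context window sizes in tokens, as quoted in the claim above.
CONTEXT_WINDOWS = {
    "Claude Code (Opus/Sonnet)": 1_000_000,
    "Codex (latest GPT model)": 256_000,
}


def fits(prompt_tokens: int, window: int, reserve_for_output: int = 0) -> bool:
    """Return True if the prompt plus reserved output space fits the window."""
    return prompt_tokens + reserve_for_output <= window


def tools_that_fit(prompt_tokens: int, reserve_for_output: int = 0) -> list[str]:
    """List the tools whose quoted window can hold the given prompt."""
    return [
        name
        for name, window in CONTEXT_WINDOWS.items()
        if fits(prompt_tokens, window, reserve_for_output)
    ]


if __name__ == "__main__":
    for size in (200_000, 300_000):
        print(size, tools_that_fit(size))
    # Ratio between the two quoted windows.
    ratio = CONTEXT_WINDOWS["Claude Code (Opus/Sonnet)"] / CONTEXT_WINDOWS["Codex (latest GPT model)"]
    print(round(ratio, 2))
```

By these figures the larger window is roughly 3.9 times the smaller one, so a codebase dump of around 300,000 tokens would fit in one tool's window but not the other's.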

Neutral

Many users report hitting Claude Code session or weekly limits much faster than before, and the author's live tests show he can do more work in Codex before hitting limits.

Based on community feedback and personal token usage observations.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

In live tests, Claude Code built an interactive dashboard in under 2 minutes, roughly 4× faster than Codex (nearly 8 minutes).

Measured performance from side‑by‑side prompt experiment.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

On the same dashboard build, Claude used about 283K tokens total, while Codex used about 1.64M tokens — nearly 6× more.

Token usage data from the experiment.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)
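
The "nearly 6×" and "roughly 4×" figures for the dashboard build can be checked with a few lines of arithmetic. The token counts are the ones quoted above; the build times are treated as the approximate bounds given in the claims (under 2 minutes, nearly 8 minutes), so the time ratio is only a rough estimate.

```python
# Dashboard build figures as quoted in the claims above.
CLAUDE_TOKENS = 283_000      # "about 283K tokens total"
CODEX_TOKENS = 1_640_000     # "about 1.64M tokens"

# Approximate bounds only: "under 2 minutes" and "nearly 8 minutes".
CLAUDE_MINUTES = 2
CODEX_MINUTES = 8


def ratio(larger: float, smaller: float) -> float:
    """Return how many times larger the first value is, to one decimal."""
    return round(larger / smaller, 1)


if __name__ == "__main__":
    print("token ratio (Codex / Claude):", ratio(CODEX_TOKENS, CLAUDE_TOKENS))
    print("time ratio (Codex / Claude):", ratio(CODEX_MINUTES, CLAUDE_MINUTES))
```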

Agree

Claude’s dashboard was visually superior: dark mode, working date filters, cleaner hover states, more polished design.

Subjective quality judgment from the live visual comparison.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude’s landing page forgot the logo and had wrong icons, but the underlying design base was preferred.

Weighs fixable mistakes against better foundational design.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude plans tasks tightly before executing, while Codex tends to grind through more iterations, causing higher input token usage on complex builds.

Observed operational pattern from token analysis.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

For front‑end work requiring interactivity and design polish, Claude was the clear winner.

Conclusion drawn from all live tests.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

On the research‑heavy PDF report, Codex finished in ~8 minutes vs Claude’s 8m 15s, and used 2.8M tokens vs Claude’s 4.7M, being faster and more efficient.

Measured performance where Codex excelled in speed and token efficiency.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Codex was significantly faster on the landing page build (3 minutes vs 4 minutes 39 seconds).

Pure speed comparison from the same prompt.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)
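
For the two tests where Codex was faster, the reported times and token counts can be turned into ratios for easier comparison. This is a small sketch using only the figures quoted in the claims above ("~8 minutes" is treated as exactly 8 minutes, which is an assumption); the helper names are illustrative.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds duration to total seconds."""
    return minutes * 60 + seconds


# Reported build times per test, in seconds.
RUN_TIMES = {
    "PDF report": {"Codex": to_seconds(8), "Claude Code": to_seconds(8, 15)},
    "Landing page": {"Codex": to_seconds(3), "Claude Code": to_seconds(4, 39)},
}

# Reported total tokens for the PDF report, in millions.
PDF_TOKENS_M = {"Codex": 2.8, "Claude Code": 4.7}


def speedup(times: dict[str, int]) -> tuple[str, float]:
    """Return the faster tool and how many times slower the other one was."""
    faster = min(times, key=times.get)
    slower = max(times, key=times.get)
    return faster, round(times[slower] / times[faster], 2)


if __name__ == "__main__":
    for test_name, times in RUN_TIMES.items():
        tool, factor = speedup(times)
        print(f"{test_name}: {tool} faster by a factor of {factor}")
    token_factor = round(PDF_TOKENS_M["Claude Code"] / PDF_TOKENS_M["Codex"], 2)
    print(f"PDF report tokens: Claude Code used {token_factor}x as many as Codex")
```

On these numbers the PDF report was close to a tie on time and the gap was mainly in token usage, while the landing page shows the larger time difference.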

Neutral

Across all three tests, Codex wrote 2-5× fewer output tokens than Claude, making it more concise, which likely explains why it reaches session limits more slowly.

Token output efficiency pattern observed and quantified.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex’s PDF research report had better spacing and the author would send it to a client over Claude’s version by a small margin.

Subjective preference based on visual output.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Claude Code is recommended for complex front-end, visual design quality, deep planning, auto-delegation, custom workflows (hooks/skills/channels), embedding agents via SDK, and enterprise deployment.

Author’s explicit use‑case guidance for Claude.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Agree

Codex is recommended for research-heavy tasks, structured documents (PDFs/reports), a single app with worktrees and shipping, /goal for long objectives, @Codex on GitHub PRs, and built-in image generation.

Author’s explicit use‑case guidance for Codex.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Claude Code feels more creative, better at brainstorming, and pushes back on wrong paths; Codex follows instructions obediently and is sharp at code review and catching bugs.

Subjective personal experience after extensive use of both tools.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

Users are not locked into one tool because projects are just files and folders; moving between Claude Code and Codex is straightforward: let the new agent read through the project to understand it.

Practical advice about portability of coding projects.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)

Neutral

The video was recorded in mid‑May 2026 and the landscape is evolving rapidly, so some specifics may change within a few months.

Time‑stamp disclaimer for future viewers.

Source: 100 Hours Testing Claude Code vs ChatGPT Codex (honest results)