ChatGPT Codex Complete Usage Tutorial: 8 Practical Steps to Writing Code Automatically in 2026

📅 2026-05-15 11:25:25 👤 DouWen Editorial

ChatGPT Codex is a coding agent that OpenAI officially released in October 2025 and updated to its second generation in May 2026. Unlike writing code in early ChatGPT, Codex is not a chat window that returns snippets. It is a full agent that runs commands, runs tests, fixes bugs, and submits PRs on its own inside a container. This article explains how to get started from scratch in 8 practical steps.

The article assumes that readers already have a ChatGPT account and know basic git and command-line operations. If you lack that background, start with the Notion AI or Cursor tutorials and then come back to Codex.

The core difference between ChatGPT Codex and ordinary ChatGPT

Writing code with ordinary ChatGPT is a single round of question and answer. You post a requirement and it replies with a code snippet. You post an error and it replies with a fix. Codex does not work this way. It starts a Linux container in the OpenAI cloud, clones your entire repository into it, and can then run npm install, run pytest, modify files, create commits, and open PRs on its own.

Simply put, Codex turns ChatGPT from a consultant who can only "suggest" into a colleague who can "do the work". You hand it a GitHub Issue, and it completes the whole process from reading the code to submitting the PR, with no manual copying and pasting on your side.

Subscription and access thresholds

Codex is currently available only to ChatGPT Plus and Pro users. Plus costs $20 per month and allows 20 tasks per day. Pro costs $200 per month and allows 200 tasks per day. Free users cannot use Codex for now, unlike GPT-4o, which free users can access.

For teams, the ChatGPT Team plan starts at $30 per person per month, and the Codex task quota is pooled and shared across members. The Enterprise plan is priced separately and comes with an SLA and SOC 2 compliance.

Step 1: Activate Codex in the ChatGPT sidebar

After logging in to chatgpt.com, look for the "Codex" entry at the bottom of the left sidebar. The first click walks you through authorizing GitHub. This grants the Codex agent read and write access to the repositories you select under your GitHub account. It is recommended to authorize only the repositories you need rather than all of them.

After authorization, Codex automatically starts an isolated Linux container. The container starts fresh for each task and does not keep intermediate files from earlier tasks. By default the environment includes Node 22, Python 3.12, Go 1.23, Ruby 3.3, and Rust 1.78, which covers most mainstream languages.

Step 2: Configure the project environment

Add a codex.yml configuration file to the repository root. It has two sections. The setup section defines the environment installation steps, such as npm install or poetry install. The test section defines the test commands, such as npm test or pytest -v. Codex runs setup to prepare the environment and runs test to verify the changes.
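The setup-then-test order described above can be pictured with a short Python sketch. This is only an illustration of the flow, not Codex's actual implementation: the exact codex.yml schema is not shown in this article, so the `CONFIG` dictionary, its keys, and the helper names `run_shell` and `run_pipeline` are all assumptions made for the example.

```python
import subprocess
from typing import Callable

# Hypothetical stand-in for the two sections of codex.yml described above.
# The real file format is not shown in this article, so these keys are assumptions.
CONFIG = {
    "setup": ["npm install"],
    "test": ["npm test"],
}


def run_shell(command: str) -> int:
    """Run one shell command and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_pipeline(config: dict, runner: Callable[[str], int] = run_shell) -> bool:
    """Run every setup command, then every test command.

    Stops at the first command that exits with a non-zero code.
    """
    for section in ("setup", "test"):
        for command in config.get(section, []):
            if runner(command) != 0:
                print(f"{section} step failed: {command}")
                return False
    return True
```

Passing the runner as a parameter keeps the sketch easy to try without Node installed: `run_pipeline(CONFIG, runner=lambda cmd: 0)` walks the same order without executing anything.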

If your project uses Docker, you can specify a Dockerfile instead. Codex builds the container from the Dockerfile and then runs inside it. This approach suits projects with complex environments that depend on system packages such as libpq-dev, ffmpeg, or postgres-client.

Step 3: Submit the first task

Return to the Codex interface and click New Task. Write a natural-language task description in the input box, for example, "Add signature expiration verification to the JWT verification logic in src/auth.ts, and add a unit test." After you submit, Codex first reads the relevant files, plans for a moment, and then starts editing.

The entire process is displayed in real time in the Codex interface. You can see which files it read, which commands it ran, and which lines it changed. If it goes off track midway, you can interrupt it directly with an instruction such as, "First look at how the verify function is used in src/middleware.ts."
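To make the example task concrete, here is a minimal sketch of what an expiration check in JWT verification looks like. The task in the article targets TypeScript in src/auth.ts, whose contents are not shown, so this is a standalone Python illustration using only the standard library. The function names `sign_token` and `verify_token` are hypothetical, and the sketch assumes HS256 tokens with a numeric `exp` claim.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build an HS256-signed token (helper used to produce test input)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check the signature first, then reject tokens whose exp has passed."""
    header_b64, payload_b64, signature_b64 = token.split(".")  # ValueError if malformed
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(b64url_decode(payload_b64))
    exp = payload.get("exp")
    current = time.time() if now is None else now
    # The token is valid only while the current time is before exp.
    if not isinstance(exp, (int, float)) or current >= exp:
        raise ValueError("token expired or missing exp claim")
    return payload
```

The `now` parameter exists so that a unit test can pin the clock, which is the kind of test the example task asks Codex to add.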

Step 4: Review diff and push PR

After the task is completed, Codex gives you a complete diff view. Each file change is displayed GitHub style, with deletions in red and additions in green. You can approve the hunks one by one. If you disagree with anything, you can edit it directly or ask Codex to redo it.

When you are satisfied, click Create PR. Codex automatically pushes to a new branch and opens a PR on GitHub. The change description, linked Issues, and test results are written into the PR description automatically. A complete PR takes 5 to 30 minutes from start to finish, which is normal for a task of medium complexity.

Types of tasks Codex excels at

In our testing, Codex is best at the following types of tasks. The first is adding unit tests: give it an uncovered function and it generates a complete test file. The second is refactoring, such as converting callbacks to async/await or class components to function components. The third is fixing lint errors: it runs lint once and quietly fixes all the warnings.

The fourth is documentation, where it can generate a README from the code. The fifth is dependency upgrades. For a cross-version migration such as React 17 to 18, a single Codex run can complete roughly 80% of the work.
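As an illustration of the second category, the sketch below shows a callback-to-async/await refactor. The article's example is about JavaScript, so this Python version with asyncio is an analogy only, and `load_user_cb`, `load_user`, and `load_many` are hypothetical names made up for the example.

```python
import asyncio
from typing import Callable


# Before: callback style. The caller passes a function that receives the result.
def load_user_cb(user_id: int, on_done: Callable[[dict], None]) -> None:
    on_done({"id": user_id, "name": f"user-{user_id}"})


# After: async/await style. The coroutine returns the result directly.
async def load_user(user_id: int) -> dict:
    await asyncio.sleep(0)  # placeholder for real asynchronous I/O
    return {"id": user_id, "name": f"user-{user_id}"}


async def load_many(user_ids: list[int]) -> list[dict]:
    """Load several users concurrently; results keep the input order."""
    return await asyncio.gather(*(load_user(uid) for uid in user_ids))
```

A call such as `asyncio.run(load_many([1, 2]))` replaces what would otherwise be nested callbacks, which is why this kind of mechanical rewrite suits an agent well.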

Types of tasks that Codex is not good at

Codex also has clear weaknesses. The first is tasks that need a lot of business context, such as "change to the new payment flow according to the business rules." It does not know your company's payment background. The second is front-end UI work with a visual component. Codex cannot see the rendered page and can only guess.

The third is performance optimization, which requires profiling data that Codex cannot obtain. The fourth is large-scale architectural change. When a change spans 50 files, Codex easily loses context. It is recommended to break a large task into 5 to 10 small tasks and give them to Codex separately.
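One simple way to plan such a split is to divide the list of affected files into evenly sized batches, one per task. The sketch below is only a planning aid under that assumption: `split_into_tasks` is a hypothetical helper, and a real split should usually follow module boundaries rather than file counts alone.

```python
def split_into_tasks(paths: list[str], task_count: int) -> list[list[str]]:
    """Split file paths into task_count batches whose sizes differ by at most one."""
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    ordered = sorted(paths)  # sorting keeps files from the same directory adjacent
    if not ordered:
        return []
    task_count = min(task_count, len(ordered))
    base, extra = divmod(len(ordered), task_count)
    batches = []
    start = 0
    for index in range(task_count):
        size = base + (1 if index < extra else 0)
        batches.append(ordered[start:start + size])
        start += size
    return batches
```

For a 50-file change, `split_into_tasks(paths, 10)` yields ten batches of five files, each small enough to describe as a single-goal task.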

Hands-on comparison with Cursor and Claude Code

Cursor is an AI assistant built into the IDE that focuses on quick, conversational code edits. Codex is an asynchronous agent that focuses on "hand off a task, then collect the PR". The two are used differently and do not conflict; they complement each other. During development I use Cursor as I write, and before leaving work I hand a few tasks to Codex and let it run overnight.

Claude Code is a command-line agent from Anthropic with capabilities close to those of Codex. The difference is that Claude Code runs in your local terminal, while Codex runs in the OpenAI cloud. Running locally is better for privacy, but you have to install Docker yourself. The cloud is convenient, but your code is exposed to OpenAI. As of May 2026, enterprise users lean toward Codex, while individual users prefer Claude Code.

Safety and Cost Considerations

Codex runs in OpenAI cloud containers. Your code is read and analyzed but is not used for training by default, and you can turn off "Improve the model" in Settings to make sure. Even so, for sensitive enterprise code it is recommended to run Claude Code locally or to use an OpenAI Enterprise private deployment.

In terms of cost, each Codex task consumes 50,000 to 200,000 tokens on average, which works out to roughly $0.10 to $0.50 per task at GPT-4o pricing. Running the Pro plan's full 200 tasks per day is therefore equivalent to as much as $100 of usage per day, which is not cheap. Reserve Codex for the highly repetitive tasks where it delivers the most value.
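The daily figure follows directly from the per-task estimate. The sketch below redoes that arithmetic with the numbers quoted above; the per-task prices are the article's own estimates, not official pricing, and `daily_cost_range` is a hypothetical helper. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def daily_cost_range(tasks_per_day: int, low_cents: int, high_cents: int) -> tuple[float, float]:
    """Return the (low, high) equivalent daily cost in dollars."""
    return tasks_per_day * low_cents / 100, tasks_per_day * high_cents / 100


# Pro plan: 200 tasks per day at an estimated $0.10 to $0.50 per task.
pro_low, pro_high = daily_cost_range(200, 10, 50)

# Plus plan: 20 tasks per day at the same per-task estimate.
plus_low, plus_high = daily_cost_range(20, 10, 50)
```

With these inputs the Pro range is $20 to $100 per day and the Plus range is $2 to $10 per day, so the $100 figure is the upper bound for a fully used Pro quota.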

Teamwork and best practices

Codex performs better in a team setting than for a solo developer. Each engineer hands Codex one or two routine tasks a day, and the resulting PRs are handled together in code review. A team of 5 can produce about 30% more code per week. The point is not to replace people, but to let people focus on architectural decisions and complex business logic.

Best practices start with making task descriptions as specific as possible: giving file paths and function names is about twice as accurate as giving an abstract description. Limit each task to a single goal and do not "fix bugs and refactor at the same time", otherwise the PR gets out of control. Use a PR template so that Codex fills in test coverage and the changelog automatically. Most importantly, require review on every Codex PR and never allow Codex PRs to be merged automatically.

For team governance, it is recommended to mark all Codex commits with a commit message prefix such as codex or bot, which makes auditing and later tracing easier. Track the Codex PR pass rate every month. If it falls below 60%, the task definitions are the likely problem and need to be adjusted.
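The monthly check can be sketched in a few lines. This is a minimal sketch under stated assumptions: "pass rate" is interpreted here as merged Codex PRs divided by all Codex PRs, PRs are identified by a hypothetical "codex:" title prefix, and the `PullRequest` record and both function names are invented for the example rather than taken from any real API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    title: str
    merged: bool


def codex_pass_rate(prs: list[PullRequest], prefix: str = "codex:") -> Optional[float]:
    """Share of prefix-tagged PRs that were merged, or None if there are none."""
    tagged = [pr for pr in prs if pr.title.lower().startswith(prefix)]
    if not tagged:
        return None
    return sum(pr.merged for pr in tagged) / len(tagged)


def needs_task_review(rate: Optional[float], threshold: float = 0.60) -> bool:
    """True when the pass rate is below the 60% threshold from the text."""
    return rate is not None and rate < threshold
```

Feeding it a month of PR records and calling `needs_task_review(codex_pass_rate(prs))` gives a yes or no signal for revisiting how tasks are written.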

Roadmap for the coming year

According to OpenAI's internal roadmap, Codex will gain several capabilities in the second half of 2026. The first is multi-repository coordination. A single task will be able to change code across 3 to 5 repositories, such as updating the front end and the back-end SDK at the same time. The second is stronger code understanding, with the ability to grasp the global context of a large monorepo on the order of 1 million lines.

The third is deep integration with the main ChatGPT chat. In an ordinary conversation you will be able to say "Open a PR for this requirement" and it will hand off to Codex automatically. The fourth is synchronized preview in local IDEs, which lets Cursor and VS Code users view the state of the remote Codex container from inside the IDE.

Anthropic's Claude Code and Google's Jules are similar products under active development. By 2027 the market is expected to have 3 to 4 stable code agent services, each occupying its own niche. OpenAI Codex remains the first choice thanks to its ecosystem advantages.

FAQ

Is Codex free?

No. Codex is an add-on feature of the ChatGPT Plus and Pro subscriptions. Plus costs $20 per month for 20 tasks per day, and Pro costs $200 per month for 200 tasks per day. Codex is not available to free users for now. OpenAI says it may offer free trials in the future, but there is no timetable yet.

Is the code modified by Codex safe and trustworthy?

Code modified by Codex must pass your manual review before it is merged. By default the diff it generates is never pushed to main automatically, and everything goes through the PR process. Even so, the generated code occasionally contains bugs, security vulnerabilities, and API misuse. For important projects, pair it with a code review tool for a second pass, and do not merge Codex PRs blindly.

Which is better, Codex or Cursor?

The two are positioned differently and cannot be compared directly. Cursor is a fast conversational assistant inside the IDE, suited to writing and editing code during development. Codex is a cloud agent, suited to handing off tasks asynchronously and waiting for the results. The two complement each other: use Cursor during development, and use Codex for tasks that run overnight after you go home. If the budget allows, subscribing to both is the most efficient setup.

Does Codex support Chinese communication?

Yes. Task descriptions can be written in Chinese, and Codex understands them fully. However, English is recommended for comments in the code itself, because Codex has more training data for English code and performs more consistently with it. Chinese variable names also work, but downstream tooling may not be compatible with them. The recommended split is Chinese for the task description and English for the code.

Can Codex modify private company code?

Yes. The GitHub authorization process supports private repositories. However, the code is uploaded to an OpenAI cloud container, so compliance risk must be considered for sensitive commercial code. The OpenAI Enterprise plan offers SOC 2 Type II certification and a commitment that data is not used for training. In highly regulated industries such as finance and healthcare, confirm with your legal department before using it.

📝 This article comes from DouWen (www.douwen.me). Please keep this attribution when reposting.
