Testing GPT-5 in Cursor: A Week of Free Requests
After the introduction of GPT-5, with its enhanced coding capabilities, paired with a week of free requests from Cursor, it was the perfect time to build out a project and test the new model. Here are my observations after working with it for the past couple of days.
Workflow
The workflow that developed during testing:
1. Create PRD in ChatGPT
Started with a Product Requirements Document in ChatGPT for the new project.
2. Choose Vercel Templates
Let ChatGPT decide, based on the PRD, which Vercel template was best suited to build on and extend.
3. Clone Template and Open in Cursor
Standard setup process with the selected template.
4. Create PRD-Gap List
Within Cursor, created a document listing what was already present in the workspace and which elements were missing, focusing on architecture and components.
5. Create Implementation Plan
Developed a step-by-step plan to close the gaps. Each step had to be validated locally through the Playwright MCP, then deployed to Vercel to surface any deployment issues.
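As an illustration, a per-step validation can be as small as a smoke test like the following. This is a minimal sketch: the port, route, and heading text are placeholders, and in practice the checks ran through the Playwright MCP rather than a test file.

```ts
// tests/smoke.spec.ts — sketch of a local validation check for one
// implementation step; the URL and selectors are placeholders.
import { test, expect } from '@playwright/test';

test('newly implemented page renders', async ({ page }) => {
  await page.goto('http://localhost:3000/dashboard');
  await expect(
    page.getByRole('heading', { name: 'Dashboard' })
  ).toBeVisible();
});
```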
6. Step-by-Step Implementation
Began implementing, following the plan's validation steps. Deployment errors were either copied back into Cursor or checked via the Vercel MCP.
7. Create Documentation Folder
Set up a docs folder, covered by .gitignore and .vercelignore, in which markdown files describe features and summarize architecture and patterns for future reference.
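Assuming the goal was to keep these notes out of version control and out of deployments, the ignore entries would look roughly like:

```
# .gitignore and .vercelignore — sketch: keep the local docs folder
# out of the repository and out of Vercel deployments
docs/
```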
8. Create Component Testbed
Built a testbed with all basic reusable components first so that future implementations could follow consistent guidelines.
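A minimal sketch of such a testbed route, assuming a Next.js App Router template; the Button and Card component names and import paths are assumptions, not the template's actual API.

```tsx
// app/testbed/page.tsx — sketch of a single page that renders every
// basic reusable component for visual reference.
import { Button } from '@/components/ui/button';
import { Card } from '@/components/ui/card';

export default function TestbedPage() {
  return (
    <main style={{ display: 'grid', gap: '2rem', padding: '2rem' }}>
      <section>
        <h2>Buttons</h2>
        <Button>Primary</Button>
        <Button variant="outline">Outline</Button>
      </section>
      <section>
        <h2>Cards</h2>
        <Card>Example card content</Card>
      </section>
    </main>
  );
}
```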
Impressions
- Instruction Following: It follows instructions well and focuses on the initial task
- Code Generation: Generated code is not overly verbose, and individual changes stay appropriately scoped
- Processing Time: It thinks for quite some time before getting to work
- MCP Tool Calling: Not as automatic as expected; it sometimes takes multiple explicit prompts before it actually uses a tool
- Terminal Commands: Running terminal commands in Cursor still feels clunky
- Theme Installation: It once tried to install a new theme, which caused a lot of cleanup work; reverting with Cursor's local history did not undo the damage
Model Switching
There were scenarios where switching models was necessary:
Vercel Deployment Timeouts
The app itself had to be switched to Claude because requests timed out when deployed to Vercel (300-second time limit).
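For context, on Vercel the function timeout for a Next.js route is declared via segment config; a minimal sketch, assuming an App Router project and a hypothetical API route:

```ts
// app/api/chat/route.ts — sketch; the route path is hypothetical.
// maxDuration raises Vercel's function timeout for this route (the
// ceiling is plan-dependent); even at 300 s, long model calls can
// still hit the limit, which is what forced the model switch here.
export const maxDuration = 300;

export async function POST(req: Request) {
  // the long-running model call would live here
  const body = await req.json();
  return Response.json({ received: body });
}
```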
Bug Fix Approach
Another bug fix required switching to Claude Opus 4.1. Opus started by improving the logging to locate the bug, while GPT-5 kept retrying the same approaches.
Conclusion
GPT-5 in Cursor shows improvements in maintaining focus and generating appropriately-scoped code. The structured workflow that emerged from testing worked well for building new projects from templates.
The experience showed that different models have different strengths. GPT-5 worked well for standard implementation tasks, while Claude was needed for deployment constraints and certain debugging scenarios. This suggests that using multiple models for different tasks may be more effective than relying on a single model.
The week of free requests provided a good opportunity to test GPT-5's capabilities. For new projects starting from templates, the workflow described above proved effective, and having access to alternative models like Claude remains useful for specific scenarios such as debugging or working around deployment time limits.