Testing GPT-5 in Cursor: A Week of Free Requests
After the introduction of GPT-5, with its enhanced coding capabilities, paired with a week of free requests from Cursor, it was the perfect time to build out a project and test the new model. Here are my observations after working with it for the past couple of days.
Workflow
The workflow that developed during testing:
1. Create PRD in ChatGPT
Started with a Product Requirements Document in ChatGPT for the new project.
2. Choose Vercel Templates
Let ChatGPT decide, based on the PRD, which Vercel template was best suited to build on and extend.
3. Clone Template and Open in Cursor
Standard setup process with the selected template.
4. Create PRD-Gap List
Within Cursor, created a document listing what was already present in the workspace and which elements were missing, focusing on architecture and components.
5. Create Implementation Plan
Developed a step-by-step plan to close the gaps. Each step had to be validated locally through the Playwright MCP, then deployed to Vercel to surface any deployment issues.
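As an illustration, a per-step validation can be as small as a smoke test like the following. This is a minimal sketch: the port, route, and heading text are placeholders, and in practice the checks ran through the Playwright MCP rather than a test file.

```ts
// tests/smoke.spec.ts — sketch of a local validation check for one
// implementation step; the URL and selectors are placeholders.
import { test, expect } from '@playwright/test';

test('newly implemented page renders', async ({ page }) => {
  await page.goto('http://localhost:3000/dashboard');
  await expect(
    page.getByRole('heading', { name: 'Dashboard' })
  ).toBeVisible();
});
```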
6. Step-by-Step Implementation
Began implementing, following the plan's validation steps. Deployment errors were either copied back into Cursor or checked via the Vercel MCP.
7. Create Documentation Folder
Set up a docs folder, covered by .gitignore and .vercelignore, in which markdown files describe features and summarize architecture and patterns for future reference.
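Assuming the goal was to keep these notes out of version control and out of deployments, the ignore entries would look roughly like:

```
# .gitignore and .vercelignore — sketch: keep the local docs folder
# out of the repository and out of Vercel deployments
docs/
```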
8. Create Component Testbed
Built a testbed with all basic reusable components first so that future implementations could follow consistent guidelines.
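A minimal sketch of such a testbed route, assuming a Next.js App Router template; the Button and Card component names and import paths are assumptions, not the template's actual API.

```tsx
// app/testbed/page.tsx — sketch of a single page that renders every
// basic reusable component for visual reference.
import { Button } from '@/components/ui/button';
import { Card } from '@/components/ui/card';

export default function TestbedPage() {
  return (
    <main style={{ display: 'grid', gap: '2rem', padding: '2rem' }}>
      <section>
        <h2>Buttons</h2>
        <Button>Primary</Button>
        <Button variant="outline">Outline</Button>
      </section>
      <section>
        <h2>Cards</h2>
        <Card>Example card content</Card>
      </section>
    </main>
  );
}
```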
Impressions
- Instruction Following: It follows instructions well and focuses on the initial task
- Code Generation: Generated code is not overly verbose, and individual changes stay appropriately scoped
- Processing Time: It thinks for quite some time before getting to work
- MCP Tool Calling: Not as automatic as expected; it sometimes takes multiple explicit prompts before it actually uses a tool
- Terminal Commands: Running terminal commands in Cursor still feels clunky
- Theme Installation: It once tried to install a new theme, which caused a lot of cleanup work; reverting with Cursor's local history did not undo the damage
Model Switching
There were scenarios where switching models was necessary:
Vercel Deployment Timeouts
The app itself had to be switched to Claude because requests timed out when deployed to Vercel (300-second time limit).
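For context, on Vercel the function timeout for a Next.js route is declared via segment config; a minimal sketch, assuming an App Router project and a hypothetical API route:

```ts
// app/api/chat/route.ts — sketch; the route path is hypothetical.
// maxDuration raises Vercel's function timeout for this route (the
// ceiling is plan-dependent); even at 300 s, long model calls can
// still hit the limit, which is what forced the model switch here.
export const maxDuration = 300;

export async function POST(req: Request) {
  // the long-running model call would live here
  const body = await req.json();
  return Response.json({ received: body });
}
```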
Bug Fix Approach
Another bug fix required switching to Claude Opus 4.1. Opus started by improving the logging to locate the bug, while GPT-5 kept retrying the same approaches.
Conclusion
GPT-5 in Cursor shows improvements in maintaining focus and generating appropriately-scoped code. The structured workflow that emerged from testing worked well for building new projects from templates.
The experience showed that different models have different strengths. GPT-5 worked well for standard implementation tasks, while Claude was needed for deployment constraints and certain debugging scenarios. This suggests that using multiple models for different tasks may be more effective than relying on a single model.
The week of free requests provided a good opportunity to test GPT-5's capabilities. For new projects starting from templates, the workflow described above proved effective, and having access to alternative models like Claude remains useful for specific scenarios such as debugging or working around deployment time limits.