A two-day experiment in using a structured AI-assisted coding loop to improve a rough Three.js game prototype.
The first prompt produced a poor result. I then let the loop ask for missing assets, added the available GLBs, and ran repeated passes focused on visuals, world liveliness, missions, controls, and code cleanup. I played the build once per day, turned issues into small tickets, and fed those back into the loop one by one.
The implementation is still prototype-grade and uneven, but the delta was meaningful: with roughly one to two hours of direct work, the prototype became a much larger underwater world with more assets, mechanics, and direction than the initial output.
What I Was Testing
This was not primarily a game project. It was a workflow test.
I wanted to see whether a coding agent could make better progress when given:
- a shared design spec
- a recurring five-pass improvement loop
- explicit asset constraints
- screenshots for visual feedback
- daily human playtesting notes
- issue triage before implementation
- a cleanup pass after feature work
Loop Structure
The loop had five recurring passes:
- Improve the visual aesthetic.
- Make the underwater world feel more alive.
- Expand missions, mechanics, and progression.
- Improve gameplay feel, controls, and UX.
- Review code quality and scalability.
After playtesting, issues were split into smaller tickets and resolved one by one instead of asking the model to make the game better in one broad pass.
Result
The final prototype is not polished, but it is substantially more coherent than the initial generation. It has a larger underwater world, more visible life, more mechanics, more assets integrated into the scene, and a clearer direction around missions and progression.
The useful part was seeing how much improvement came from better process rather than from much more manual implementation time.
Caveat
This is a lab artifact, not a production-quality game. The main takeaway was not that the agent produced a finished game. It did not. The takeaway was that agent output improved materially when the work was constrained by specs, separated into recurring passes, checked through screenshots and playtesting, and routed through issue triage instead of broad make-it-better prompts.