Walrus Ocean Loop · peterjasko.com

A two-day experiment in using a structured AI-assisted coding loop to improve a rough Three.js game prototype.

The first prompt produced a poor result. I then let the loop ask for missing assets, added the available GLBs, and ran repeated passes focused on visuals, world liveliness, missions, controls, and code cleanup. I played the build once per day, turned issues into small tickets, and fed those back into the loop one by one.

The implementation is still prototype-grade and uneven, but the delta was meaningful: with roughly one to two hours of direct work, the prototype became a much larger underwater world with more assets, mechanics, and direction than the initial output.

What I Was Testing

This was not primarily a game project. It was a workflow test.

I wanted to see whether a coding agent could make better progress when given:

a shared design spec
a recurring five-pass improvement loop
explicit asset constraints
screenshots for visual feedback
daily human playtesting notes
issue triage before implementation
a cleanup pass after feature work

Loop Structure

The loop had five recurring passes:

Improve the visual aesthetic.
Make the underwater world feel more alive.
Expand missions, mechanics, and progression.
Improve gameplay feel, controls, and UX.
Review code quality and scalability.

After playtesting, issues were split into smaller tickets and resolved one by one instead of asking the model to make the game better in one broad pass.

Result

The final prototype is not polished, but it is substantially more coherent than the initial generation. It has a larger underwater world, more visible life, more mechanics, more assets integrated into the scene, and a clearer direction around missions and progression.

The useful part was seeing how much improvement came from better process rather than from much more manual implementation time.

Caveat

This is a lab artifact, not a production-quality game. The main takeaway was not that the agent produced a finished game. It did not. The takeaway was that agent output improved materially when the work was constrained by specs, separated into recurring passes, checked through screenshots and playtesting, and routed through issue triage instead of broad make-it-better prompts.