Beyond Vibe Coding: The Rise of Agentic Development

By Nick Halstead

Vibe coding had a good run.

For about a year, the pattern was clear: you sit in Cursor, or ChatGPT, or Claude — you describe what you want, the model spits out code, you paste it in, it breaks, you describe the break, it fixes it, you paste again. Iterate. Ship. Repeat. Every cycle is a prompt-response-prompt-response loop, with you as the bottleneck — the human in the middle, translating between intent and implementation, one clipboard operation at a time.

And it worked! It genuinely accelerated things. People who couldn't code before shipped apps. Experienced developers got prototypes out faster. The world got flooded with demos.

But here's the thing I've been living with for the past few months, building 27 projects simultaneously through OpenClaw: vibe coding is a ceiling, not a floor. It's the starting point of something much bigger, and most people have mistaken it for the destination.

What comes after vibe coding doesn't have a name yet. I've been calling it Agentic Development — and I think it represents one of the most consequential steps in the eight world changes I laid out in my original thesis. Specifically, it's the moment Stage 1 — "The babysitting ends" — stops being a prediction and starts being a daily reality.

What vibe coding actually is (and isn't)

Let me be precise about what vibe coding became in practice, because the term has gotten romanticised.

Vibe coding is you sitting at a keyboard, manually driving a conversation with a code-generating model. You are the project manager, the prompt engineer, the QA tester, the deployer, and the context janitor — all at once. You hold the project state in your head. You decide what to ask for next. You notice when the model goes off-track. You copy-paste the output into the right file. You run the tests. You check the browser.

The model is fast. You are not.

Every iteration goes through you. Every decision bottlenecks at your attention. The model can generate a React component in three seconds, but you spend two minutes reading it, three minutes testing it, and five minutes figuring out what to ask for next. The generation is instant. The orchestration is manual.

This is why vibe coding feels fast at first and exhausting by hour three. You're doing two jobs: thinking about what to build, and managing the process of building it. And the second job — the management overhead — doesn't shrink as the model gets smarter. It grows, because smarter models produce more complex output that requires more careful review.

Vibe coding is a human managing a very fast typist. What comes next is fundamentally different.

What Agentic Development looks like

Here's my actual workflow as of this week.

I open WhatsApp. I text my main agent — the one that runs on OpenClaw and has access to my entire development environment. I say something like:

"The conference tracker needs a new page that shows speaker profiles with their upcoming events. Pull the data from the existing API. Dark theme, consistent with the rest of the site."

That's it. That's the entire prompt.

What happens next is the part that's hard to explain until you've seen it:

1. The agent plans the work before writing a single line of code.

It doesn't immediately start generating React components. It first thinks about the task: what files need to change, what the data model looks like, what patterns the existing codebase uses, where the API endpoints live, what the routing structure is. It writes itself a plan — sometimes explicitly, sometimes as internal reasoning — and then it decomposes that plan into concrete steps for a coding agent.

This is the part that changes everything. The agent is a better prompt engineer than you are. Not because it's smarter in some general sense, but because it's systematic. It doesn't forget to mention the TypeScript config. It doesn't skip the fact that the project uses Tailwind with a specific colour palette. It doesn't neglect to specify error handling. It includes context that a human would forget — or wouldn't bother typing — because the context is already in its memory.

When it hands a task to Claude Code or Codex, the handoff is precise: here's the repo, here's the file structure, here's the pattern you should follow, here's what the API returns, here are the environment variables, here's the definition of done. It's the difference between saying "build me a speaker page" and writing a proper engineering specification. The agent writes the spec.
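To make the shape of that handoff concrete, here is a minimal sketch of a spec builder. The names (`ProjectContext`, `build_spec`) and fields are illustrative assumptions, not OpenClaw's actual API; the point is that the brief is assembled from structured context rather than typed from memory:

```python
from dataclasses import dataclass, field

@dataclass
class ProjectContext:
    """Context the coding agent receives alongside the task."""
    repo_path: str
    framework: str
    styling: str
    api_summary: str
    env_vars: list[str] = field(default_factory=list)

def build_spec(task: str, ctx: ProjectContext, done: list[str]) -> str:
    """Expand a one-line request into a full engineering brief."""
    lines = [
        f"Task: {task}",
        f"Repo: {ctx.repo_path}",
        f"Framework: {ctx.framework} (follow the existing patterns)",
        f"Styling: {ctx.styling}",
        f"API: {ctx.api_summary}",
        f"Env vars: {', '.join(ctx.env_vars) or 'none'}",
        "Definition of done:",
    ] + [f"  - {item}" for item in done]
    return "\n".join(lines)
```

Nothing in the brief is optional: the env vars, the styling constraint, and the definition of done are emitted every time, which is exactly the systematic behaviour a tired human skips.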

2. It understands the tool requirements.

Every project is different. Different frameworks. Different package managers. Different test runners. Different deployment targets. Different environment variables. Different ports. Different database engines.

A human doing vibe coding has all of this rattling around in their head — or scattered across README files they never read. The agent has it structured. It knows that the conference tracker runs on port 3300, uses Express with SQLite, deploys via a specific Cloudflare tunnel, and needs certain environment variables to connect to the right services. It knows this because I built a system that teaches it — more on that in a minute.

When it spins up a coding agent, it configures the environment correctly. No "oh wait, you need to set NEXT_PUBLIC_API_URL first." No "this project uses pnpm, not npm." The tool requirements are handled before the first line of code is written.
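A sketch of how that pre-flight setup can be derived mechanically from a project record. The record keys (`repo_path`, `package_manager`, `env`) are hypothetical names for illustration; the idea is that "pnpm, not npm" is data, not tribal knowledge:

```python
def setup_commands(project: dict) -> list[str]:
    """Derive the shell steps that prepare a project's environment
    before the coding agent writes anything."""
    cmds = [f"cd {project['repo_path']}"]
    # Environment variables come from the project record, not from memory.
    for key, value in project.get("env", {}).items():
        cmds.append(f"export {key}={value}")
    # The package-manager mix-up ("this project uses pnpm") is resolved here.
    pm = project.get("package_manager", "npm")
    cmds.append(f"{pm} install")
    return cmds
```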

3. It coordinates deployment.

The coding agent finishes. The code is written, the tests pass (or the agent has iterated until they do). Now what?

In vibe coding, "now what" is you: manually running the build, checking the output, pushing to git, maybe deploying if you remember how. In Agentic Development, the orchestrating agent handles this. It knows how to deploy the project — because it knows the project. It runs the build. It checks for errors. It commits. It pushes. If the project deploys via Vercel, it runs vercel --prod. If it's a local service managed by launchd, it restarts the service. If it needs a Cloudflare tunnel, it knows.
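One way to picture "it knows how to deploy the project" is a recipe table keyed by deploy target. The table below is an invented sketch, not the real orchestrator; `vercel --prod` and `fly deploy` are real CLI commands, but the surrounding structure and the `deploy_target` field are assumptions:

```python
# Illustrative deploy recipes keyed by target; real commands vary per platform.
DEPLOY_STEPS = {
    "vercel": ["vercel --prod"],
    "flyio": ["fly deploy"],
    "launchd": ["launchctl kickstart -k gui/$UID/{service}"],
}

def deploy_plan(project: dict) -> list[str]:
    """Build the full post-coding sequence: build, commit, push, deploy."""
    steps = [
        f"{project.get('package_manager', 'npm')} run build",
        "git add -A",
        'git commit -m "agent: automated build"',
        "git push",
    ]
    # Substitute project fields (e.g. a launchd service name) into the recipe.
    steps += [s.format(**project) for s in DEPLOY_STEPS[project["deploy_target"]]]
    return steps
```

The "now what" that vibe coding leaves to the human is just the tail of this list.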

4. It tests the application before showing it to you.

This is the one that still catches me off guard.

After the code is written and deployed, the agent opens a browser. Not metaphorically — it literally opens a browser tab, navigates to the running application, takes a screenshot, and evaluates what it sees. Does the page render? Does the data load? Does the layout match what was requested? Is the dark theme applied? Are there console errors?

If something's wrong, it doesn't show you the broken result and ask what to do. It goes back to the coding agent, describes the problem, gets a fix, redeploys, and checks again. The human sees the finished product — not the iteration.

This loop — plan, build, deploy, test, fix, verify — happens without you. You get a message when it's done: "Speaker profiles page is live. Here's a screenshot. Anything you want changed?"

That's not vibe coding. That's development.

The missing piece: project memory

All of the above sounds great in theory. In practice, there's a problem that nobody talks about, because everyone's too busy marvelling at the code generation.

Agents don't know anything about your projects.

Not really. Not in a structured way. An agent can be incredibly capable at generating code, coordinating sub-agents, even deploying — but if it doesn't know that your investor portal runs on Fly.io at port 3000 with a PostgreSQL database, and your conference tracker runs locally at port 3300 with SQLite, and your podcast system stores audio on Cloudflare R2 in a specific bucket with specific credentials — then every task starts with a five-minute preamble of context-setting. Or worse, it guesses wrong and deploys to the wrong environment.

This is the problem that vibe coding papers over because the human is always in the loop, correcting course. "No, not that database." "No, that's the wrong port." "No, the API key is in a different env file." The human is the project memory. The human holds the map.

For Agentic Development to work — real delegation, not supervised delegation — the agent needs the map.

This is what I spent the last two weeks building. I call it Project Vault — technically it's an MCP server (Model Context Protocol) that gives my agents structured access to everything they need to know about every project in my portfolio:

  • What it is: name, description, framework, language, tech stack
  • Where it lives: repo path, git remote, hosting platform, URL, port
  • How to run it: dev commands, build commands, deploy commands
  • What it needs: environment variables, API keys, database connections, credentials
  • What its data looks like: database type, tables, row counts, backup status
  • How it connects: which services it talks to, which tunnels route to it, which domain points where

Twenty-seven projects. All of them indexed. All of them queryable.

When an agent gets a task like "update the conference tracker," it doesn't ask me for context. It calls project_lookup("conference-tracker") and gets back everything: the repo path, the framework, the port, the deploy command, the database structure, the environment variables. It passes all of that to the coding agent alongside the actual task. The coding agent starts with full context instead of zero context.

This sounds like infrastructure. It is infrastructure. But it's the infrastructure that makes delegation real. Without it, every task is a conversation. With it, every task is an assignment.

This is Stage 1, happening now

In the original thesis, I described eight world changes. Stage 1 — "The babysitting ends" — was defined like this:

You stop vibe-coding alongside the agent and start assigning outcomes. The agent can move through a real repository without thrashing the architecture... The practical difference is simple: you're no longer watching the work happen. You're reviewing it after it happened.

That's not a prediction anymore. That's what happened on my machine yesterday.

But here's what I didn't fully appreciate when I wrote that thesis: Stage 1 doesn't happen because models get smarter. It happens because the orchestration layer gets built. The models were already capable enough. What was missing was the management layer — the agent that sits between you and the coding tools, doing the work that a senior engineering manager does:

  • Decompose ambiguous requirements into concrete tasks
  • Provide full context to the executor
  • Review the output against the intent
  • Handle deployment and verification
  • Report back with results, not questions

Claude Code is a brilliant developer. But Claude Code without an orchestrating agent is a brilliant developer who just walked into the office on their first day and doesn't know where anything is. Project Vault is the onboarding doc. OpenClaw is the engineering manager.

Why this isn't just "better tooling"

I want to be careful here, because it would be easy to dismiss this as "nice workflow improvement." It's not. It's a phase change.

In vibe coding, the unit of work is a prompt-response cycle. You are the scheduler. You decide what happens next. The model executes one thing at a time, and you evaluate each result before proceeding.

In Agentic Development, the unit of work is an outcome. "Build the speaker page." "Fix the export bug and deploy." "Add WhatsApp message tracking to the analytics dashboard." The agent decomposes, schedules, executes, evaluates, and iterates — potentially across multiple sub-agents running in parallel — and the human sees the result.

The difference is the same as the difference between a founder who writes all the code themselves and a founder who runs an engineering team. Both can ship software. One scales. One doesn't.

And it exposes something I think the industry hasn't fully reckoned with: the value of AI in software development isn't code generation. It's development management. The code generation is a commodity. GPT-4, Claude, Gemini, Codex — they all write serviceable code. The differentiation is in everything around the code: planning, context management, deployment, testing, iteration, memory.

The models are the developers. The orchestration layer is the engineering org. And right now, almost everyone is trying to hire better developers (bigger models, better benchmarks) when what they actually need is better management.

Mapping to the eight stages

Here's how what I've been building maps back to the eight world changes:

Stage 1 — The babysitting ends ✅ Happening now

This is exactly what Agentic Development delivers. The agent plans, builds, deploys, and tests without you watching. You review outcomes, not process. The practical milestone from the thesis — "you describe the feature at outcome level, hit enter, and come back to a single cohesive pull request" — is real. I do this multiple times a day.

Stage 2 — The demo dies 🔶 Partially happening

The output quality is noticeably better when an orchestrating agent manages the coding agent, because the orchestrator catches the "demo smell" — missing error states, incomplete flows, inconsistent patterns — and sends the coder back to fix them before you see the result. But we're not fully there yet. The last 30% still needs human review for genuinely complex UX.

Stage 3 — Integrations become copy-paste 🔶 Enabled by project memory

This is where Project Vault starts to shine unexpectedly. When the agent knows every project's API endpoints, auth mechanisms, data formats, and credentials, "connect Project A to Project B" becomes a much shorter conversation. The agent already knows both sides. Integration work that used to require digging through documentation becomes: "The conference tracker needs to pull speaker data from the content engine." The agent knows both systems. It writes the connector.

Stage 4 — Deployment becomes a button ✅ Happening now

For my setup, deployment is already fully agent-managed. The orchestrator knows how each project deploys — Vercel, Fly.io, local launchd service, Cloudflare tunnel — and handles it as part of the build cycle. No separate deployment ritual. No "now let me figure out how to get this live." It's just part of the flow.

Stages 5-8 — Scale, trust, self-maintenance, custom SaaS — these are still ahead. But the foundation being laid now — structured project memory, agent-managed development cycles, automated testing and deployment — is precisely the infrastructure those later stages will build on.

The name problem

I keep coming back to the fact that we don't have a word for this yet.

"Vibe coding" stuck because it perfectly captured a feeling: you're vibing with the AI, riffing back and forth, and code appears. It was fun. It was accessible. It was a great brand for a real workflow.

But vibe coding describes a conversation with a compiler. What I'm describing is a conversation with a developer — one who has their own memory, their own tools, their own judgment, and the ability to manage other developers underneath them.

"Agentic Development" is my working term. It's clunky. I know it's clunky. But it captures the essential shift: the agent isn't assisting your development process. The agent is the development process. You're not coding alongside AI. You're directing AI that codes, tests, deploys, and verifies — while you do something else.

Maybe the right name hasn't been coined yet. Maybe it'll be something catchier. But the phenomenon is real, it's happening now, and it's going to eat vibe coding the same way vibe coding ate traditional development workflows.

What this means for everyone else

If you're a developer: the skill that matters now isn't prompt engineering. It isn't knowing which model writes the best React. It's knowing how to structure an agent system that can manage development autonomously. How to build project memory. How to design the orchestration layer. How to create the feedback loops that let agents self-correct.

If you're a founder: the question isn't "which AI coding tool should I use?" It's "how do I give my AI system enough context about my business that it can build without me explaining everything from scratch every time?" The bottleneck isn't the model. It's the model's knowledge of your world.

If you're building AI tools: stop optimising for single-turn code generation benchmarks. Start building the management layer. The model that wins isn't the one that writes the best function — it's the one embedded in a system that can ship a feature end-to-end without human intervention.

The uncomfortable truth

Here's the thing that makes some people uncomfortable: this direction leads somewhere specific.

If agents can plan work, if they understand your codebase and infrastructure, if they can deploy and test and iterate without you — then what, exactly, is the human doing?

The answer is the same thing that the best engineering leaders have always done: deciding what to build and why. Setting priorities. Making taste calls. Defining what "done" means. Choosing which problems matter.

In my original thesis, I wrote that at Stage 8, "what becomes scarce isn't engineering — it's clarity of intent, good taste, and governance." I thought that was a 12-month prediction. I'm starting to think it's a 12-week reality for anyone willing to build the orchestration layer.

The agents are already good enough. The models are already capable enough. What's missing isn't intelligence. It's infrastructure — the project memory, the deployment knowledge, the testing loops, the management layer that turns a brilliant but amnesiac code generator into a functioning development organisation.

That infrastructure is buildable today. I know because I built it.


This is Part 4 of the "Software as a Request" series. Part 1 laid out the eight world changes. Part 2 explored the death of fixed software. Part 3 was about giving AI agents their own phone numbers and why context separation changes everything. This piece is about what happens when the orchestration layer matures — and why vibe coding was just the beginning.

I'm Nick Halstead. I built TweetMeme (acquired by Twitter), DataSift, and InfoSum. Now I'm building AskGVT and exploring the future of AI-native software. Follow me at future.gvtlabs.ai or subscribe on Substack.