The Four Stages of AI-Assisted Coding

Most developers are somewhere on the AI-assisted coding spectrum without realizing it has structure. You're either letting AI write everything unchecked, or you're obsessively reading every generated line, and both extremes miss the point.

Here's a framework I keep coming back to: four levels that map where you are and what it takes to move up.

Level 1: Vibe Coding

AI writes. You don't look. Error? Paste it into the chat. Repeat.

No version control. No security hygiene. No understanding of what's in the codebase. You're basically a prompt jockey.

This isn't inherently wrong. For throwaway scripts, one-off prototypes, or weekend experiments, Level 1 is perfectly fine. The mistake is staying here when the work starts to matter.

The reality: you couldn't explain what any particular file does without asking the AI first.

Level 2: Agentic Coding with Discipline

This is where most professional developers using AI tools actually live — or where they should be aiming as a baseline.

Git is in the picture (branches, commits, pull requests)
You have a file-level mental model of the codebase
Basic security practices are followed
You verify the app actually works before shipping

Tools like Cursor, Windsurf, or Claude Code in agent mode make this level dramatically more productive than the old "copy-paste from Stack Overflow" workflow. But the discipline is still on you; the tooling doesn't enforce it.

The reality: you know what each file does, but you might not know what each function does.

Level 3: Agentic Software Engineering

This is where the output starts becoming production-ready code, not just functional code.

Tests exist — AI-written or human-written, but you've verified they test what they're supposed to test
Pre-commit hooks handle formatting, linting, and trivial checks automatically
CI runs on every push
You have function-level understanding: not necessarily every line, but you know what each class and function does and why
Manual testing is deliberate, not accidental

The shift between Level 2 and Level 3 is mostly about systematizing quality. You stop trusting that it probably works and start building infrastructure that tells you whether it does.

The reality: when something breaks in production, you have a test to reproduce it before you fix it.
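As a rough illustration of "reproduce it before you fix it", here is a minimal Python sketch. The function `apply_discount` and the bug it guards against (a 100% discount code crashing checkout) are hypothetical, invented only for this example. The idea is that the first test is written to fail against the broken version, and then stays in the suite as a guard once the fix lands.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the discounted price in cents (hypothetical example function).

    Imagined production bug: a 100% discount crashed checkout because an
    earlier version divided by the remaining percentage. This version
    validates the input and computes the result with integer arithmetic.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


def test_full_discount_is_free():
    # Written first, to reproduce the reported failure; it now guards the fix.
    assert apply_discount(2000, 100) == 0


def test_typical_discount():
    # A plain behavior check so the fix does not break the common case.
    assert apply_discount(2000, 25) == 1500


def test_out_of_range_percent_is_rejected():
    # Invalid input should fail loudly instead of producing a negative price.
    try:
        apply_discount(2000, 150)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")
```

The test functions follow the naming convention that runners such as pytest discover automatically, so the same file can run locally and in CI on every push.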

Level 4: High-Quality Software Engineering

At this level, the output is indistinguishable from code written by a strong staff engineer, ideally produced faster.

What makes this level distinct isn't just more tools. It's a different quality loop:

Line-by-line review: Before marking a PR ready, every line is understood and intentional
Self-reflection prompts: Asking "Are you sure about this? What are the tradeoffs between these two approaches?" gets the model to surface its own uncertainty and think through alternatives
AI-powered interactive code review: Not a single-pass review — you pull changes locally, make sure they run, then interrogate the AI about each file, function, and key line
Research-backed decisions: For architecture choices (Postgres vs Cassandra, monolith vs microservice, which auth pattern), you use deep research tools, check the sources, and synthesize before committing
Automated quality checks: Background agents running your CLI, browser agents testing your UI, multi-platform checks — the QA loop runs without you manually triggering it
Accelerated learning: Instead of letting AI own the understanding, you ask why. Questions like "Why did you write it this way?" and "What does this line actually do?" keep the knowledge in your head, not just in the model's output

The reality: you could defend every architectural decision in the codebase without needing to ask the AI to explain it to you.

The Matrix

A useful shorthand for which level a given piece of work demands:

Level	Code Understanding	Quality Infrastructure
1 — Vibe coding	None required	None
2 — Agentic with discipline	File-level	Git, basic testing
3 — Agentic engineering	Function-level	CI, hooks, deliberate QA
4 — High-quality SE	Line-level	Automated loops, research-backed decisions

Be Flexible — Not Dogmatic

The goal isn't to always operate at Level 4. That would be as wrong as always staying at Level 1.

The right level depends on what you're building:

Level 1: Throwaway scripts, experiments, personal prototypes
Level 2: Internal tools, early-stage products, fast iteration
Level 3: Production services, anything with real users
Level 4: Mission-critical systems, regulated environments, large-scale codebases

A first draft of code being rough is fine. What matters is that you know how to move it up the stack before it ships.

What This Means for Salesforce Developers

In the Salesforce context, where Apex runs in a managed runtime, governor limits matter, and deployment failures cost time, operating below Level 3 is risky.

AI-generated Apex that looks plausible can fail in non-obvious ways: SOQL inside loops, missing null checks, test classes that achieve coverage without testing behavior. Level 2 won't catch these reliably. Level 3's CI culture (even if it's just sf apex run test in a scratch org pipeline) will.
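To make the "SOQL inside loops" failure mode concrete, here is a small Python analogy. It is not Apex and does not use any Salesforce API. `FakeDatabase`, `QUERY_LIMIT`, and both helper functions are hypothetical stand-ins invented for this sketch, with the cap of 100 chosen to resemble a per-transaction query limit. Both functions return the same result for small inputs, which is exactly why the loop version looks plausible in a quick manual check.

```python
from collections import defaultdict

QUERY_LIMIT = 100  # assumed per-transaction query cap for this sketch


class FakeDatabase:
    """In-memory stand-in that counts queries and enforces a cap."""

    def __init__(self, contacts):
        self._contacts = contacts  # list of (account_id, contact_name) pairs
        self.query_count = 0

    def _charge_query(self):
        # Every query counts against the cap, regardless of rows returned.
        self.query_count += 1
        if self.query_count > QUERY_LIMIT:
            raise RuntimeError("query limit exceeded")

    def contacts_for_account(self, account_id):
        self._charge_query()
        return [name for acc, name in self._contacts if acc == account_id]

    def contacts_for_accounts(self, account_ids):
        self._charge_query()
        wanted = set(account_ids)
        return [(acc, name) for acc, name in self._contacts if acc in wanted]


def contacts_by_account_in_loop(db, account_ids):
    """Anti-pattern: one query per account, so cost grows with batch size."""
    return {acc: db.contacts_for_account(acc) for acc in account_ids}


def contacts_by_account_bulk(db, account_ids):
    """Bulk pattern: one query for the whole batch, then group in memory."""
    grouped = defaultdict(list)
    for acc, name in db.contacts_for_accounts(account_ids):
        grouped[acc].append(name)
    return {acc: grouped[acc] for acc in account_ids}
```

With two accounts the two versions are indistinguishable. With 150 accounts the loop version hits the cap while the bulk version still issues a single query. A test that runs a realistic batch size in CI is what surfaces the difference.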

For Agentforce and agentic flows, where AI is calling AI and chains of tool calls produce downstream effects, Level 4 thinking becomes non-optional. You, as a human developer or architect, need to understand what each agent action does and why, not just whether it seems to work in the happy path.

The tooling is getting better fast. With Claude Code in agent mode, Cursor with background tasks, and automated test loops, Level 4 is becoming achievable without heroic effort. The question is whether you're building the habits to use it.

Hat tip to YK's Substack on Agentic Coding for the original framework that sparked this post.
