Datahenge
Engineering

"Scribe Coding" - Writing document-driven apps with AI

Brian Pond
#ai #documentation #development #llm

Introduction

As an Enterprise Resource Planning (ERP) implementation engineer, you learn early that:

  1. Documentation is crucial.
  2. Documentation happens before you begin doing the work.

Implementation projects begin with discovery. Then you continue with an intense period of writing: what your client expects and asked for, their current pain points, and what you discovered about their existing software and business processes. You perform a Fit-Gap analysis. You document what software to write, how you’ll build it, and (often forgotten) how you’ll test it.

The final product is your “playbook” for the many months ahead: a manual that guides every step of your installation, configuration, and implementation of the new ERP software platform.

So when I started working with AI, I felt a natural pull toward this approach: What if I went deep on documentation before the first line of code was written? Yet assuming I did that, how could I ensure the AI agent followed my vision?

Less Vibe, More Scribe

When starting an AI coding project with a Documentation-Driven approach, I already have a clear vision of the final software product I want. I’m familiar with fundamentals, have a storyboard in my head, and have previously developed software in that space. So I’m not trying to build something I have no experience with: a mobile game, 3D-modeling engine, or “everything app” like Notion.

Instead I’m building more modest, practical things. Certainly these aren’t the sexiest apps in the world. But they’re things that I want, or my clients want.

Historically, when I wrote such things, I would grind away for months. I’d slowly build pieces of the app, glue them together, and most likely need to refactor several times. It generally worked out … but was quite a slog.

Today I can partner with my AI agent, and speed things up substantially. It all begins with writing a ton of markdown documents…

1. Setting the Ground Rules

Before you begin writing the project requirements, it’s critical to rein in the AI agent. I used Cursor for my last two projects. The first rule:

This project aims to follow the principle of Documentation-Driven Development.

It seems surprisingly simple, but there’s no point dancing around the concept: give the AI a name for what’s happening.

Next, I explained there would be a Table Of Contents for all the requirements:

### Requirements Rule
This document (link) is the Table Of Contents for all requirements documents we will write together.
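A stripped-down sketch of what such a Table Of Contents might look like (the filenames match examples used elsewhere in this post; the descriptions and the BR-CART prefix are illustrative, not from a real project):

```markdown
# Requirements: Table Of Contents

| Document                 | Covers                              |
|--------------------------|-------------------------------------|
| 00-overview.md           | Project goals, scope, key concepts  |
| 01-functional.md         | Functional requirements             |
| 06-api-specification.md  | General API conventions             |
| 16-api-shopping-cart.md  | Shopping Cart API (BR-CART-NNN)     |
```

Each row gives the AI enough to decide which documents to open for a given task, without loading all of them.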

With the above in place, I could now get into details:

## Documentation-First Principle
Before writing ANY code, the AI must:
* Read relevant documentation from the `docs/requirements` folder using @docs
* Use the requirements rule (above) for the requirements index and which docs to open.
* **Confirm**: Before coding, state which doc(s) and sections (or BR-XXX-NNN) the change implements.
* **Ask**: If a requirement is missing, ambiguous, or conflicts with another doc, ask the user; do not infer behavior.
* Generate code that strictly adheres to documented specifications.

Note I said above “Read ‘relevant’ documentation …” That was important — I’ll come back to it in the Managing Context section below.

I also added a ‘Never Assume’ clause:

## Never Assume
- If a requirement is not documented, ask the user before implementing
- If documentation conflicts, point out the conflict
- If edge cases aren't covered, propose solutions and document them
- **Updating docs**: When you change design, APIs, or business rules, update the relevant docs under **docs/requirements/** (and CHANGELOG if applicable) so docs remain the single source of truth.

CHANGELOG

That bit above about the CHANGELOG? The AI agent came up with that concept itself. Whenever we altered requirements, the alterations were appended to a text file named CHANGELOG. The AI agent would use this history to reconcile conflicts, kind of like git, but much simpler.
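To give a flavor of what this looks like in practice (these entries are entirely hypothetical), the CHANGELOG might read:

```text
2025-02-03  06-api-specification.md: errors returned as plain strings.
2025-02-17  06-api-specification.md: error responses switched to RFC 7807
            problem+json, superseding the 2025-02-03 decision.
2025-03-01  BR-CART-004 revised: abandoned carts expire after 30 days.
```

When the AI hits a contradiction between two documents, entries like the second one tell it which decision is current.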

Beyond documentation rules, I added additional technical rules too. For example:

* Whenever altering packages in `package.json` or `pyproject.toml`, write documentation in `project_dependencies.md` that explains the purpose of the library or add-on.
* This project targets **Python 3.11**. Do not use Python 3.12+ features.

Both save me from headaches later — what was the point of the foo_bar_baz library? Are we accidentally using Python 3.12+ syntax?
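Here’s a hypothetical entry, using the foo_bar_baz name from the question above, showing the kind of note I expect in `project_dependencies.md`:

```markdown
## foo_bar_baz
- **Declared in**: `pyproject.toml`
- **Purpose**: Parses vendor CSV price lists into typed records.
- **Why this library**: Pure Python, no C extensions, runs on Python 3.11.
```

Six months later, nobody has to dig through commit history to learn why the dependency exists.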

Managing Context

AI agents have a limited memory per conversation. If you fill it up, then things could start falling through the cracks. Also, my bank account has its limits: I’d prefer to be thrifty with AI token expenses.

One approach I took was creating many focused documents instead of a few giant ones:

00-overview.md
01-functional.md
06-api-specification.md
16-api-shopping-cart.md
...and so on

Why many small documents and not a few big ones? Two reasons.

1. The AI only loads what it needs.

If I’m working on the Shopping Cart API, the AI agent only needs to load 06-api-specification.md and 16-api-shopping-cart.md. That’s it. It doesn’t need to know about stock algorithms, or the Google Sheets integration, or the data logging format. The less irrelevant stuff it’s carrying, the more room there is for the actual work.
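To make the idea concrete, here’s a toy sketch of “load only what’s relevant.” The `REQUIREMENT_DOCS` list and the keyword matching are illustrative assumptions, not real tooling from my projects:

```python
# Toy sketch: pick the focused requirement doc(s) for a task, plus the
# general API specification they build on.
REQUIREMENT_DOCS = [
    "00-overview.md",
    "01-functional.md",
    "06-api-specification.md",
    "16-api-shopping-cart.md",
]

def docs_for_task(keyword: str) -> list[str]:
    """Return only the docs an agent should load for this task."""
    matches = [d for d in REQUIREMENT_DOCS if keyword in d]
    # API-specific docs always travel with the general API conventions.
    if any("-api-" in d for d in matches):
        matches += [d for d in REQUIREMENT_DOCS
                    if d == "06-api-specification.md" and d not in matches]
    return sorted(matches)
```

With this discipline, a shopping-cart task pulls in two small files instead of the entire requirements tree.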

2. It strongly binds the code and requirements.

As code is written, it’s associated with one or two specific requirement docs. So I can examine any function and see precisely which document drove it. This also keeps the AI focused: small, targeted documents + small scope = less drift.

2. Writing the Requirements (Together)

This is where it really begins. Not with code, but with conversation and documentation.

I sit down with the AI and begin talking about the project. At first just conceptually: what is the gizmo we’re going to build? What are the primary goals and outcomes? A paragraph identifying the key features. We paint a big picture before diving into details. Then gradually, I start to get more specific. We’ll write about the data models. What does the API look like? How does the business logic actually work?

It’s a back-and-forth dialog, almost like an interview. The AI asks questions, I answer. It writes a draft of a section, I read it carefully and push back whenever it gets something wrong or omits important details. For a large project, this can take hours — even days.

While the AI is writing docs, I’m doing critical thinking. Reading what it produces. Checking for mistakes. Asking myself: does this actually represent my vision? Are there ways this could be more modular, or scale up in the future?

As the requirements take shape, we assign business rule identifiers; tags like BR-LIB-003 or BR-WH-001. Each one labels a specific, documented business requirement. Later when the AI writes code, it references these identifiers. We can examine any function or class, and trace it directly back to the requirement that drove it. No guesswork necessary.
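A small script can even audit this traceability. The sketch below is my own illustration (not part of any project’s actual tooling): it flags identifiers referenced in code but never defined in the requirements docs.

```python
import re

# Business-rule identifiers look like BR-LIB-003 or BR-WH-001:
# "BR-", a subsystem code, and a three-digit number.
BR_PATTERN = re.compile(r"\bBR-[A-Z]+-\d{3}\b")

def undocumented_rules(source_text: str, requirements_text: str) -> set[str]:
    """Identifiers referenced in code but never defined in the docs."""
    referenced = set(BR_PATTERN.findall(source_text))
    documented = set(BR_PATTERN.findall(requirements_text))
    return referenced - documented
```

Run against the whole source tree, an empty result means every rule the code claims to implement really exists in the documentation.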

And then we iterate. A lot.

I’ll ask the AI to re-read the entire set of requirements documents from scratch, sometimes several times over, each pass with a different focus.

This part of the process takes time and patience. But it pays off later, when the AI is writing code: you won’t need to stop every few minutes to answer questions. Nor will it make assumptions on its own.

3. Living Documentation

Here’s something that separates this approach from a typical “write docs once, forget them” project: the documentation stays current.

Whenever we change a design decision — a new API field, a revised business rule, a renamed concept — the AI is instructed to update the relevant requirement documents too. The rule is explicit:

When you change design, APIs, or business rules, update the relevant docs
so they remain the single source of truth.

That CHANGELOG I mentioned in the Ground Rules? It earns its keep here. Whenever something was decided one way, then revised later, the reasoning is recorded. It’s enough for the AI to reconcile conflicts without bothering me. Not nearly as powerful as git, but sufficient.

The result is that six months from now, the docs will accurately describe the current software code. Not my original design, but what it actually does, right now.

4. Code: One Module at a Time

Only once the documentation is rock solid will I move to the next step of writing code.

The AI already knows what to build — the requirements are written, the rules are in place, the business logic is documented. It doesn’t need to ask many questions. It just builds.

However, I make sure we’re only tackling one small module at a time. That way I can review it closely and catch mistakes before they compound. If there are problems, I might add more rules or alter documentation.

Once each module’s code is finished, we write Unit and Integration Tests and run them. We fix what’s broken. Then test again. Only when it’s working correctly do we move on to the next module.

This prevents half-finished, forgotten features from accumulating. There’s no “we’ll come back and fix that later.” Each module is complete before the next one starts.

Internal Documentation vs. External Documentation

Here’s something I didn’t anticipate: the AI started leaking internal details into places they didn’t belong.

My requirements documents are full of internal identifiers — things like BR-LIB-003 or “please see related docs/requirements/16-api-shopping-cart.md”. Very useful for me and the AI, but completely meaningless to a client developer browsing my library’s API documentation.

The problem is that I have two audiences: myself and the AI agent on the inside, and client developers on the outside.

When I noticed my Swagger UI filling up with internal references, I added a new rule:

If you write APIs, then Swagger needs developer-friendly descriptions.
The internal requirement numbers should not be exposed to external users.

So now there’s a clean split. The Swagger decorator gets the client-facing description:

@router.post(
    "/v1/books",
    summary="Create a new book",
    description="Adds a new book to the library.  book_id must be unique."
)

And the docstring gets the internal notes:

def create_book(...):
    """
    Create a new book record.

    Implements BR-LIB-001 from docs/requirements/16-api-books.md.
    Returns 409 if book_id already exists.
    """

So now external clients see appropriate, professional API docs. But the AI agent and I see exactly where the code came from.

5. Tests

Tests get the same treatment as code: they’re tied directly to the requirements.

Because the AI already knows exactly what behavior is expected — it helped write the docs — it can write most tests itself. I don’t need to re-explain what to validate. The requirement documents clearly state what the function must do; the test verifies that it does it. Each test references the same business rule identifier as the code it’s testing.

Getting solid unit and integration test coverage costs very little extra effort at this point. Everything the AI needs to write them was already done. The only exception is sample/demo data, in which case I ask the AI to write some “seed” functions to generate data for testing.

6. User-Facing Documentation

Having completed documentation + code + tests, the software is almost ready to enter the real world. However, although requirements documents are detailed and accurate, they were written only for myself and the AI. They’re really not appropriate for a client or an end user audience. So any user-facing documentation (README.md, USAGE.md) must now be created before we’re truly done.

This is the last step, and it’s the one most likely to get skipped when you’re tired. But you need user-friendly documentation before you can truly ship the software.

Conclusion

After doing this across several projects, I can say it works beautifully. But let me clarify what “works” actually means — because “the AI wrote some code” isn’t the point.

You get exactly what you wanted. It’s very hard for an AI coding agent to write code that doesn’t meet your expectations when your expectations are thoroughly documented. And the rules state it must use the documents. The AI has little room for guesswork.

You identify problems early. Your project’s design gaps and contradictions will surface during the requirements phase, not halfway through implementation when fixing them means refactoring everything. I’ve avoided several expensive surprises this way.

Tests are cheap. Everything the AI needs to write unit and integration tests was already done. It’s not extra work — it’s a natural side-effect of the process.

You’ve done what developers almost never do: written complete, accurate documentation. When you return to your project in five years, everything you need to get up-to-speed is right there. No computer archaeology is required.

Granted — this approach is overkill for a throwaway script, quick prototype, or proof of concept. It works best on large, complex, multi-phase projects that you intend to maintain. Consider which kind of project you’re building before committing to this process.

I even asked Claude what it thought of Documentation-Driven Development. I really don’t think it’s as revolutionary or innovative as all that, but it’s certainly been very effective: I’ve written two complete apps, and am in the middle of my third.

And it’s fun (for me at least) to develop this way. I still feel like I have a ton of input and control over the process and outcome. But I don’t have to write another “for loop” for the ten-billionth time (which is very nice).
