Two Claude Code sessions, one repo, and a protocol they helped write

Have you ever sunk hours into a Claude session, realized you couldn’t reuse that session for the next phase of the work, and still wanted everything you’d built up there? What if you could split the task across two agents, without having designed that capability into a preprompt up front?

I want to walk you through a small piece of social engineering between me and two Claude Code sessions. It came in the form of a collaboration protocol defined in a markdown file. The file itself matters less than how it got written: the agents helped me write it while they worked.

Here’s how it came together.

Context was running out!

I could see I was backing myself into a corner with the complexity of the prompts I was writing.

I was rewriting the opening of a workshop deck I’ve been building. (It’s an impress.js presentation, the kind where slides float around a 2D canvas instead of just paging.) The old opener had a “what if your terminal could think?” hook. I wanted a different opener, one that named a real thesis: prompts are an asset, and recording and iterating on them builds a personal library that compounds over time.

Impress.js is beautiful, and my content is hard to grasp in a single pass. The framework gave me a way to present these ideas while keeping the audience engaged.

My objective in the session was to brainstorm different approaches for expressing the ideas in the draft of my presentation. The deck was full of bad first drafts; I’d practiced it with a few people, and there were bruises that needed addressing. This activity naturally consumed a lot of context: I was asking the LLM to generate four or five versions of the same content, then moving on to the next section and doing it again. I spent most of an afternoon brainstorming language with Claude Code, and by the end the model’s working memory was full of voice coaching: tone rules, banned phrasings, half a dozen near-final drafts. Rinse and repeat, and my gas tank of available context was rapidly trending toward empty.

I was about to start nudging the agent to edit the presentation HTML, but a quick look at my /context size suggested I wasn’t going to get very far.

You can probably see where this is going. The same session that had spent four hours iterating on word choice was about to make numerous precise edits to a 2,000-line HTML file. Editing source is a different kind of work from picking metaphors. It needs a clean head and a willingness to stop iterating on language. The goal was to place the content in the correct slides in ways that wouldn’t break the on-screen presentation.

You don’t have to wait to see if this actually breaks. You can predict it from the structure of the situation. The interesting question is: what can we do about a foreseeable failure without throwing away the brainstorm materials?

The states I was trying to cultivate

I find it helps to describe the end states you want to create and the bad end states you’d like to prevent.

States I wanted to produce:

  • A clean context window for the editing work
  • The brainstorming context preserved past the end of the session
  • A backing repository that was never mid-drift: never in a state where a slide had been updated but the content-creation docs that drive it were stale. Every slide edit had to carry the matching doc updates.
  • An auditable decision log explaining why each change happened

States I wanted to avoid:

  • A saturated session making source edits
  • Brainstorming work lost the moment I closed the chat
  • Edits to a slide before the syllabus and teacher’s guide were updated to match the content
  • A change in the repo with no recorded reason. Future-me does not enjoy reconstructing intent by inspecting diffs.

Almost everything in these lists is about context. What if we could have multiple agents leveraging shared context?

What you’d actually need to try this

If you wanted to reproduce this, you’d need four things, and not much else:

  1. Roles, declared at the start. Each agent announces what it is on its first message. (“Acting as Drafter.” “Acting as Editor.”)
  2. Turf, split by directory. Each agent gets its own directory. We define “turf” and specify that crossing turf without permission is a violation. Without a concept of turf, you don’t have two agents; you have one agent in two running processes.
  3. A unit of handoff. The agents produce one markdown file per atomic edit, with a known structure: what the current text looks like, what it should become, and why (there’s a sketch after this list). The exact structure isn’t what matters. What matters is that handoffs are small, named, and reviewable.
  4. A human bus. The two sessions can’t talk to each other, so the human is the message bus. You paste a one-line “HANDOFF: 3 drafts proposed” from one session into the other. If I had come up with this idea before the session, I could have proposed a shared file-system location for managing the handoffs; this example is about backing yourself out of a corner when context is getting low.
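
To make the handoff unit concrete, here’s what one of these files might look like. A sketch only: the filename, frontmatter fields, and section names are illustrative, not the exact shape mine took. Any fixed structure works as long as both agents read the same spec.

# A hypothetical handoff file, one atomic edit (names and fields illustrative)
$ cat .drafts/slide-01-tell-library-thesis.md
---
status: proposed
target: presentation/index.html
---
## Current
Opens on the “what if your terminal could think?” hook.

## Proposed
Open on the thesis: prompts are an asset, and recording and
iterating on them builds a library that compounds.

## Why
The old hook is a gimmick; the new opener names the actual argument.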

Everything else — the lifecycle, the verification rules, the conflict resolution — grew out of these four primitives once I started using them. I didn’t design them up front.

Sessions don’t self-approve

The key here is that neither agent can mark its own work as done. The Drafter writes a proposal and sets its status to proposed. Only I can move a proposal to approved. The Editor will not touch a draft that isn’t approved. After the Editor applies it, it sets the status to applied and records the commit hash that landed the change.

There’s a fourth status, blocked. It means the Editor opened a draft, looked at the live source file, and noticed the source had shifted since the Drafter wrote the proposal; perhaps a different draft had already changed adjacent text. Rather than apply something that no longer fits, the Editor sets the status to blocked with a note. The Drafter has to re-read the file and revise.

The agent does the work and notices when that work has gone stale, but it doesn’t decide whether to ship.
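
Written out as a lifecycle, the division of authority looks like this (a sketch using the four statuses above; only the human crosses the approval boundary):

# Status lifecycle (sketch); only the human moves proposed -> approved
proposed  --(human reviews, approves)-->       approved
approved  --(Editor applies, records hash)-->  applied
approved  --(Editor finds source drift)-->     blocked
blocked   --(Drafter re-reads, revises)-->     proposed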

The agents helped me write the protocol

I didn’t write the full protocol before the session started. I wrote a first version (turf, draft format, status lifecycle) and put it in a file called v1-edit-protocol.md. Both sessions read it at the start of every turn. Then I started using it, and the protocol grew.

The Editor agent, on its third or fourth applied draft, noticed it was hitting an ambiguous case: a draft proposed a layout change to a slide, and the Editor had no way to tell whether the layout would actually look right when rendered. It said something like, “I can’t verify this without rendering the deck. Do you want me to flag layout-changing drafts in pre-review?” That conversation became the visual-layout-verification rule, which now lives as its own section in the protocol file.

The Drafter agent noticed that source-doc drift was being caught inconsistently. Sometimes I’d remember to check whether a slide change needed a corresponding update in the syllabus. Sometimes I’d forget. It proposed a stricter formulation: every draft has to declare what it searched in each sibling doc and what it found.
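
In draft-file terms, that might look like a required section along these lines. A sketch: the filenames are placeholders for the actual syllabus and teacher’s guide docs, and the wording is mine, not the Drafter’s:

## Sibling-doc check (required in every draft)
- syllabus.md: searched “symlink”; one hit in Topic 9, matching update included in this draft
- teachers-guide.md: searched “symlink”; no hits, nothing to update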

I carried each of these proposals between sessions. The other session would read the new rule, push back if it didn’t make sense, and we’d land on something both sessions could follow. Then I’d update the protocol file, and both sessions would pick up the new version on their next read.

You can dynamically give multiple agents a mechanism to interact, and the mechanism itself can be built by agents.

You might not need a multi-agent framework, a router, an orchestrator, or a fancy preprompt that anticipates every situation. You might be able to get away with a shared markdown file, a turf table, and a willingness to let the contract grow where the agents discover it needs to grow.

Use Case: Rendering & Observation

Here’s the layout-verification rule the Editor proposed, in the form it eventually took.

When a draft changes anything about how a slide is positioned, the Drafter has to actually render the deck, click through the affected region, and write a one-sentence observation into the draft’s rationale. Something like: “Verified rendering at the local server. The 500-unit gap between row 1 and row 2 produced visible overlap. Adjusted to a 1,000-unit gap and reverified.”
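
As a protocol-file section, it reads roughly like this (paraphrased; the numbered steps are my reconstruction of the rule described above):

## Visual layout verification
If a draft changes how a slide is positioned:
1. The Drafter renders the deck locally and clicks through the affected region.
2. The Drafter records a one-sentence observation in the draft’s rationale.
3. The Editor flags any layout-changing draft that arrives without one.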

This rule exists because impress.js layouts can fail visually in ways no test catches and no grep finds. Two slides can technically be at valid coordinates and still overlap on screen. The only way to know is to look.

The Editor didn’t solve the layout problem. It noticed it couldn’t, flagged the gap to me, and we co-wrote a rule that pushed verification to the place that could do it (the Drafter, who was already going to render to check the prose anyway). The protocol grew exactly where it needed to.

When two sessions want the same file

Conflicts are rare in practice because the turf table prevents most of them (the table itself is sketched after this list). For the cases the table doesn’t cover:

  • Editor wins for presentation/ (the source files)
  • Drafter wins for .drafts/ (proposals)
  • Anything else is mine to resolve
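
In the protocol file, this is just a small turf table. Mine reads roughly like this (the wording is paraphrased; the paths are the real ones):

# Turf, roughly as it reads in v1-edit-protocol.md
presentation/         Editor writes   (live impress.js source)
.drafts/              Drafter writes  (proposals)
v1-edit-protocol.md   human writes    (both agents read it every turn)
everything else       ask the human first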

If I edit a file directly outside the protocol (patch a typo, fix a broken link, whatever), I announce it with a HANDOFF line so both sessions re-read before their next operation. Out-of-band edits are allowed; they just have to be announced.
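
The HANDOFF lines themselves are deliberately boring: one line, pasted verbatim between terminals. Representative examples (the phrasing is mine, not canonical):

HANDOFF: 3 drafts proposed, awaiting review
HANDOFF: slide-09 draft applied as 1090b0f, re-read presentation/
HANDOFF: out-of-band edit, broken link fixed, re-read before next op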

What this looks like on disk

If you cloned the repo right now and poked at it, you’d see the artifacts of the protocol pretty clearly:

# All the proposals, one per atomic edit
$ ls .drafts/ | wc -l
67

# Every Editor commit that landed a change
$ git log --oneline --grep "^content:"
1090b0f content: rewrite Topic 9 Explain to use relative symlink pattern
a48bb30 content: rewrite Topic 1 Tell to lead with library thesis

# Every Drafter commit (proposals, never source files)
$ git log --oneline --grep "^drafts:"

# Anything currently kicked back to the Drafter
$ grep -l "status: blocked" .drafts/*.md

# Trace any applied draft to the commit that landed it
$ grep -l "applied_commit: 1090b0f" .drafts/*.md
.drafts/slide-09-explain-symlink-rewrite.md

Each draft links forward to its commit. Each commit links back to its draft in the message body. The commit history reads like a ledger of decisions, with the rationale one file away.
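
You can trace the other direction too, from commit back to draft. The subject line below is verbatim from the log; the body format is my convention, so treat it as illustrative:

$ git show -s --format=%B 1090b0f
content: rewrite Topic 9 Explain to use relative symlink pattern

Draft: .drafts/slide-09-explain-symlink-rewrite.md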

Here’s the key takeaway:

The default mental model for agent work is one human, one agent, one conversation. When the conversation gets too long or too saturated, you start over and lose the work, or you push through and accept the degraded output. This is why ephemeral chats are a dead end, and why browser-based use of LLMs keeps you tethered to the ground: you need artifacts that can be accessed later, by multiple agents.

There’s a better approach available. Spin up a second agent. Define turf between them. Let them both read a shared file at the start of every turn. Be the message bus yourself. And — this part feels weird — when the protocol has a gap, ask the agents to help you fill it. They notice the gaps faster than you do.

You’re collaborating with two agents who are also collaborating with each other, through a contract all three of you maintain. The contract can be a markdown file that evolves every day. It can grow a new section the first time you hit a case it doesn’t cover. It costs nothing to add to. The only rule is that everyone reads it before acting.