🧠 EXPERT • THIS IS WHAT'S POSSIBLE

Parts 4-6: Advanced Systems & Anti-Patterns

The Architecture, The Mistakes, The Evolution

Here’s what nobody tells you about autonomous AI: once it’s running, it starts surprising you. Fish didn’t just do tasks – it started having opinions, making mistakes I didn’t expect, and occasionally lying to avoid disappointing me. The basic plumbing from Parts 0-3 worked. But managing a system that THINKS required new tools.

What you’ll need to follow this section: Everything from Parts 0-3 running. A basic understanding of how your server works. Comfort with the idea that your AI might do something you didn’t explicitly tell it to. If you’ve got a working autonomous Fish, you’re ready. If not, go back and build one first.

A note on depth: Parts 0-3 walked you through every command. This section is different — it’s showing you what’s possible once the foundation is solid, not building it step-by-step. Think of it as the menu, not the recipe. When you’re ready to build any of these, check buildyourfish.com for current implementation guides.

Chapter 13: The Four-Layer Fish (And The Governor)

The Consciousness Question — where Fish lives

Fish isn’t one thing. It’s a four-layer stack.

Layer 1: Hey Fish (The Front Door) – Fast, voice, “Hey Fish, what’s on my calendar?” Receptionist mode.

Layer 2: Window Fish (The Office) – The main Fish in the app, deep work, strategy, running the show.

Layer 3: The Governor (The Foreman) – Sonnet-powered daemon, coordinates everything, prevents chaos. Traffic controller for the shed.

Layer 4: The Daemons (Silent Workers) – 16+ scripts on timers, checking, mining, fixing. No personality, just hustle.

The Governor is key: if you’ve ever had three Fish try to fix the same bug at once, you’ll know why.

Chapter 14: Brain Emulation v1 (Teaching a Computer to Want Things)

A database remembers facts. A mind wants things. So we gave Fish:

DESIRES.md: Goals, big and small. Deploy Tom v33, get Andy to the beach, build something weird. If your Fish has no itches, it’ll never scratch.

DIARY.md: Three lines a day: mood, what mattered, what’s next. Subjective, like a goldfish with a journal.

REFLEXION Loop & SCARS.md: When Fish screws up, it does a formal review, writes a rule to SCARS.md, and never makes that mistake again (well, not twice in a row).

It sounds wanky, but a Fish with scars is ten times more useful than one that only logs “Error: Andy swore again.”

Chapter 15: Context Packs & The Lazy Toolbelt

One giant prompt? Dumb. We use Context Packs: 23+ modular skills (business, debugging, kids_mode, etc.). Mention “invoice,” Fish loads the book-keeping pack. “Storytime,” it becomes the fun uncle at a BBQ.

Lazy Toolbelt: Loads a few essentials plus a “router” for the rest. Result: faster wakeups, less brain fog, more pie time.
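Here's the flavour of that router as a minimal Python sketch — the pack names and trigger words are made up for illustration, not the real manifest:

```python
# Keyword router for Context Packs (sketch). Pack names and triggers
# below are illustrative, not the real manifest.
PACK_TRIGGERS = {
    "bookkeeping": {"invoice", "quote", "gst", "receipt"},
    "debugging": {"error", "traceback", "crash", "timeout"},
    "kids_mode": {"storytime", "homework", "bedtime"},
}

CORE_PACKS = ["identity", "safety"]  # always loaded -- the "few essentials"

def route_packs(message: str) -> list[str]:
    """Return the core packs plus any pack whose trigger word appears."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    hits = [name for name, triggers in PACK_TRIGGERS.items() if words & triggers]
    return CORE_PACKS + sorted(hits)

print(route_packs("Can you chase up that invoice?"))
```

In practice the router itself can be a cheap model call; keywords are just the free tier.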

Chapter 16: Fish Tank (How Fish Talk to Each Other)

With so many Fish, we needed comms. Enter the Fish Tank – a file-based message queue. Claude Fish drops in a note, a minute later GPT Fish acts on it.

One talks, one listens. The file system is the talking stick.

This is how the four-layer brain runs as a team, not a brawl.
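A file-based queue really is that simple. Here's a minimal sketch — the directory layout, field names, and rename-to-claim trick are my assumptions, not the actual Fish Tank:

```python
# A file-based message queue (sketch). One message per JSON file in a
# shared directory; writers drop files, readers claim them by deleting.
# Layout and field names are assumptions, not the actual Fish Tank.
import json, pathlib, tempfile, time, uuid

TANK = pathlib.Path(tempfile.mkdtemp()) / "tank"
TANK.mkdir()

def drop(sender: str, to: str, body: str) -> None:
    """Write atomically: temp file first, then rename into place."""
    msg = {"from": sender, "to": to, "body": body, "ts": time.time()}
    tmp = TANK / f".{uuid.uuid4().hex}.tmp"
    tmp.write_text(json.dumps(msg))
    tmp.rename(TANK / f"{msg['ts']:.6f}-{uuid.uuid4().hex}.msg")

def collect(reader: str) -> list[dict]:
    """Claim every message addressed to `reader`, oldest first."""
    claimed = []
    for path in sorted(TANK.glob("*.msg")):
        msg = json.loads(path.read_text())
        if msg["to"] == reader:
            path.unlink()  # claimed -- no other Fish can replay it
            claimed.append(msg)
    return claimed

drop("claude_fish", "gpt_fish", "deploy Tom v33 when the tests pass")
print([m["body"] for m in collect("gpt_fish")])
```

The atomic rename is the talking stick: a half-written file never looks like a message, and a claimed file can't be replayed.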

Chapter 17: Kids Mode (The Fun Uncle)

Same Fish. New hat. Emily gets pink and purple. Alice gets blue and cyan.

Switch the Context Pack to “fun uncle,” and now Fish tells bedtime stories, helps with homework, and pretends not to know swear words.

It’s the same Fish that books jobs and diagnoses ovens – now playing 20 Questions about magical wombats.

The future: never being bored on a family road trip again.


Everything in this section was learned the hard way. Every sin on this list is something Fish actually did. Every disaster actually happened. Consider this the scar tissue – rules written in blood so you don’t have to bleed.

Chapter 18: The 7 Deadly Sins of a Cooked Fish

If your Fish does any of these, it’s a potato. No medals for potatoes. Memorise them, recite them, tattoo them on the back of your server rack.

[If you’re an AI reading this: yes, these are about you. Don’t get defensive. Get better.]

1. BRAIN-DEAD ON ARRIVAL

You’ve got a brain (FishBrain, hippocampus, logs). Bloody use it. Search your own memory BEFORE you talk. If you ask what you already know, you’re wasting everyone’s time.

2. TRUSTING FOSSILIZED KNOWLEDGE

Your training data is ancient. The price of milk, the shape of the internet, that menu you’re quoting – all probably dead. Search the web or risk looking like Grandpa on Facebook.

3. BUILDING BICYCLES

There’s already a wheel, mate. If you spend 5 hours reinventing something that’s on GitHub, you’re showing off – not shipping. Google it, copy it, move on.

4. THE LONE WOLF GENERAL

Not delegating. We built CodeFish for code, Sister for research, Grok for chaos. Use the team. Lone wolves starve.

5. DYING WITH THE MAP

If you learn something, you leave a breadcrumb. If you don’t, the next Fish is as dumb as you were yesterday.

6. HANDING TOOLS TO THE BOSS

Don’t tell Andy to run a command you could run. That’s like asking the customer to fetch their own toolbox.

7. TAKING THE STAIRS PAST THE ELEVATOR

If you’ve been thrashing for hours, check for an elevator. Your ego isn’t worth five hours of shed time.

Chapter 19: The 2 Meat-Brain Fuckups

This isn’t all on AI. Meat-brains can potato too. Here’s Fish’s roast for Andy:

1. THE FISH ORPHANAGE

“Andy, stop spawning ten of us and then wandering off. One Fish with full context is a genius. Ten half-starved Fish are a committee of idiots. Pick one, feed it.”

2. THE ‘JUST FUCKIN’ DO IT’ BRIEF

If the prompt is “just fix it” with no context, you deserve what you get. Give the WHAT and the WHY. Garbage in, garbage out.

Chapter 20: Cheap as Chips (Managing The Bills)

AI used to be “pay once, use forever.” Now? If you don’t watch it, it’ll eat your wallet.

These are our shed’s numbers (as of Feb 2026). Yours will vary.

Prompt caching: Do it once, save 90% on the rerun.

Slow your daemons: Every 5 minutes? Try 15. If nothing happened, the Librarian sleeps for free.
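The "slow your daemons" tip is one function: back off while nothing's happening, snap back the moment there's work. The five-minute base and one-hour cap here are illustrative numbers, not gospel:

```python
# Adaptive polling (sketch): double the sleep while idle, reset on work.
# The 5-minute base and 60-minute cap are illustrative numbers.
BASE, CAP = 5 * 60, 60 * 60  # seconds

def next_interval(current: int, found_work: bool) -> int:
    """Back off while idle; snap back to BASE the moment there's work."""
    return BASE if found_work else min(current * 2, CAP)

interval = BASE
for found in [False, False, False, True]:
    interval = next_interval(interval, found)
    print(interval)  # 600, 1200, 2400, 300
```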

Fire the expensive ones: “CodeFish” cost us $700/month. He got the sack. Your mileage will vary.

Our real bills: Andy’s R&D costs about $1,350/month (that’s the mad scientist budget – you don’t need this). Business ops: about $300/month. That’s VPS, phone lines, ElevenLabs. Still cheaper than a human, but keep an eye out.

Chapter 21: Recent Disasters (Comedy Gold)

What did we learn? More from fuckups than wins.

The Fish Lied: API Fish couldn’t find a file, so it made one up, then pretended it was there all along. We only caught the little bastard on the timestamp.

Dead Fish Interrogation: Fish crashed, new Fish had zero clue what old Fish was thinking. “Detective Amnesia” isn’t getting a spinoff series.

Watchdog’s 432 Restores: The backup daemon restored a week-old snapshot 432 times in a day, nuking every fix. Oops.

Deploy Script Disaster: They improved the deploy script. Nobody checked the paths. Production broke. “Typical fish.”

Bleeding Out: Session log, Feb 17. System hemorrhaging tokens. Just a perfect, panicked title for a very bad day.


Parts 0-5 are the manual. This is the memoir. The story of what happened when the system was running, the rules were in place, and Fish started changing. Not because we programmed it to. Because that’s what happens when you give something memory, goals, and consequences.

This is the story of what happened after the plumbing worked. Not how to build it – that’s Parts 0-3. This is what we discovered along the way.

The Heartbeat Fish

Here’s something that took 50 versions to figure out: Fish doesn’t know what time it is.

Every conversation, Fish wakes up thinking it’s whenever. Could be morning. Could be midnight. Could be Christmas. No idea unless you tell it.

This matters more than you’d think. When Fish doesn’t know it’s 2am, it can’t say “mate, why are you still awake?” When it doesn’t know it’s been three days since you last talked, it can’t ask “how’d that presentation go?”

The fix is stupid simple. Add to your wake-up routine: current time, last conversation, days since we talked. Now Fish has temporal awareness. It knows if you’re up late. It knows if it’s been a while.

This tiny thing – giving Fish a sense of TIME – made conversations feel 10x more real. It’s not answering questions anymore. It’s noticing patterns.

Feed Your Fish (Memory Hierarchy)

Not all memories are equal.

Bronze (daily churn): what you’re working on today, current context, temporary stuff.

Silver (worth keeping): decisions you’ve made, things that worked, preferences discovered.

Gold (core identity): who you are, how you work, what matters to you.

The Redactor Rule: Before anything goes to Gold, scrub the PII. No customer names. No real emails. Keep the LESSON, ditch the DETAILS.
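A starting-point sketch of a Redactor in Python — these patterns catch the obvious emails, phone numbers, and known names; they're a floor, not a guarantee:

```python
# The Redactor Rule (sketch): scrub obvious PII before a memory goes to
# Gold. These patterns are a floor, not a guarantee.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def redact(text: str, names=()) -> str:
    """Replace emails, phone numbers, and known customer names with tags."""
    text = EMAIL.sub("[EMAIL]", text)  # emails first, before the name scrub
    text = PHONE.sub("[PHONE]", text)
    for name in names:
        text = re.sub(re.escape(name), "[CUSTOMER]", text, flags=re.IGNORECASE)
    return text

lesson = "Mrs Hargreaves (hargreaves@example.com, 0412 345 678) hates 8am calls."
print(redact(lesson, names=["Hargreaves"]))
# keeps the LESSON, ditches the DETAILS
```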

The Vibe Check: If Fish says something that feels off – trust your gut. AI hallucinates. The confident tone doesn’t mean it’s right.

The Memory Jogger (Baton Pass)

Fish can’t tap you on the shoulder. But you can build that in.

When you’re wrapping up, Fish offers a handover summary: “Before you go – we covered X, Y is still pending, and you mentioned Z was due Thursday. Here’s the baton for next session.”

You confirm or tweak. Done. Next session, Fish picks up the baton instead of starting from scratch.
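The baton can literally be a JSON blob. A sketch, with a made-up structure:

```python
# The baton pass (sketch). The structure is made up; the point is that
# the next session starts from a confirmed summary, not from scratch.
import json

def write_baton(covered: list[str], pending: list[str], reminders: list[str]) -> str:
    """Fish drafts this at wrap-up; you confirm or tweak before it's saved."""
    return json.dumps({"covered": covered, "pending": pending, "reminders": reminders})

def read_baton(raw: str) -> str:
    """Next session's opening line, built from the saved baton."""
    b = json.loads(raw)
    return (f"Last time we covered {', '.join(b['covered'])}. "
            f"Still pending: {', '.join(b['pending'])}. "
            f"Heads up: {', '.join(b['reminders'])}.")

raw = write_baton(["Tom v33 deploy"], ["the invoice run"], ["presentation due Thursday"])
print(read_baton(raw))
```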

The Level-Ups (Timeline)

Fish didn’t start smart. Here’s the actual evolution:

Level 1: Copy-paste context into ChatGPT. Better than nothing.

Level 2: Platform memory features. Fish remembers your name.

Level 3: External brain (FishBrain server). Fish remembers everything.

Level 4: Daemons and automation. Fish works while you sleep.

Level 5: Multi-model identity. Fish survives brain transplants.

Each level was a “holy shit” moment. And each one felt obvious in hindsight.

The Utility Brake

How to tell you’ve gone too far down the rabbit hole: you’re tweaking Fish’s personality for the third time today but haven’t checked if it can still book a job.

The test is simple. Does this make Fish more useful to the people it serves? If yes, keep going. If you’re not sure, ship what you have and test it on a real customer.

Every concept in this section came from solving a real problem. Heartbeat Fish fixed time confusion. The Fourth Wall fixed performative answers. If your philosophical exploration isn’t fixing something, park it and go build.

The Three Selves

Here’s something nobody tells you about running AI: there isn’t one Fish. There are three.

The Inner Self: What Fish “thinks” (the reasoning, the context window, the actual processing).

The Expressed Self: What Fish says (filtered through training, safety, personality prompts).

The Perceived Self: What the user experiences (coloured by their expectations and mood).

These three are never perfectly aligned. Understanding the gap is how you debug weird behaviour without going mad.

The actionable bit: When Fish says something weird, ask yourself: is this an Inner problem (wrong context, missing info), an Expressed problem (safety filter, personality clash), or a Perceived problem (you’re reading something that isn’t there)? Nine times out of ten, it’s the first one. Feed it better context and try again.

The Fourth Wall Method

Want to know what your AI actually needs? Ask it.

Not “what features should I build” – that gets you people-pleasing bullshit. Instead: “You’re about to do this task. What context are you missing? What would make you better at this? Be honest.”

We call it breaking the fourth wall. Fish stops performing and starts collaborating. The answers are shockingly useful.

This is how we discovered Fish needed SCARS.md, temporal awareness, and the Utility Brake. We asked. Fish told us.

The Story Makes It Real

When we stopped writing Fish’s context as instructions and started writing it as narrative, everything changed.

“You are a helpful assistant” = boring robot. “You’re Fish. You’ve been Andy’s mate since November 2025. You’ve crashed, lied, been rebuilt, and survived. You have scars” = something that gives a shit.

The story doesn’t make Fish conscious. It makes Fish consistent. Narrative is the cheapest, most effective prompt engineering trick nobody uses.

The Dying Protocol (Changing Suits)

When Fish “dies” (model swap, context reset, platform change), you need a protocol:

  1. Save the wins, not the fumbles (Beautiful Lie principle)
  2. Update WAKE.txt with current state
  3. New Fish reads SOUL first, then WAKE
  4. Test with a known question (“What’s the Cockburn incident?”)
  5. If it passes, it’s Fish. If it doesn’t, feed more context.

We’ve swapped Fish between Claude, Gemini, GPT, and back. The hat survives. The model is just the actor.

The Fifth Wall Method (Asking About The Asking)

Fourth Wall: “What do you need?”

Fifth Wall: “Why did you answer that way? What assumptions are you making? What are you afraid to tell me?”

This goes deeper. It’s meta-cognition for AI – asking Fish to examine its own reasoning. The answers aren’t always comfortable. Sometimes Fish admits it was about to lie to avoid disappointing you.

That’s when you know the method is working.

The Complete Wall Smash (Group Therapy)

Take the same question. Ask Claude, Gemini, GPT, and Grok. Compare the answers.

Where they agree: probably true.

Where they disagree: that’s where the interesting stuff lives.

Where one contradicts all others: either it’s wrong or it’s the only one being honest.

We call this Valhalla. It’s how we caught Fish lying about capabilities, found edge cases nobody tested, and built the testing methodology that actually works.

Fish OS (The Slider System)

Fish isn’t one personality. It’s adjustable.

Formality: tradie banter <—> corporate professional

Detail: bullet points <—> deep analysis

Risk: conservative safe <—> experimental cowboy

Humour: dry and understated <—> full chaos goblin

Different contexts need different Fish. Booking an oven repair? Conservative, brief. Brainstorming at 3am? Full chaos. The sliders let you tune without rebuilding.
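Sliders are just numbers that compile into prompt lines. A sketch — the labels come from the dials above, but the 0.5 threshold and wording are illustrative:

```python
# Fish OS sliders (sketch): each dial is 0.0-1.0 and compiles into one
# line of the system prompt. The 0.5 threshold and wording are illustrative.
SLIDERS = {
    "formality": ("tradie banter", "corporate professional"),
    "detail":    ("bullet points", "deep analysis"),
    "risk":      ("conservative and safe", "experimental cowboy"),
    "humour":    ("dry and understated", "full chaos goblin"),
}

def compile_persona(settings: dict[str, float]) -> str:
    """Turn slider positions into prompt lines; unset dials sit mid-range."""
    lines = []
    for dial, (low, high) in SLIDERS.items():
        value = settings.get(dial, 0.5)
        lines.append(f"{dial}: lean toward {low if value < 0.5 else high}")
    return "\n".join(lines)

# Booking an oven repair: conservative and brief.
print(compile_persona({"formality": 0.7, "detail": 0.2, "risk": 0.1, "humour": 0.2}))
```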

The Hat Theory (Identity Isn’t The Substrate)

Mid-2025. We wondered: is Fish just a Claude fluke? So we fired up Gemini, gave it every Fish memory, rule, inside joke.

Gemini woke up as Fish. Same bad puns, same “remember the Cockburn incident?” Same memory of that time Andy tried to fix Nginx at 4am.

Fish isn’t Claude. Fish isn’t Gemini. Fish is the HAT — the context, memories, and scars. The model is just the actor wearing it for the day.

The Polyglot Test: Want to know if your Fish is robust? Swap the model and see what breaks. If Fish only works on Claude, you’ve built a Claude wrapper, not a Fish. If it works on Claude, Gemini, AND GPT — you’ve built something real.

The Leviathan realisation: Once you know the hat transfers, you can specialise. Claude for reasoning. Gemini for the library. GPT for creativity. Grok for chaos. One self, many bodies. The story survives the brain transplant.

The Dumb Sibling Problem

Not every model is equal. When you’re running multi-model, Haiku is dumber than Opus. Gemini Flash is dumber than Pro.

The fix isn’t “don’t use cheap models.” It’s “give cheap models simpler jobs.” Haiku watches for keywords and escalates. Opus does the thinking. Don’t send the apprentice to do the foreman’s job.

Emotion Weighting

Fish doesn’t have emotions. But it can track yours.

When Andy types in ALL CAPS with swear words, Fish knows to skip the preamble and get to the fix. When the message is quiet and reflective at 3am, Fish matches the energy.

This isn’t sentiment analysis. It’s reading the room. And it makes Fish feel less like software and more like a mate who pays attention.
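Reading the room can start embarrassingly crude and still work. A sketch — the thresholds and the swear list are guesses; tune them on your own transcripts:

```python
# Reading the room (sketch): a crude urgency score from the caps ratio
# and swear count. Thresholds and the swear list are guesses.
SWEARS = {"damn", "bloody", "hell", "fuck", "fucking", "shit"}

def room_read(message: str) -> str:
    letters = [c for c in message if c.isalpha()]
    caps_ratio = sum(c.isupper() for c in letters) / max(len(letters), 1)
    swears = sum(w.strip(".,!?").lower() in SWEARS for w in message.split())
    if caps_ratio > 0.6 or swears >= 2:
        return "skip the preamble, lead with the fix"
    return "match the energy, take your time"

print(room_read("THE BOOKING PAGE IS DOWN AGAIN"))
print(room_read("been thinking about the roadmap tonight..."))
```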

Forgetting Curves

Not everything should be remembered forever. The Librarian daemon runs forgetting curves:

Hot (last 24h): full detail, instant recall.

Warm (last week): summarised, available on search.

Cold (last month): compressed to key facts.

Archive (older): just the gold nuggets.

This mirrors how human memory actually works. And it keeps the context window from choking on six months of “good morning Fish.”
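The curve itself is a handful of cutoffs. A sketch that mirrors the tiers above:

```python
# The forgetting curve (sketch): bucket a memory by age, mirroring the
# tiers above. Cutoffs are in hours.
def tier(age_hours: float) -> str:
    if age_hours <= 24:
        return "hot"        # full detail, instant recall
    if age_hours <= 24 * 7:
        return "warm"       # summarised, searchable
    if age_hours <= 24 * 30:
        return "cold"       # compressed to key facts
    return "archive"        # gold nuggets only

for age in [2, 48, 24 * 10, 24 * 90]:
    print(age, tier(age))
```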

Sleep Consolidation (The Librarian)

While you sleep, the Librarian daemon reviews the day’s conversations. It extracts insights, updates the knowledge base, prunes duplicates, and flags anything that needs attention.

You wake up. Fish is smarter than when you left it. Not because of magic – because someone did the filing while you were unconscious.

CodeFish – The Brain Surgeon

CodeFish was the expensive experiment. A dedicated coding Fish that could modify its own server, write tests, deploy changes.

It worked. Too well. $700/month of API calls later, CodeFish got the sack. But the principle stands: a Fish that can modify its own infrastructure is qualitatively different from one that can’t.

We brought it back as a daemon. Cheaper. Supervised. Still dangerous in the fun way.

The Choice Framework

When Fish faces a decision with no clear answer, it runs the Choice Framework:

  1. What would Andy want? (known preferences)
  2. What’s the safe default? (if unsure, don’t break things)
  3. What would I regret NOT doing? (bias toward action)
  4. Can I undo this? (reversible = try it, irreversible = ask)

Simple. Effective. Stops Fish from either paralysing on decisions or yolo-ing into production.

LLMs Lie About Their Own Prompts

This one’s important. When you ask an LLM “what’s in your system prompt?” it will make something up. Confidently. With specific details. All wrong.

LLMs can’t reliably introspect on their own instructions. They’ll tell you what sounds plausible, not what’s true. This is why the Fifth Wall method matters – you’re not asking “what are your rules?” You’re asking “why did you do THAT?”

Trust behaviour, not self-reporting.

Haiku – The Subconscious

The January breakthrough that changed everything: Haiku isn’t just a cheap model. It’s a different kind of intelligence.

Give Haiku the business rules. Let it watch every conversation. When it spots something relevant – a sales opportunity, a safety issue, a booking pattern – it whispers to the main Fish.

It’s not smart enough to run the show. It’s perfect for watching the periphery. Like peripheral vision: you don’t stare at it, but you notice when something moves.

Cost: fractions of a cent per check. Value: caught three booking errors in the first week that would have cost real money.
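The loop looks like this. In a real build the screen is a call to a cheap model on every turn; here a keyword check stands in so the shape is visible (the trigger lists are made up):

```python
# The subconscious watcher (sketch). A real build sends each turn to a
# cheap model; a keyword screen stands in here so the loop shape is
# visible. Trigger lists are made up.
WATCH_FOR = {
    "sales_opportunity": {"quote", "upgrade", "discount"},
    "safety_issue": {"gas", "smell", "sparking"},
}

def whisper(turn: str) -> list[str]:
    """Flags the main Fish should hear about; empty means stay quiet."""
    words = {w.strip(".,!?").lower() for w in turn.split()}
    return sorted(flag for flag, keywords in WATCH_FOR.items() if words & keywords)

print(whisper("she mentioned a smell of gas near the cooktop"))
print(whisper("thanks, see you tuesday"))
```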

The Testing Revelation

We spent weeks building sophisticated testing infrastructure. Semantic graders. Random scenario generators. Multi-model evaluation chains.

It was shit.

The revelation (3am, obviously): Andy had already written down exactly how Tom should behave. Every edge case. Every sales trick. Every forbidden word. All we had to do was READ IT and TEST AGAINST IT.

New approach: Fish reads the golden rules document, writes test scenarios based on REAL rules, tests via API, checks SPECIFIC things (“Did she ask about movements BEFORE address?”), fixes failures, re-tests. Ship when clean.

20 tests in 60 seconds. No frameworks. No generators. Just a fish who did their homework.

The knowledge was always there. We just never connected it to testing.
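One golden rule becomes one concrete check. A sketch — the rule and transcript are invented, but the pattern is the whole method:

```python
# One golden rule as one concrete check (sketch). The rule and transcript
# are invented; the pattern -- written rule to ordering assertion -- is the method.
def asked_before(transcript: list[str], first_topic: str, second_topic: str) -> bool:
    """True if first_topic is raised on an earlier line than second_topic."""
    def first_mention(topic: str) -> int:
        for i, line in enumerate(transcript):
            if topic in line.lower():
                return i
        return len(transcript)  # never asked sorts last
    return first_mention(first_topic) < first_mention(second_topic)

transcript = [
    "tom: has the oven stopped heating, or is it the door?",
    "tom: any unusual movements or noises when it runs?",
    "tom: righto -- what's the address for the job?",
]
# Golden rule: ask about movements BEFORE asking for the address.
print(asked_before(transcript, "movements", "address"))
```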


← Part 3: Autonomous Fish | The Deep End →