Let me tell you something that will sound wrong the first time you hear it.
The engineers shipping AI 100x faster than their peers are not using better models. They're not on some secret waitlist for GPT-5. They're not running proprietary fine-tunes. They are, in most cases, using the exact same Claude or GPT-4o that everyone else has access to.
The difference is architecture. And it fits on an index card.
I've been building AI products since before most people knew what a transformer was. I've shipped production AI at Google, Microsoft, and Sutherland. I've built Thoth AI — a production-grade OpenClaw alternative — in weeks while agencies were still writing proposals. And the single most important thing I've learned is this: the model is not the product. The harness is the product.
The Leak That Confirmed Everything
On March 31, 2026, Anthropic accidentally pushed the entire source code for Claude Code to the npm registry. 512,000 lines. I read it the same day it went up. And what I found confirmed everything I had been building toward.
The secret wasn't in the model weights. It was in the wrapper. Live repo context. Prompt caching. Purpose-built tools. Context bloat minimization. Structured session memory. Parallel sub-agents. None of that makes the model smarter in isolation. All of it gives the model the right context, at the right time, without drowning it in noise.
That wrapper has a name. I call it the harness. And the principle that separates the 2x builders from the 100x builders is deceptively simple: thin harness, fat skills.
What "Fat Skills" Actually Means
A skill file is a reusable markdown document that teaches the model how to do something — not what to do. The user supplies the goal. The skill supplies the process.
Here's the insight most people miss: a skill file works like a method call. It takes parameters. You invoke it with different arguments and get radically different capabilities from the same procedure.
I have a skill called /investigate. Six steps: scope the dataset, build a timeline, diarize every document, synthesize, argue both sides, cite sources. Three parameters: TARGET, QUESTION, DATASET.
Point it at a safety scientist and 2.1 million discovery emails, and you have a medical research analyst determining whether a whistleblower was silenced. Point it at a shell company and FEC filings, and you have a forensic investigator tracing coordinated campaign donations.
Same skill. Same six steps. Same markdown file. The skill describes a process of judgment. The invocation supplies the world.
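To make that concrete, here is what such a skill file might look like. This is a hypothetical sketch in the same spirit as /investigate, not the actual file — the headings, parameter names, and wording are illustrative:

```markdown
# /investigate — deep-dive investigation skill

## Parameters
- TARGET: the person, company, or entity under investigation
- QUESTION: the question the investigation must answer
- DATASET: the corpus of documents to work from

## Process
1. Scope DATASET: what is in it, what is missing, what is reliable.
2. Build a timeline of every dated event touching TARGET.
3. Diarize every document: who wrote it, to whom, when, in what context.
4. Synthesize the evidence into candidate answers to QUESTION.
5. Argue both sides: the strongest case for and against each answer.
6. Cite sources: every claim must link back to a document in DATASET.
```

Notice that nothing in the file names a domain. The process is fixed; the invocation supplies TARGET, QUESTION, and DATASET, exactly like arguments to a function.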
This is not prompt engineering. This is software design, using markdown as the programming language and human judgment as the runtime.
The Anti-Pattern Killing Your AI Projects
I've consulted with dozens of companies trying to build AI products. The most common failure mode is always the same: a fat harness with thin skills.
You've seen it. Forty-plus tool definitions eating half the context window. God-tools with two-to-five-second MCP round-trips. REST API wrappers that turn every endpoint into a separate tool. Three times the tokens, three times the latency, three times the failure rate.
My own CLAUDE.md file was 20,000 lines at one point. Every quirk, every pattern, every lesson I'd ever encountered. Completely ridiculous. The model's attention degraded. Claude Code literally told me to cut it back. The fix was about 200 lines — just pointers to documents. A resolver loads the right one when it matters. Twenty thousand lines of knowledge, accessible on demand, without polluting the context window.
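A resolver like that can be genuinely tiny. Here is a minimal sketch, assuming the knowledge docs live under a `docs/` directory and the index maps topic keywords to file paths — the topic names and paths are illustrative, not my actual setup:

```python
from pathlib import Path

# Hypothetical index: a couple hundred lines of pointers
# instead of 20,000 lines of inlined content.
INDEX = {
    "deploy": "docs/deployment-lessons.md",
    "testing": "docs/testing-patterns.md",
    "prompts": "docs/prompt-quirks.md",
}

def resolve(topic: str, root: Path = Path(".")) -> str:
    """Load the one document that matters for this topic, on demand."""
    try:
        return (root / INDEX[topic]).read_text()
    except KeyError:
        raise KeyError(f"no knowledge doc indexed for topic: {topic!r}")
```

The context window only ever pays for the index plus whatever single document the current task actually needs.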
The Architecture That Compounds
These ideas compose into a three-layer system.
Fat skills on top. Markdown procedures that encode judgment, process, and domain knowledge. This is where 90% of the value lives. Every improvement to the underlying model automatically improves every skill.
A thin CLI harness in the middle. About 200 lines of code. JSON in, text out. Read-only by default. It does four things: runs the model in a loop, reads and writes files, manages context, and enforces safety. That's it.
Your deterministic application on the bottom. QueryDB, ReadDoc, Search, Timeline — fast, narrow, reliable. Same input, same output, every time.
The principle is directional. Push intelligence up into skills. Push execution down into deterministic tooling. Keep the harness thin.
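The middle layer is small enough to sketch in full. This is a toy version under stated assumptions: the model is a pluggable callable (real model APIs differ), the tool names echo the article's examples, and the write-gating is deliberately simplistic. It shows the four jobs — loop, file access, context management, safety — and nothing else:

```python
import json
from typing import Callable

# Deterministic tools on the bottom: same input, same output, every time.
# Stand-ins for QueryDB / ReadDoc / Search / Timeline.
def read_doc(args: dict) -> str:
    with open(args["path"], "r") as f:
        return f.read()

TOOLS: dict[str, Callable[[dict], str]] = {"ReadDoc": read_doc}
WRITE_TOOLS = {"WriteDoc"}  # hypothetical; blocked while read-only
READ_ONLY = True            # safety: read-only by default

def run(model: Callable[[list[dict]], dict], goal: str, max_turns: int = 10) -> str:
    """Thin harness: run the model in a loop. JSON in, text out.

    `model` takes the message history and returns either
    {"tool": name, "args": {...}} or {"answer": text}.
    """
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        step = model(messages)
        if "answer" in step:
            return step["answer"]
        name, args = step["tool"], step["args"]
        if name not in TOOLS:
            result = f"error: unknown tool {name}"
        elif READ_ONLY and name in WRITE_TOOLS:
            result = f"error: {name} blocked in read-only mode"
        else:
            result = TOOLS[name](args)
        # Manage context: feed the tool result back into the history.
        messages.append({"role": "tool", "content": json.dumps({name: result})})
    return "error: turn limit reached"
```

Everything smart lives in the skill the model was invoked with; everything fast and repeatable lives in the tools; the harness just shuttles JSON between them and says no to dangerous calls.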
The Instruction That Got 2,500 Bookmarks
I tweeted something recently that resonated more than I expected:
"You are not allowed to do one-off work. If I ask you to do something and it's the kind of thing that will need to happen again, you must: do it manually the first time on 3 to 10 items. Show me the output. If I approve, codify it into a skill file. If it should run automatically, put it on a cron. The test: if I have to ask you for something twice, you failed."
People thought it was a prompt engineering trick. It's not. It's the architecture. Every skill you write is a permanent upgrade to your system. It never degrades. It never forgets. It runs at 3 AM while you sleep. And when the next model drops, every skill instantly gets better.
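The "put it on a cron" part is meant literally. Once a procedure is codified as a skill file, scheduling it is one crontab line — the CLI name, flags, and paths below are hypothetical, just to show the shape:

```
# Run the codified skill nightly at 3 AM; tool name and paths are illustrative.
0 3 * * * harness run skills/nightly-report.md >> logs/nightly-report.log 2>&1
```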
That's how you get to 100x. Not a smarter model. Fat skills, thin harness, and the discipline to codify everything.
The system compounds. Build it once. It runs forever.
About the Author
Ajay Jetty
Founder & CEO of Jetty AI. Serial founder, AI operator, and published researcher (CTMA). Formerly Google, Microsoft, Sutherland. Building production AI that ships in weeks, not quarters.
jettyai.cloud