Most engineering teams have the same LLM workflow: generate code, eyeball it in review, ship it, and hope. Code review and vibes. It works until it doesn't — and it stops working faster than anyone expects.
We took a different path. 107 custom ESLint rules, ~50,000 lines of rule logic, organized across 28 domain-specific config modules. Yes, maintaining 50,000 lines of lint rules is expensive. But each rule replaces countless future code review comments, onboarding corrections, and 2 AM production incidents. The rules pay for themselves within weeks.
The Death of DX (As We Know It)
We have a sharper thesis: traditional Developer Experience (DX) is dying.
The standard DX playbook has always been about "Look how easy it is to build this feature!" Component libraries, utility functions, boilerplate generators, and low-code platforms were all designed to make it easier and faster for a human to write code. The entire frontend ecosystem spent a decade optimizing for fewer keystrokes.
But if an LLM is writing the code, the question isn't "how easy is it to write?" but "how easy is it to verify?"
It doesn't matter if the code is painful to write, if the entity writing it cannot experience pain.
These rules are the new DX. A human would struggle to write even a basic feature without violating 5-10 of them. An LLM does too — but the rules give it a clear path to fix its mistakes and converge on correct code.
Rules as Architecture
Everyone who has ever used an LLM to generate code has experienced the constant nudging.
- "No, don't use a raw Error class"
- "Use our pre-existing API client instead of fetch()"
- "Don't validate inside the route handler, define a proper schema at the boundary"
The usual approach is just to slap another comment in the AGENTS.md doc. With 107 rules (and growing every week), that doc would be an unmaintainable mess. Instead, we write a lint rule for each invariant. Every time we have to tell the agent something, or run into a production bug, we stop and think "Could this be enforced statically?"
Enforcing Codebase Conventions
Consider a simple example. In our codebase, throwing new Error() is banned:
// ❌ This triggers no-new-error
throw new Error('Something went wrong');
// ✅ Must use typed errors with proper context
throw new Validation.Error('Invalid exam code', { examCode });
TypeScript won't flag this. ESLint's built-in rules won't either. But an untyped Error means our error tracking can't categorize it, our API handlers can't distinguish user errors from server errors, and our error boundaries can't display meaningful messages. An LLM will write new Error() every single time unless something stops it.
Old DX says: "make error handling easy." New DX says: "make wrong error handling unrepresentable."
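A rule like no-new-error could look roughly like the sketch below. This is not our actual implementation: the small AstNode, RuleContext, and RuleModule types are local stand-ins for the types ESLint and ESTree provide, defined here only so the example is self-contained, and the message text is made up. The rule shape itself (a meta object plus a create function that returns AST visitors and calls context.report) is how ESLint custom rules are structured.

```typescript
// Local stand-ins for ESLint's rule types (assumption: a real rule would
// import the proper types from "eslint" and "estree" instead).
type AstNode = {
  type: string;
  callee?: { type: string; name?: string };
};

type RuleContext = {
  report(descriptor: { node: AstNode; messageId: string }): void;
};

type RuleModule = {
  meta: { type: string; messages: Record<string, string>; schema: unknown[] };
  create(context: RuleContext): Record<string, (node: AstNode) => void>;
};

// Hypothetical sketch of a rule in the spirit of no-new-error.
const noNewError: RuleModule = {
  meta: {
    type: "problem",
    messages: {
      noNewError:
        "Do not use the bare Error class; throw a typed error with context instead.",
    },
    schema: [],
  },
  create(context) {
    // Report when the callee is the plain identifier `Error`.
    // `new Validation.Error(...)` has a MemberExpression callee, so it passes.
    const check = (node: AstNode): void => {
      if (node.callee?.type === "Identifier" && node.callee.name === "Error") {
        context.report({ node, messageId: "noNewError" });
      }
    };
    // Cover both `new Error(...)` and `Error(...)` called without `new`.
    return { NewExpression: check, CallExpression: check };
  },
};
```

The whole rule is one visitor and one comparison, which is typical: most convention rules are small, and the cost sits in deciding what to enforce rather than in the enforcement code.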
A few more from this category:
no-fetch-in-server: LLMs constantly generate server handlers that call back into the same server via fetch() to reuse logic. It works locally, breaks in production, and makes tracing impossible:
// ❌ Triggers no-fetch-in-server
async function createExam(input: ExamInput) {
  const template = await fetch('http://localhost:3000/api/templates/default');
  // ...
}
// ✅ Call the function directly
async function createExam(input: ExamInput) {
  const template = await getDefaultTemplate();
  // ...
}
no-direct-ai-sdk-imports: Only one file in the entire codebase may import generateObject or generateText from the AI SDK. Everything else goes through a guarded wrapper that handles token limits, rate limiting, and audit logging:
// ❌ Triggers no-direct-ai-sdk-imports
import { generateObject } from 'ai';
// ✅ Use the guarded wrapper
import { generateObject } from '$lib/ai/client';
Without this rule, every LLM-generated feature imports the SDK directly, because that is the most common pattern in their training data.
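For teams that want this behavior before writing a custom rule, ESLint's built-in no-restricted-imports rule can approximate it. The flat-config fragment below is a sketch of that alternative, not our own rule. The file globs are assumptions, including the guess that $lib resolves to src/lib.

```typescript
// Fragment for an ESLint flat config file. Paths are hypothetical.
const aiSdkImportGuard = [
  {
    files: ["src/**/*.ts"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          paths: [
            {
              name: "ai",
              importNames: ["generateObject", "generateText"],
              message:
                "Import from '$lib/ai/client' so token limits, rate limiting, and audit logging apply.",
            },
          ],
        },
      ],
    },
  },
  {
    // The wrapper itself is the one file allowed to import the SDK directly.
    // Later entries override earlier ones for the files they match.
    files: ["src/lib/ai/client.ts"],
    rules: { "no-restricted-imports": "off" },
  },
];
```

A built-in rule like this covers the import ban itself; a custom rule becomes worthwhile once the check needs codebase-specific logic or a more targeted error message.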
require-api-client: All API calls from client code must go through typed API clients, with no raw fetch('/api/...'). This ensures end-to-end type safety and prevents URL string drift.
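To make "typed API client" concrete, here is a minimal sketch of one possible shape. Everything in it is hypothetical: the createApiClient helper, the ApiError class, the Template shape, and the idea of pairing each route with a parser are illustrations, not our production client. The fetch function is injected so the sketch can run without a network.

```typescript
// Hypothetical typed error; a real codebase would use its own error hierarchy.
class ApiError extends Error {
  readonly status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

type Template = { id: string; title: string };

// Validate at the boundary instead of casting the response.
function parseTemplate(raw: unknown): Template {
  if (typeof raw === "object" && raw !== null && "id" in raw && "title" in raw) {
    const { id, title } = raw;
    if (typeof id === "string" && typeof title === "string") {
      return { id, title };
    }
  }
  throw new ApiError("Malformed template response", 502);
}

type Parser<T> = (raw: unknown) => T;
type RouteParsers<R> = { [P in keyof R]: Parser<R[P]> };
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Only paths present in `parsers` are accepted by `get`, and each path's
// return type comes from its parser, so a typo in a URL is a compile error.
function createApiClient<R>(parsers: RouteParsers<R>, fetchImpl: FetchLike) {
  return {
    async get<P extends keyof R & string>(path: P): Promise<R[P]> {
      const response = await fetchImpl(path);
      if (!response.ok) {
        throw new ApiError(`GET ${path} failed`, response.status);
      }
      return parsers[path](await response.json());
    },
  };
}

// Usage (assumed route name):
//   const api = createApiClient({ "/api/templates/default": parseTemplate }, fetch);
//   const template = await api.get("/api/templates/default"); // typed as Template
```

Because the route map is the single source of URL strings, renaming an endpoint becomes a type error at every call site instead of a runtime 404.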
Supercharging the TypeScript Compiler
TypeScript was built with human judgement in mind. It ships with escape hatches — any, as Type, @ts-ignore — that experienced developers use sparingly and for good reasons.
LLMs have no such judgement. They will use any or @ts-ignore at the drop of a hat. So we ban every escape hatch entirely:
// ❌ All of these are lint errors in our codebase
const data: any = response.body;
const id = input as string;
const code = input as unknown as ExamCode;
// @ts-ignore
brokenFunction(wrongArgs);
If the types don't fit, you fix the types rather than silence the compiler. That means more friction when writing, but code that is dramatically easier to review and maintain.
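One way to "fix the types" for the ExamCode cast above is to narrow with a type guard, so the check actually runs at runtime. This is a sketch under assumptions: the branded ExamCode definition and the code format (three uppercase letters, a dash, four digits) are invented for illustration.

```typescript
// Branded type: a string that has passed validation.
// The brand exists only at the type level (hypothetical definition).
type ExamCode = string & { readonly __brand: "ExamCode" };

// Assumed format for illustration only, e.g. "MAT-2024".
const EXAM_CODE_PATTERN = /^[A-Z]{3}-\d{4}$/;

// Type guard: the compiler narrows `input` to ExamCode only where this
// returned true, so the runtime check and the static type stay in sync.
function isExamCode(input: unknown): input is ExamCode {
  return typeof input === "string" && EXAM_CODE_PATTERN.test(input);
}

function describeExam(input: unknown): string {
  if (!isExamCode(input)) {
    return "invalid exam code";
  }
  // `input` is ExamCode here, with no `as` and no double cast.
  return `exam ${input}`;
}
```

A type predicate is still a promise to the compiler, but it is one small, reviewable function rather than a cast scattered across every call site.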
In Practice
In practice, most projects are messy, poorly structured, and all over the place. Believe us, we of all people would know.
Our first 6 months looked like this:
- No typing at all (yes, really)
- Python backend, in a separate repo, with no documented API contracts
- CI wasn't a thing
Today, we're at a 100% TypeScript monorepo with 107 custom lint rules across 28 domain-specific configs, and a CI pipeline that gates every PR on types, lint, formatting, unit/integration/e2e tests, a11y, and dependency scanning.
This is obviously a massive oversimplification — each step here could be its own post. But the rough migration path:
- Migrate everything to TS, maximum strictness. This is the foundation
- Begin with high-ROI rules that prevent TS escape hatches or basic convention violations
- Build CI to fail on type and lint errors
- Start each rule as a warning, promote to error when violations reach 0
- Track all warnings, make sure no PR can increase the number of warnings
The last three steps (CI gating, warn-then-promote, and the warning ratchet) are what start the flywheel. LLMs can fix essentially all lint warnings on their own, and even decide which ones are appropriate to fix first.
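The "no PR can increase the number of warnings" check can be sketched as a small comparison between a stored baseline and the current lint run. The function below is an illustration, not our CI code: the names are hypothetical, and comparing counts per rule (rather than one overall total) is an assumption. Rules missing from the baseline are treated as newly introduced and do not count as regressions.

```typescript
// Warning counts keyed by rule id, e.g. { "no-new-error": 12 }.
type WarningCounts = Record<string, number>;

type RatchetResult = {
  ok: boolean;
  regressions: string[]; // rules whose warning count went up
  promotable: string[]; // rules at zero warnings, ready to become errors
};

function checkRatchet(baseline: WarningCounts, current: WarningCounts): RatchetResult {
  const regressions: string[] = [];
  const promotable: string[] = [];

  for (const ruleId of Object.keys({ ...baseline, ...current })) {
    // A rule absent from the current run reported no warnings.
    const after = current[ruleId] ?? 0;
    // A rule absent from the baseline is new, so its first count is accepted.
    const before = baseline[ruleId] ?? after;

    if (after > before) {
      regressions.push(`${ruleId}: ${before} -> ${after}`);
    }
    if (after === 0) {
      promotable.push(ruleId);
    }
  }

  return { ok: regressions.length === 0, regressions, promotable };
}
```

In CI, a failing result would block the PR, while the promotable list shows which warnings are ready to be switched to errors.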
To give this teeth internally, we built a scoreboard that tracks lint warning fixes per PR. It's gamified — our top contributor is currently ranked "Grandmaster 1". Silly, but it works: every PR moves the codebase forward, not just the feature.
Like everything else, it's an iterative process. Not every rule survives. Some turned out to be too noisy, some conflicted with other rules, some enforced patterns that we later outgrew. We've deleted or rewritten dozens of rules over the past year. The 107 that exist today aren't the 107 we started with — they're the ones that earned their place.
Conclusion
Every rule is an entire class of "Don't do this" messages we never have to write again.
The results? In the past year we more than tripled output per engineer. We'll share the exact numbers in a future post.
The old DX asked: "how few lines does it take to build this?" The new DX asks: "how many ways can you build it wrong?"
107 rules is our answer. And we're adding more every week.
