Object / Identity: Maciej Trzciński
Post.Metadata / 2026.06.18 · 11 MIN READ

Scale Is a Count of Readers, Not Rows.

A data model becomes a product the moment something you can't redeploy in the same commit depends on its shape. The variable isn't rows — it's how many readers are outside your blast radius.

Key Takeaways // TL;DR
  • The scale that turns a schema into a product is a count of readers you can't change in the same commit — not a count of rows.
  • One external, uncontrolled reader is already a contract; ten in-repo readers that ship together cost almost nothing. The reader count is a proxy; the boundary is the variable.
  • Readers don't share your processing code — they share the shape. So the meaning (typed, single-purpose, validated at the source) is the real surface.
  • The agent is the last reader to arrive, not a new kind of problem — making content AI-ready is the same discipline a second uncontrolled reader already demanded.

You can rewrite a render function on a Friday afternoon and nobody will notice. The tests pass, the page looks the same, the one place that called it now calls the new version. Try the same move on the shape of the data underneath it — rename a field, narrow a type, change what null is allowed to mean — and you find out, surface by surface, how many things were quietly depending on it.

"Your schema is the product" is a true sentence said by everyone, which makes it useless until you say why. It is not the product because it is important, or because data outlives code, or any of the other things that sound wise on a slide. It is the product when changing it is the expensive thing — and that has almost nothing to do with how many rows it holds. A table with a hundred million rows and exactly one reader is cheap to reshape: you change the writer and the reader in the same commit and ship. The cost lives somewhere else.

The cost lives in the readers you can't change in that same commit. The web UI you own renders the model; fine, that ships with the schema. But the public JSON-LD feed search engines read, the sibling page type another team maintains, the partner export someone integrated against a year ago, the agent now generating against your content: those are readers outside your blast radius. You can't pull them aside and say "just special-case it in the template." When a model picks up a reader like that, its shape stops being an implementation detail and becomes a contract. That is the whole thesis: scale, in the sense that makes a schema the product, is a count of readers you don't control, not rows.

A note on lineage before the argument. This is a companion to Authors Are Not a Field, which is about a different question — where the seams go inside one model, which concerns earn their own document. This post is one level up: the model itself as a surface that several independent things read. Distinct claim, referenced, not repeated.

With a single reader you own, a model is free to change because nothing is coupled to its shape that you can't edit in the same breath. The render code is the only thing that reads it; you change the shape and the reader together, atomically, and the diff is one PR. At that point the model genuinely is an implementation detail — calling it a product surface would be premature ceremony.

The phase change isn't a number of readers. It's the first reader you can't reach in that one commit. A public JSON-LD feed with exactly one consumer is already a contract — not because there are two of it, but because the consumer is external and uncontrolled. Symmetrically, ten in-repo call-sites that all live in one codebase and ship together cost almost nothing: a rename is a single PR that touches all ten. The reader count is a useful proxy because readers tend to spread across boundaries as a system grows — but the load-bearing variable is the boundary, not the tally. Count the readers; then ask which ones you can't redeploy in the same commit. Those are the ones that turned your shape into a surface.
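The "count, then ask" step can be sketched in a few lines. This is a minimal illustration, not tooling from this blog: the Reader type, the shipsWithSchema flag, and the inventory entries are all hypothetical names chosen for the example.

```typescript
// Hypothetical inventory of readers; the names and the flag are illustrative only.
type Reader = {
  name: string;
  // True when the reader lives in the same repo and deploys with the schema.
  shipsWithSchema: boolean;
};

const readers: Reader[] = [
  { name: "web UI", shipsWithSchema: true },
  { name: "exhibits index", shipsWithSchema: true },
  { name: "table of contents", shipsWithSchema: true },
  { name: "public JSON-LD feed", shipsWithSchema: false },
];

// The tally is only a proxy; this filter picks out the boundary crossers.
function uncontrolledReaders(all: Reader[]): Reader[] {
  return all.filter((reader) => !reader.shipsWithSchema);
}

// A shape is a contract as soon as one reader is out of reach,
// no matter how many in-repo readers sit next to it.
function isContract(all: Reader[]): boolean {
  return uncontrolledReaders(all).length > 0;
}
```

The design choice worth noticing is that isContract never looks at the length of the full list. Ten in-repo readers return false; one external reader returns true.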

Here is where it bites in practice: a field shaped for the one surface that consumes it. Presentation baked into the data — a pre-rendered heroTitleHtml, a displayDate string formatted "June 18, 2026" for the web. Convenient, exactly once. The same field shaped as meaning — a plain title and a publishedAt ISO instant — is one every reader can re-present its own way. The first shape is a favor to the renderer that becomes a bug for everyone else.

schema/post.shape.ts
TypeScript
// Shaped for exactly one surface: the web renderer.
// Presentation is baked into the data.
type PostForTheWeb = {
  heroTitleHtml: string;   // "<h1>Scale Is a Count…</h1>"
  displayDate: string;     // "June 18, 2026" (already formatted, en-US)
};

// Shaped as meaning. Every reader re-presents it:
// the web renders it, JSON-LD serializes it, the agent reasons over it.
type Post = {
  title: string;           // the fact, no markup
  publishedAt: string;     // ISO 8601 instant: "2026-06-18T09:00:00Z"
};
Listing 01 — convenience-for-one becomes a bug-for-many

Every consumer re-implements its own processing. The web renders the field, search serializes it into JSON-LD, the agent reasons over it, the export flattens it to CSV. None of them runs your code. What they all trust is the meaning the shape encodes: what values a field may hold, what its absence signifies. That is why an ambiguous shape is the expensive kind. status: string says anything goes, so every reader has to guess and defend against the guess; date: string leaves every reader asking the same unanswered questions: formatted or ISO, whose timezone, is empty a real value or a missing one?
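To make "none of them runs your code" concrete, here is a sketch of three readers consuming the meaning-shaped Post. The function names are hypothetical, and mapping the post onto schema.org's BlogPosting with headline and datePublished is my assumption about how such a feed might look, not a description of a feed this blog ships.

```typescript
type Post = {
  title: string;       // the fact, no markup
  publishedAt: string; // ISO 8601 instant
};

// Reader 1: a web renderer chooses its own locale and date format.
function renderDate(post: Post, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).format(new Date(post.publishedAt));
}

// Reader 2: a JSON-LD serializer passes the instant through untouched.
function toJsonLd(post: Post): Record<string, string> {
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: post.title,
    datePublished: post.publishedAt,
  };
}

// Reader 3: a CSV export quotes the title and doubles any embedded quotes.
function toCsvRow(post: Post): string {
  const quotedTitle = `"${post.title.replace(/"/g, '""')}"`;
  return `${quotedTitle},${post.publishedAt}`;
}
```

None of these three would work from heroTitleHtml and displayDate without first parsing the presentation back out, which is the bug-for-everyone-else in miniature.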

Make the shape carry the meaning and the guessing stops. A union of string literals, the building block of a discriminated union, pins status to the three values it is actually allowed to be[2]; an ISO instant pins the date to one unambiguous reading. The point isn't "never use a loose type"; it's that the contract a reader depends on is the shape, so the shape is where the meaning has to live, enforceably, validated at the source rather than re-litigated at each reader.[1]

schema/status.ts
TypeScript
// Ambiguous: every reader guesses, and the guesses drift apart.
type Loose = {
  status: string;   // "draft"? "DRAFT"? "published"? "live"? anything.
  date: string;     // formatted or ISO? whose timezone? is "" valid?
};

// The shape IS the contract: validated at the source, not at each reader.
type Tight = {
  status: "draft" | "published" | "archived";
  publishedAt: string | null;   // ISO 8601 instant; null only when status === "draft"
};
Listing 02 — the typed shape is the shared assumption, made enforceable
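Listing 02 leaves one rule in a comment: null only when status is "draft". A sketch of what "validated at the source" could look like moves that rule into the type and checks it once, on the way in. TightPost, parseTightPost, requireInstant, and the regular expression are hypothetical names and choices for this illustration, and the check assumes instants are always written in UTC with a trailing Z.

```typescript
// Hypothetical tightening: the null rule becomes part of the type.
type TightPost =
  | { status: "draft"; publishedAt: null }
  | { status: "published" | "archived"; publishedAt: string };

// Assumption: instants are UTC, e.g. "2026-06-18T09:00:00Z".
const ISO_INSTANT = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// Returns the value as a string or throws; rejects well-formed but impossible dates.
function requireInstant(value: unknown): string {
  if (
    typeof value !== "string" ||
    !ISO_INSTANT.test(value) ||
    Number.isNaN(Date.parse(value))
  ) {
    throw new Error("publishedAt must be an ISO 8601 UTC instant");
  }
  return value;
}

// Validate once, at the source, so no reader has to guess.
function parseTightPost(input: unknown): TightPost {
  if (typeof input !== "object" || input === null) {
    throw new Error("post must be an object");
  }
  const { status, publishedAt } = input as Record<string, unknown>;
  if (status === "draft") {
    if (publishedAt !== null) {
      throw new Error("a draft must have publishedAt: null");
    }
    return { status: "draft", publishedAt: null };
  }
  if (status === "published") {
    return { status: "published", publishedAt: requireInstant(publishedAt) };
  }
  if (status === "archived") {
    return { status: "archived", publishedAt: requireInstant(publishedAt) };
  }
  throw new Error(`unknown status: ${String(status)}`);
}
```

With this shape, a reader that has checked status never needs a null check on publishedAt for a published post, because the type already rules it out.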
§ 03 / The Fan-Out

The readers, made legible.

The abstraction earns nothing without the readers on the table, so here they are. This blog is the worked example. The body of every post is a content: ContentBlock[] field, an array of blocks whose element type is a discriminated union keyed on type. Three independent readers walk that same union and each does something different with it. Change a block's shape and all three have to agree.

| Reader | What it does with the shape | What breaks if you change it |
| --- | --- | --- |
| Render switch (ArticleContent.tsx) | Switches on block.type to map each block to a view | A renamed type falls through to no case; the block vanishes |
| Exhibits index (ExhibitsView.tsx) | Filters to code / graph-linear / table, auto-numbers them | Drop or rename one of those types and the figure index miscounts |
| Table of contents (getHeadings, utils.ts) | Filters to heading blocks, derives anchor ids for the TOC | Change heading's shape and the TOC and its deep links break |

Table 01: Three readers of one discriminated union.
[Graph 01: CONTENT.BLOCK[] fans out to RENDER.SWITCH, EXHIBITS.INDEX, and TOC.HEADINGS]
Graph 01: one shape, three readers, each re-implementing its own pass.

That is the fan-out, live — and it's also the cheap case, which is the honest part. All three readers live in one repo and ship in one deploy. If I narrow HeadingBlock, the compiler lights up all three call-sites in the same edit and I fix them in one PR. The shape is shared, but it is not a surface yet, because no reader is outside my blast radius. It costs almost nothing precisely because I control every reader.
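The "compiler lights up the call-sites" claim depends on the readers being written so the compiler can see them. A minimal sketch, assuming simplified block shapes: the three-member union, the field names, and the assertNever helper here are hypothetical stand-ins, not the real contents of ArticleContent.tsx or utils.ts.

```typescript
// Hypothetical, simplified block shapes; the real union has more members.
type HeadingBlock = { type: "heading"; level: 2 | 3; text: string };
type CodeBlock = { type: "code"; language: string; source: string };
type ParagraphBlock = { type: "paragraph"; text: string };
type ContentBlock = HeadingBlock | CodeBlock | ParagraphBlock;

// Only type-checks while every member of the union is handled above it.
function assertNever(value: never): never {
  throw new Error(`unhandled block: ${JSON.stringify(value)}`);
}

// Reader 1: the render switch (HTML escaping omitted for brevity).
function renderBlock(block: ContentBlock): string {
  switch (block.type) {
    case "heading":
      return `<h${block.level}>${block.text}</h${block.level}>`;
    case "code":
      return `<pre>${block.source}</pre>`;
    case "paragraph":
      return `<p>${block.text}</p>`;
    default:
      // Add or rename a block type and this line stops compiling.
      return assertNever(block);
  }
}

// Reader 2: the table of contents narrows to heading blocks with a type guard.
function getHeadings(content: ContentBlock[]): HeadingBlock[] {
  return content.filter(
    (block): block is HeadingBlock => block.type === "heading",
  );
}
```

Without the default branch, a renamed type is exactly the silent fall-through from Table 01. With it, the rename becomes a compile error in the same edit, which is only possible because the reader sits in the same repo.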

The cost only arrives when a reader sits on the other side of a boundary I can't cross in one commit. The day this blog publishes a JSON-LD feed for the same BlogPost model, that feed is a reader search engines read on their own schedule — I can't redeploy it with my Friday change, and I can't tell it to special-case the field I just renamed. The mechanics of the fan-out are the same; what changes is that one of the readers is now uncontrolled. As a general matter, that is where a shape made for the web first has to bend for a second reader that wants it differently — and the moment it bends, the model is the surface, not the code.

Once a reader you don't control depends on the shape, API discipline isn't optional, it's just the truth of the situation. Additive changes are safe — a new optional field breaks nobody. Renaming, retyping, or removing a field is a breaking change to every reader, whether or not you ran a migration. And "we'll fix the consumers later" is the sentence that calcifies a stack: for the consumers you control, later is a chore; for the ones you don't, it's never.

| Change | Readers you control | A reader you don't |
| --- | --- | --- |
| Add an optional field | Safe: one PR | Safe: additive, nobody breaks |
| Rename a field | Safe: rename the call-sites | Breaks every reader that knew the old name |
| Loosen a type to escape a constraint | Convenient today | Silently un-contracts the model; no reader can trust it |
| Change what null means | Local fix, type-checked | Semantic break; nothing type-checks across the boundary |

Table 02: The same change, two regimes.
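The first two rows of the table can be replayed against a stand-in for an external consumer. This is a sketch under stated assumptions: externalReader, survives, the summary field, and the publishedOn rename are all invented for the illustration, and the reader is deliberately naive in the way integrations written a year ago tend to be.

```typescript
// Hypothetical external reader, written against the first published payload.
// It lives outside the repo, so it never sees later type changes.
type PublishedPost = { title: string; publishedAt: string };

function externalReader(json: string): string {
  // The cast is a compile-time promise only; nothing is checked at runtime.
  const post = JSON.parse(json) as PublishedPost;
  return `${post.title} (${post.publishedAt.slice(0, 10)})`;
}

// True when the reader gets through the payload without throwing.
function survives(json: string): boolean {
  try {
    externalReader(json);
    return true;
  } catch {
    return false;
  }
}

const original = JSON.stringify({
  title: "Scale",
  publishedAt: "2026-06-18T09:00:00Z",
});

// Additive change: a new optional field the old reader has never heard of.
const additive = JSON.stringify({
  title: "Scale",
  publishedAt: "2026-06-18T09:00:00Z",
  summary: "Readers, not rows.",
});

// Rename: publishedAt becomes publishedOn. The writer compiles; the reader breaks.
const renamed = JSON.stringify({
  title: "Scale",
  publishedOn: "2026-06-18T09:00:00Z",
});
```

Under these assumptions, survives(original) and survives(additive) are true, while survives(renamed) is false because the reader calls slice on a field that is no longer there. Nothing on the writer's side fails, which is why the break goes unnoticed until the consumer reports it.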

The rule cuts both ways, or it's dogma. A genuinely single-reader model you fully control — an internal scratch type, a one-surface prototype, a throwaway import script — should stay cheap and changeable. Treating it as a public contract is the same over-engineering one level up: paying the versioning tax for readers that will never arrive. The cost is paid when an uncontrolled reader shows up, so don't pre-pay it. The discipline is knowing which model is which — and being willing to upgrade a scratch type to a surface the day its first external reader appears, instead of pretending that day won't come.

Which is where the agent comes in, and it comes in last on purpose. The AI-ready content conversation treats the agent as a new category of problem. It isn't. It's one more reader of a contract that already had several — and like the JSON-LD feed and the partner export, it's a reader you can't pull aside and tell to special-case the edge in the template. It generates against the shape you published. A typed, single-meaning, validated model is a thing it can resolve and produce against; a presentation-baked string blob is a thing it has to guess at, and guess wrong. The work that makes content survive an agent is the same work that made it survive the second human surface — typing the shape, fixing the meaning, validating at the source. The agent didn't create the coupling. It's just the reader that finally makes ignoring it impossible.

References
  [1] Hyrum's Law: with enough consumers, every observable behavior of a contract becomes something somebody depends on, regardless of what you promised.
  [2] TypeScript Handbook: discriminated unions, the typed shape that pins a value to its allowed cases.