Maciej Trzciński
2026.06.18 · 12 min read

Your JSON-LD Is a Graph. Model Your Sanity Content Like One.

Set-and-forget JSON-LD leaves a search engine to guess, from a matching name, that three “Aurora” nodes are one band. A @graph says it outright: one artist, three nights. And if your content model is already a graph, you've probably paid for the precondition.

An artist plays three nights at the same venue. The set-and-forget approach does exactly what every tutorial taught it to: it stamps the same performer and the same location into all three pages, validates each one, and earns three rich results. Paste any single page into the Rich Results Test and it's green. The gap only shows up when you read two pages at once and ask what, in the markup, tells a machine the three Aurora nodes are one band.

Nothing does. You've shipped three unconnected nodes named Aurora with nothing but a matching string to suggest they're the same entity. The practice isn't wrong, and it's the community default for a reason: it works, every page validates, editors never touch the markup, and the structured data stays in lockstep with the page. But it stops at the document boundary — and entities don't. This is the move from document-shaped JSON-LD to graph-shaped JSON-LD, and the content-model precondition that lets you make it without redoing the work by hand on every page.

Two earlier posts set this up. Scale Is a Count of Readers, Not Rows is the model as a surface several readers depend on — search being one of them; Authors Are Not a Field is where the seams go inside one model. This post is the concrete output the two of them make possible: when your content is a graph of referenced entities, a connected JSON-LD @graph falls out of work you've likely already done. It stands on its own, though — you won't need to click away to follow it.

Here is what set-and-forget emits for one of the three nights. A MusicEvent with the performer and the venue inlined as nested objects — self-contained, valid, and, on its own, completely correct.

event-2026-09-12.html
JSON
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "MusicEvent",
  "name": "Aurora — Night I",
  "startDate": "2026-09-12T20:00",
  "performer": {
    "@type": "MusicGroup",
    "name": "Aurora"
  },
  "location": {
    "@type": "MusicVenue",
    "name": "Tama",
    "address": "Kraków, PL"
  }
}
</script>
```
Listing 01 — the per-page output — valid in isolation, an island in aggregate

Now multiply it by three nights. The same performer and location blobs are re-stamped, inline, into three documents. Three islands. Nothing in the markup asserts that these three Aurora nodes are one identity; you're relying on the engine to re-merge them from a matching name string. It often will: Google reconciles entities using names, sameAs links, URLs, and its own knowledge graph. The point isn't that it fails. The point is that you've left it to guess where you could have stated identity outright: a weaker, less reliable signal in place of a fact you already know.

The fix isn't a plugin or a different schema.org type. It's two keywords that have been in the JSON-LD spec all along. @graph lets a single block hold an array of nodes instead of one. @id gives each node a stable identifier. Once a node has an @id, any other node can reference it by that identifier instead of redeclaring it. The artist is declared once; the events point at it.

graph.html
JSON
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "MusicGroup",
      "@id": "https://example.com/#artist-aurora",
      "name": "Aurora"
    },
    {
      "@type": "MusicVenue",
      "@id": "https://example.com/#venue-tama",
      "name": "Tama",
      "address": "Kraków, PL"
    },
    {
      "@type": "MusicEvent",
      "name": "Aurora — Night I",
      "startDate": "2026-09-12T20:00",
      "performer": { "@id": "https://example.com/#artist-aurora" },
      "location": { "@id": "https://example.com/#venue-tama" }
    },
    {
      "@type": "MusicEvent",
      "name": "Aurora — Night II",
      "startDate": "2026-09-13T20:00",
      "performer": { "@id": "https://example.com/#artist-aurora" },
      "location": { "@id": "https://example.com/#venue-tama" }
    }
  ]
}
</script>
```
Listing 02: the connected graph, with one artist declared once and referenced by every event (two of the three nights shown)

Now #artist-aurora is one node. Each event references it by @id; the band appears once and is pointed at, not copied. The identity is a fact in the markup, not an inference left to the reader. If you want to anchor that entity to the global graph as well (to say this is that band, the one with the Wikidata or MusicBrainz record), that's what a sameAs link on the node is for; a stable internal @id is the local version of the same idea.
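As a minimal sketch of that anchoring, the artist node can carry a sameAs array next to its local @id. `withSameAs` is a hypothetical helper, and the two external URLs are placeholders, not real records for any band; swap in the Wikidata or MusicBrainz IRIs you have actually verified.

```typescript
// A graph node: local identity via @id, optional global anchors via sameAs.
type ArtistNode = {
  "@type": "MusicGroup";
  "@id": string;
  name: string;
  sameAs?: string[];
};

// Attach sameAs only when there is at least one link, so the node
// never carries an empty array. Repeated links are collapsed.
function withSameAs(node: ArtistNode, links: string[]): ArtistNode {
  const unique = links.filter((link, i) => links.indexOf(link) === i);
  return unique.length > 0 ? { ...node, sameAs: unique } : node;
}

const aurora = withSameAs(
  {
    "@type": "MusicGroup",
    "@id": "https://example.com/#artist-aurora",
    name: "Aurora",
  },
  [
    // Hypothetical placeholders: replace with records you have verified.
    "https://www.wikidata.org/wiki/Q0000000",
    "https://musicbrainz.org/artist/00000000-0000-0000-0000-000000000000",
  ],
);
```

The local @id stays the thing other nodes point at; sameAs only adds outward links, so removing or correcting one never breaks a reference inside your own graph.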

[Graph 01: reference, not inline. The MusicEvent node points by @id at standalone MusicGroup and MusicVenue nodes.]

That's the target shape. The question the rest of the post answers is the one the spec doesn't: what does your content have to look like for you to emit this without redeclaring the artist by hand on every page?

You can only reference an entity by @id if that entity exists as its own addressable thing. In Sanity, that means a reference to a standalone document — not an embedded object, and certainly not a plain string typed into each event. If the artist is a string on the event, there is no stable identity to mint an @id from; the best you can do at render time is hash the name and hope the spelling never drifts. The connected @graph isn't something you author. It's something you emit — and only if the entity is a reference upstream.
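To see how fragile a name-derived identifier is, here is a minimal sketch. `slugFromName` is a hypothetical stand-in for whatever hashing or slugging you might reach for at render time, and `idFromDocument` assumes the entity has a document `_id` to build on; the `_id` value used below is made up.

```typescript
const ROOT = "https://example.com"; // placeholder origin

// Hypothetical render-time fallback: derive the identifier from the typed name.
function slugFromName(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function idFromName(name: string): string {
  return ROOT + "/#artist-" + slugFromName(name);
}

// With a referenced document, the identifier comes from the document,
// not from the label an editor typed.
function idFromDocument(docId: string): string {
  return ROOT + "/#artist-" + docId;
}

// Three spellings of one band, the kind a free-text field accumulates.
const typed = ["Aurora", "AURORA", "Aurora (live)"];
const idsFromNames = typed.map(idFromName);

// Lowercasing rescues "AURORA", but "Aurora (live)" still becomes a second entity.
const distinctFromNames = idsFromNames.filter(
  (id, i) => idsFromNames.indexOf(id) === i,
);

// One document, three labels: the identifier does not move.
const idsFromDoc = typed.map(() => idFromDocument("abc123")); // hypothetical _id
```

You can keep adding normalisation rules, but each one is a guess about how the next editor will misspell the name. A document `_id` removes the guess.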

This is where a legacy migration earns its keep. On a recent move off a page-based CMS, artist wasn't a thing at all — it was a text field, re-typed into every event. Sometimes Aurora, sometimes AURORA, once Aurora (live). There was no entity to point at, which is the same as saying there was no way to connect anything: porting that straight across would have carried the duplication with it, untouched. Making the artist a document you could reference was the precondition for everything downstream — reuse, clean migrations, and the JSON-LD.

schema/event.ts
TypeScript
```typescript
import { defineType, defineField } from "sanity";

export const event = defineType({
  name: "event",
  type: "document",
  fields: [
    defineField({ name: "name", type: "string" }),
    defineField({ name: "startDate", type: "datetime" }),
    // Not a string, not an inline object — a reference to a
    // standalone document that exists once and is pointed at.
    defineField({
      name: "venue",
      type: "reference",
      to: [{ type: "venue" }],
    }),
    defineField({
      name: "performers",
      type: "array",
      of: [{ type: "reference", to: [{ type: "artist" }] }],
    }),
  ],
});
```
Listing 03 — the reference is the precondition — the upstream half the JSON-LD depends on

Two steps turn referenced documents into a connected @graph. First, a GROQ query that follows the references with -> and projects only the fields schema.org needs — no more. Second, a pure function that mints a stable @id per entity, references instead of inlining, and de-duplicates shared nodes so an artist booked on two events appears once. Render the result wherever you already have the data; the @graph block doesn't have to live in <head>.

queries/event.groq
GROQ
```groq
*[_type == "event" && slug.current == $slug][0]{
  _id,
  name,
  startDate,
  // -> dereferences the reference into the linked document,
  // and we project a narrowed field set, not the whole doc.
  venue->{ _id, name, address },
  performers[]->{ _id, name, sameAs }
}
```
Listing 04 — follow the references, project only what schema.org needs

What follows is the canonical shape the problem demands, not a verbatim function lifted from one codebase. The load-bearing line is the Map keyed by @id: it's what guarantees one entity becomes one node, no matter how many events reference it. Without it, two events that share an artist would push two identical MusicGroup nodes into the graph, which is the same duplication the per-page version commits.

lib/buildEventGraph.ts
TypeScript
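```typescript
// Assumed shapes, matching the projection in Listing 04.
type Ref = { "@id": string };
type GraphNode = Ref & { "@type": string; [key: string]: unknown };
type EntityDoc = { _id: string; name: string };
type EventDoc = EntityDoc & {
  startDate: string;
  venue?: (EntityDoc & { address?: string }) | null;
  performers?: (EntityDoc & { sameAs?: string[] })[] | null;
};

const ROOT = "https://example.com";
const idFor = (kind: string, id: string) => `${ROOT}/#${kind}-${id}`;

export function buildEventGraph(events: EventDoc[]) {
  // Keyed by @id — a shared entity is written once, not per event.
  const nodes = new Map<string, GraphNode>();

  // Store a node the first time its @id is seen; always hand back a reference.
  const put = (node: GraphNode): Ref => {
    if (!nodes.has(node["@id"])) nodes.set(node["@id"], node);
    return { "@id": node["@id"] };
  };

  for (const e of events) {
    const venue =
      e.venue &&
      put({
        "@type": "MusicVenue",
        "@id": idFor("venue", e.venue._id),
        name: e.venue.name,
        address: e.venue.address,
      });

    // The projection may yield null when the field is missing, so default to [].
    const performers = (e.performers ?? []).map((p) =>
      put({
        "@type": "MusicGroup",
        "@id": idFor("artist", p._id),
        name: p.name,
        ...(p.sameAs ? { sameAs: p.sameAs } : {}),
      }),
    );

    const eventId = idFor("event", e._id);
    nodes.set(eventId, {
      "@type": "MusicEvent",
      "@id": eventId, // the node carries the same @id it is keyed by
      name: e.name,
      startDate: e.startDate,
      ...(venue ? { location: venue } : {}),
      performer: performers,
    });
  }

  return { "@context": "https://schema.org", "@graph": [...nodes.values()] };
}
```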
Listing 05 — where one entity becomes one node, regardless of how many events reference it
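Rendering the returned object is the last step. A minimal sketch, assuming you serialise to a string on the server: `renderJsonLd` is a hypothetical helper, not part of Sanity or any framework. The one detail worth care is escaping `<`, so a value that happens to contain `</script>` cannot close the tag early.

```typescript
// Serialise a JSON-LD object into a script tag.
// "<" is written as the JSON escape \u003c, which parsers read back as "<".
function renderJsonLd(graph: object): string {
  const json = JSON.stringify(graph).replace(/</g, "\\u003c");
  return '<script type="application/ld+json">' + json + "</script>";
}

// A deliberately hostile label, to show the escape doing its job.
const html = renderJsonLd({
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "MusicGroup",
      "@id": "https://example.com/#artist-aurora",
      name: "Aurora </script>",
    },
  ],
});
```

If your framework already ships a JSON-LD or head component, prefer it; the escape is the part to check for either way.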

The Event → Venue → Artist shape is deliberately familiar — it's close to Sanity's own teaching dataset. The authority is in the cases it doesn't show. Three to watch.

@id for entities with no page
An artist node still needs a stable identifier even when there's no crawlable artist page. Use a fragment IRI on the site root — https://example.com/#artist-aurora — not a guessed URL. It's valid, it's stable, and the fragment signals “identity, not a fetchable document.”
Type mapping on purpose
Pick the narrowest type that's actually true: MusicEvent over Event, MusicVenue over Place. And the one the toy example hides — a solo act is a Person, not a MusicGroup, so the same artist document may emit either depending on a field. The same call runs one level down: the example's string address is the shortcut — a PostalAddress object, with addressLocality and addressCountry, is the narrower type that earns the richer venue result. Don't accept the default; decide.
Nulls don't validate
A node missing a required property is an invalid rich result, not a partial one. Coalesce at the edges: if a venue has no address yet, drop the node rather than emit it broken. Filter at the builder boundary. That's why buildEventGraph checks that a venue exists and that sameAs is present before writing them, instead of letting a half-filled draft through; extend the same guard to any property your target rich result requires.
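Two of these cases, type mapping and dropping incomplete nodes, can be sketched together. This is a sketch under assumptions: the `kind`, `locality` and `country` fields are hypothetical additions to the artist and venue documents, not part of the schema in Listing 03, and which properties count as required depends on the rich result you're targeting.

```typescript
// Hypothetical source shapes: `kind`, `locality` and `country` are assumed fields.
type ArtistDoc = { _id: string; name: string; kind: "solo" | "group" };
type VenueDoc = { _id: string; name?: string; locality?: string; country?: string };
type GraphNode = { "@id": string; "@type": string; [key: string]: unknown };

const ROOT = "https://example.com"; // placeholder origin

// Type mapping on purpose: a solo act is a Person, a band is a MusicGroup.
function artistNode(a: ArtistDoc): GraphNode {
  return {
    "@type": a.kind === "solo" ? "Person" : "MusicGroup",
    "@id": ROOT + "/#artist-" + a._id, // fragment IRI: identity, not a page
    name: a.name,
  };
}

// Returns null instead of a half-filled node, so the caller can drop it.
function venueNode(v: VenueDoc): GraphNode | null {
  if (!v.name || !v.locality || !v.country) return null;
  return {
    "@type": "MusicVenue",
    "@id": ROOT + "/#venue-" + v._id,
    name: v.name,
    // A structured PostalAddress instead of the "Kraków, PL" string shortcut.
    address: {
      "@type": "PostalAddress",
      addressLocality: v.locality,
      addressCountry: v.country,
    },
  };
}

const solo = artistNode({ _id: "a1", name: "Aurora", kind: "solo" });
const band = artistNode({ _id: "a2", name: "Aurora", kind: "group" });
const complete = venueNode({ _id: "v1", name: "Tama", locality: "Kraków", country: "PL" });
const draft = venueNode({ _id: "v2", name: "Tama" }); // no address yet, so null
```

Returning null keeps the decision in one place: the builder skips the node and the event simply ships without a location until the venue is filled in.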

The instinct in a multi-locale setup is to mint a new @id per language. Don't — the entity is the same band in every language; only its label changes. Split the two: identity is shared (one @id per entity across all locales), labels are translated (project the active locale in GROQ and feed the same builder), and you mark the language on the node with inLanguage — which is a node-level concern, distinct from hreflang, which handles cross-locale routing a layer up. This is rich enough to be its own post; the one-line version is that the entity is the same fact in every language, and only the words for it move.
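The identity and label split can be sketched in a few lines. Assumptions, all hypothetical: two locales, a document that stores its name per locale, and locale selection done in TypeScript rather than in the GROQ projection, purely to keep the example self-contained. The sketch covers only the shared @id and the translated label.

```typescript
const ROOT = "https://example.com"; // placeholder origin

type Locale = "en" | "pl"; // assumed locales
type LocalizedEvent = {
  _id: string;
  name: Record<Locale, string>;
  startDate: string;
  artistId: string;
};

function eventNode(e: LocalizedEvent, locale: Locale) {
  return {
    "@type": "MusicEvent",
    "@id": ROOT + "/#event-" + e._id, // no locale in the identifier
    name: e.name[locale], // only the label is translated
    startDate: e.startDate,
    performer: { "@id": ROOT + "/#artist-" + e.artistId },
  };
}

const night: LocalizedEvent = {
  _id: "night-1",
  name: { en: "Aurora: Night I", pl: "Aurora: Noc I" },
  startDate: "2026-09-12T20:00",
  artistId: "aurora",
};

const enNode = eventNode(night, "en");
const plNode = eventNode(night, "pl");
// Both nodes share one @id and one performer reference; only `name` differs.
```

Because the locale never enters `idFor`-style functions, an English page and a Polish page describe the same event and the same band, which is the point.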

The whole argument is checkable, so check it. Paste the emitted @graph into the Google Rich Results Test for rich-result eligibility, and into the Schema Markup Validator for structural validity. Then check the three things that matter specifically: the artist resolves as one node across multiple events; every @id reference resolves to a declared node inside the graph; and no duplicate-but-unlinked entity slipped through. Validate the constructed graph, not your intent for it.
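The structural checks are easy to automate before you reach for the online validators. A minimal sketch, assuming the graph is a flat array of top-level nodes and that a reference is an object whose only key is @id; `checkGraph` and `collectRefs` are hypothetical names. It reports duplicate @id declarations and references that point at nothing; it does not try to spot lookalike entities with different identifiers.

```typescript
type GraphNode = { "@id"?: string; [key: string]: unknown };

// Collect every pure { "@id": ... } reference found inside a value.
function collectRefs(value: unknown, out: string[]): void {
  if (Array.isArray(value)) {
    value.forEach((v) => collectRefs(v, out));
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj);
    const id = obj["@id"];
    if (keys.length === 1 && typeof id === "string") {
      out.push(id); // a pure reference
    } else {
      keys.forEach((k) => collectRefs(obj[k], out)); // a nested object: keep walking
    }
  }
}

function checkGraph(nodes: GraphNode[]) {
  const declared: string[] = [];
  const duplicateIds: string[] = [];
  const refs: string[] = [];

  for (const node of nodes) {
    const id = node["@id"];
    if (typeof id === "string") {
      if (declared.indexOf(id) !== -1) duplicateIds.push(id);
      else declared.push(id);
    }
    // Walk the node's properties, skipping its own @id declaration.
    Object.keys(node).forEach((k) => {
      if (k !== "@id") collectRefs(node[k], refs);
    });
  }

  // Only top-level nodes count as declared in this sketch.
  const dangling = refs.filter((r) => declared.indexOf(r) === -1);
  return { duplicateIds, dangling };
}

const graphReport = checkGraph([
  { "@type": "MusicGroup", "@id": "https://example.com/#artist-aurora", name: "Aurora" },
  {
    "@type": "MusicEvent",
    name: "Night I",
    performer: { "@id": "https://example.com/#artist-aurora" },
    location: { "@id": "https://example.com/#venue-tama" }, // never declared
  },
]);
// graphReport.dangling contains the venue @id; graphReport.duplicateIds is empty.
```

Run something like this in a test or a build step, and keep the Rich Results Test and the Schema Markup Validator for what they alone can tell you.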

And a connected @graph isn't only for search engines. It's the same resolvable entity graph an agent reads to reason about who played where and when — search is the reader you can see today, the agent is the one arriving. The work is identical either way: model the entity once, reference it everywhere, and emit the references as references. You did it for the second page type; the rest is the same move, continued.

Key Takeaways // TL;DR
  • Set-and-forget JSON-LD validates per page but fragments across pages: the same entity becomes N unlinked nodes, and you're betting the engine re-merges them on a name match.
  • @graph + @id fix it by declaring each entity once and referencing it — but only if the entity exists as a reference in your content model.
  • The graph in your structured data is downstream of the graph in your content. You can't fake one without the other; the modeling work is the precondition you've probably already paid for.
  • Verify by validating the output: the shared entity must resolve as one node across every page.

References
  1. JSON-LD 1.1: the @graph keyword (named graphs) and @id node identifiers.
  2. schema.org: MusicEvent, MusicGroup and MusicVenue types.
  3. Google Rich Results Test: rich-result eligibility for the emitted markup.
  4. Schema Markup Validator: structural validity, independent of any one search engine.
  5. Sanity: GROQ reference, including the -> dereference operator.