Document Provenance

How Vanilla Breeze declares, displays, and (in time) verifies who made content, how it was made, and whether it has been tampered with since publication.

An honest signal for honest publishers in an era of zero-cost AI content.

The problem

AI content generation has reduced the cost of publishing to near-zero. Visitors can no longer assume human authorship. Content farms produce convincing but inaccurate material at scale. Attribution and revision history are routinely stripped. There is no standard way for honest publishers to prove their content is what they say it is.

Vanilla Breeze's provenance layer doesn't solve the problem for bad actors — it gives honest publishers a way to signal trustworthiness to readers who care enough to check.

Three layers, additive

Layer 0 — Hand-authored static HTML. A page can declare full provenance using <meta> tags and <dl> structures inside <page-info>. No build step required. See a minimal proof.
Layer 1 — SSG-driven. A static-site generator (such as Cook SSG, which builds vanilla-breeze.com) reads frontmatter and emits the meta-tag contract automatically. Authors edit YAML, not HTML.
Layer 2 — CMS with editor UI and signing. A CMS adds an editor interface, change-tracking inputs, and a signing step that runs at publish time. Pages are cryptographically verifiable in the reader's browser.

Each layer validates the ones beneath it. Vanilla Breeze owns Layer 0 — the components, attributes, and meta-tag contract — and is stack-agnostic above it.
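
As a rough illustration of Layer 1, the sketch below turns a frontmatter object into Layer 0 meta tags. It is not Cook SSG's code: the function name `emitProvenanceMeta`, the frontmatter keys, the example value `human`, and the choice of `name` for `vb:*` tags versus `property` for Open Graph tags are all assumptions. The meta-tag contract is the authority on the real shape.

```javascript
// Hypothetical sketch: frontmatter keys and attribute choices are assumptions,
// not the documented meta-tag contract.
const escapeAttr = (value) =>
  String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");

function emitProvenanceMeta(frontmatter) {
  // [attribute, key, value] triples; Open Graph keys conventionally use `property`.
  const tags = [
    ["name", "author", frontmatter.author],
    ["property", "article:published_time", frontmatter.published],
    ["property", "article:modified_time", frontmatter.modified],
    ["name", "vb:provenance", frontmatter.provenance],
    ["name", "vb:review", frontmatter.review],
    ["name", "vb:status", frontmatter.status],
  ];
  return tags
    // Skip fields the author did not declare rather than emitting empty claims.
    .filter(([, , value]) => value !== undefined && value !== null && value !== "")
    .map(([attr, key, value]) => `<meta ${attr}="${key}" content="${escapeAttr(value)}">`)
    .join("\n");
}
```

Because the output is plain Layer 0 markup, a page produced this way stays readable by anything that understands the hand-authored form.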

The metadata substrate

Provenance is one field family in a single metadata substrate that every Vanilla Breeze lens component can consume. The full set of fields is documented in meta-tag contract v1; the short version is:

Family	Fields	Source
Identity	`author`, `article:author`, `rel="author"`	HTML5, Open Graph
Temporal	`article:published_time`, `article:modified_time`, `last-modified`, `itemprop="version"`	Open Graph, Microdata
Topic	`keywords`, `vb:topic`	HTML5 / VB
Provenance	`vb:provenance`, `vb:review`, `vb:status`, `vb:ai-tools`	VB
Integrity	`vb:hash`, `vb:signature`, `vb:signature-algorithm`, `link rel="author-key"`	VB
License	`license`, `link rel="license"`	HTML5

The principle: public-first when public exists; vb:* only for private extensions. JSON-LD is the canonical machine-readable mirror.
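
To make the JSON-LD mirror concrete, here is a minimal sketch that maps the public fields of a flat metadata object onto schema.org `Article` properties. The function name `toJsonLd` and the input shape are assumptions. The sketch deliberately leaves `vb:*` fields out, because their JSON-LD representation is defined by the contract rather than by this page.

```javascript
// Hypothetical sketch: `meta` is a flat { key: content } map read from <meta> tags.
// Only public (non-vb) fields are mirrored; vb:* mapping is not assumed here.
function toJsonLd(meta) {
  const doc = { "@context": "https://schema.org", "@type": "Article" };
  if (meta["author"]) doc.author = { "@type": "Person", name: meta["author"] };
  if (meta["article:published_time"]) doc.datePublished = meta["article:published_time"];
  if (meta["article:modified_time"]) doc.dateModified = meta["article:modified_time"];
  if (meta["keywords"]) {
    // The HTML keywords meta is a comma-separated string.
    doc.keywords = meta["keywords"].split(",").map((k) => k.trim()).filter(Boolean);
  }
  if (meta["license"]) doc.license = meta["license"];
  return doc;
}
```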

Three orthogonal DOM attributes

Attribute	Answers	Where used
`data-author`	WHO made this specific edit?	`<ins>`, `<del>`
`data-provenance`	HOW was the content made?	`<article>`, `<section>`, `<html>`
`data-review`	What review did it receive?	`<article>`, `<section>`, `<html>`
`data-status`	What's its publication state?	`<article>`, `<section>`, `<html>`
`data-trust`	Verification tier (computed at runtime)	`.page-info-badge` only

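Of the five attributes above, three are orthogonal, author-declared claims: data-provenance, data-review, and data-status. data-author attributes an individual edit, and data-trust is a result computed at runtime. Each answers exactly one question.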
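
A script can read the author-declared claims straight from an element's `dataset`. The helper below is a sketch; `readClaims` is a hypothetical name, and it returns `null` for any claim the author did not declare instead of guessing a default.

```javascript
// Hypothetical helper: accepts any element-like object exposing `dataset`,
// e.g. document.querySelector("article") in a browser.
function readClaims(el) {
  const { provenance = null, review = null, status = null } = el.dataset;
  // data-trust is intentionally not read here: it is computed, not declared.
  return { provenance, review, status };
}
```

In a browser this would be called as `readClaims(document.querySelector("article"))`. Keeping the three claims as separate keys mirrors the one-question-per-attribute design.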

Lens architecture

Provenance metadata is consumed not just by <page-info> but by the entire family of view components — each presents the same data through a different lens:

<page-info> — per-page disclosure panel
<time-index> — chronological lens (changelog, recently-updated)
<site-index> — keyword lens
<glossary-index> — definitional lens
<site-map> — structural-hierarchy lens
<change-set> — diff lens

The reader chooses the lens at read-time. Even when the author doesn't expose the controls, "the nature of the web always allows them to change it" — every lens exposes its underlying data via JSON-LD or queryable DOM, so readers, crawlers, and AI tools can all reframe the same content.
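
The idea of several lenses over one data set can be sketched with two plain functions over the same list of page records. The record shape (`url`, `modified`, `keywords`) and the function names are assumptions for illustration. They are not the internals of <time-index> or <site-index>.

```javascript
// Hypothetical page records: { url, modified (ISO 8601 string), keywords (array) }.

// Chronological lens: most recently modified first, without mutating the input.
function timeLens(pages) {
  return [...pages].sort((a, b) => Date.parse(b.modified) - Date.parse(a.modified));
}

// Keyword lens: map each keyword to the URLs of the pages that declare it.
function keywordLens(pages) {
  const index = new Map();
  for (const page of pages) {
    for (const keyword of page.keywords) {
      if (!index.has(keyword)) index.set(keyword, []);
      index.get(keyword).push(page.url);
    }
  }
  return index;
}
```

Neither function changes the underlying records, which is the point: the lens is a view chosen at read-time, not a property of the content.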

What signing does and does not prove

The integrity layer (still being built — see Stage 4 of the document-provenance plan) cryptographically signs the canonical content of a page. When complete, <page-info> verifies the signature in the reader's browser using the Web Crypto API.

Verification proves:

The content has not been altered since the author signed it.
The signature was made by whoever controls the key at the author-key URL.
The key is hosted on the same domain as the page (domain anchoring).

Verification does not prove:

That the content is accurate or factual.
That the author is who they claim to be.
That AI wasn't used (only data-provenance addresses that, and it's self-reported).
That the signing key hasn't been compromised.

The signing layer is a floor, not a ceiling. It defends against mid-stream tampering (CDN swap, proxy injection, post-publish modification). It does not defend against deliberate dishonesty.
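
Since the integrity layer is still being built, the following is only a sketch of what in-browser verification with the Web Crypto API could look like. The algorithm (ECDSA over P-256 with SHA-256), the base64 signature encoding, the function names, and the assumption that the canonical content is available as a string are all illustrative choices, not the Vanilla Breeze design.

```javascript
// Sketch under stated assumptions: ECDSA P-256 / SHA-256 and base64 signatures
// are illustrative choices, not the documented vb:signature format.
const subtle = globalThis.crypto?.subtle ?? require("node:crypto").webcrypto.subtle;

// Domain anchoring: the key URL must resolve to the same origin as the page.
function isDomainAnchored(pageUrl, keyUrl) {
  return new URL(keyUrl, pageUrl).origin === new URL(pageUrl).origin;
}

function base64ToBytes(b64) {
  return Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
}

// Resolves to true only if the key is domain-anchored and the signature
// matches the canonical text. `publicKey` is an already-imported CryptoKey.
async function verifyPage({ pageUrl, keyUrl, publicKey, signatureB64, canonicalText }) {
  if (!isDomainAnchored(pageUrl, keyUrl)) return false;
  const data = new TextEncoder().encode(canonicalText);
  return subtle.verify(
    { name: "ECDSA", hash: "SHA-256" },
    publicKey,
    base64ToBytes(signatureB64),
    data
  );
}
```

A `true` result here supports only the three "proves" statements above. It says nothing about accuracy, identity, AI use, or whether the key itself has been compromised.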
