Vanilla Breeze

Document Provenance

How Vanilla Breeze declares, displays, and (in time) verifies who made content, how it was made, and whether it has been tampered with since publication.

An honest signal for honest publishers in an era of zero-cost AI content.

The problem

AI content generation has reduced the cost of publishing to near-zero. Visitors can no longer assume human authorship. Content farms produce convincing but inaccurate material at scale. Attribution and revision history are routinely stripped. There is no standard way for honest publishers to prove their content is what they say it is.

Vanilla Breeze's provenance layer does not stop bad actors. It gives honest publishers a way to signal trustworthiness to readers who care enough to check.

Three layers, additive

  1. Layer 0 — Hand-authored static HTML. A page can declare full provenance using <meta> tags and <dl> structures inside <page-info>. No build step required. See a minimal proof.
  2. Layer 1 — SSG-driven. A static-site generator (such as Cook SSG, which builds vanilla-breeze.com) reads frontmatter and emits the meta-tag contract automatically. Authors edit YAML, not HTML.
  3. Layer 2 — CMS with editor UI and signing. A CMS adds an editor interface, change-tracking inputs, and a signing step that runs at publish time. Pages are cryptographically verifiable in the reader's browser.

Each layer validates the ones beneath it. Vanilla Breeze owns Layer 0 — the components, attributes, and meta-tag contract — and is stack-agnostic above it.
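
As an illustration of the Layer 1 step, the following is a minimal sketch of how a generator might turn frontmatter into meta tags. It is not Cook SSG's implementation. The frontmatter key names, the choice of the name attribute for vb:* fields, and the sample values are all assumptions made for the example; only the tag names are taken from the field table in this document.

```javascript
// Hypothetical mapping from frontmatter keys to meta-tag names.
// Tag names come from this document's field table; the frontmatter keys
// on the left and the attribute used for vb:* fields are assumptions.
const FIELD_MAP = {
  author:     { attr: 'name',     key: 'author' },
  published:  { attr: 'property', key: 'article:published_time' },
  modified:   { attr: 'property', key: 'article:modified_time' },
  provenance: { attr: 'name',     key: 'vb:provenance' },
  review:     { attr: 'name',     key: 'vb:review' },
  status:     { attr: 'name',     key: 'vb:status' },
  aiTools:    { attr: 'name',     key: 'vb:ai-tools' },
};

// Escape characters that would break out of a double-quoted attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Emit one <meta> line per frontmatter field that has a non-empty value.
function frontmatterToMeta(frontmatter) {
  const lines = [];
  for (const [field, { attr, key }] of Object.entries(FIELD_MAP)) {
    const raw = frontmatter[field];
    if (raw === undefined || raw === null) continue;
    const content = Array.isArray(raw) ? raw.join(', ') : String(raw);
    if (content === '') continue;
    lines.push(`<meta ${attr}="${key}" content="${escapeAttr(content)}">`);
  }
  return lines.join('\n');
}
```

Calling frontmatterToMeta({ author: 'Ada Example', published: '2025-01-15' }) returns two lines, one name="author" tag and one property="article:published_time" tag. Fields that are missing from the frontmatter are simply not emitted, so a page never declares a claim its author did not make.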

The metadata substrate

Provenance is one field family in a single metadata substrate that every Vanilla Breeze lens component can consume. The full set of fields is documented in meta-tag contract v1; the short version is:

Family     | Fields                                                                              | Source
-----------|-------------------------------------------------------------------------------------|----------------------
Identity   | author, article:author, rel="author"                                                | HTML5, Open Graph
Temporal   | article:published_time, article:modified_time, last-modified, itemprop="version"    | Open Graph, Microdata
Topic      | keywords, vb:topic                                                                  | HTML5 / VB
Provenance | vb:provenance, vb:review, vb:status, vb:ai-tools                                    | VB
Integrity  | vb:hash, vb:signature, vb:signature-algorithm, link rel="author-key"                | VB
License    | license, link rel="license"                                                         | HTML5

The principle: public-first when public exists; vb:* only for private extensions. JSON-LD is the canonical machine-readable mirror.
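
To make the family grouping concrete, here is a small sketch that sorts meta-tag entries into the families from the table above and flags which ones are private vb:* extensions. The function names and the input shape (an array of key/value pairs) are assumptions for illustration, and the fields carried by link or itemprop rather than meta are left out for brevity.

```javascript
// Families and meta keys copied from the table in this document.
// Fields expressed via <link> or itemprop are omitted in this sketch.
const FAMILIES = {
  Identity:   ['author', 'article:author'],
  Temporal:   ['article:published_time', 'article:modified_time', 'last-modified'],
  Topic:      ['keywords', 'vb:topic'],
  Provenance: ['vb:provenance', 'vb:review', 'vb:status', 'vb:ai-tools'],
  Integrity:  ['vb:hash', 'vb:signature', 'vb:signature-algorithm'],
  License:    ['license'],
};

// Reverse index: meta key -> family name.
const FAMILY_OF = new Map();
for (const [family, keys] of Object.entries(FAMILIES)) {
  for (const key of keys) FAMILY_OF.set(key, family);
}

// A key is a private extension when it lives in the vb: namespace.
function isPrivateExtension(key) {
  return key.startsWith('vb:');
}

// Group [{ key, value }] entries by family; unrecognised keys go to "Unknown".
function groupByFamily(entries) {
  const grouped = {};
  for (const { key, value } of entries) {
    const family = FAMILY_OF.get(key) ?? 'Unknown';
    if (!grouped[family]) grouped[family] = {};
    grouped[family][key] = value;
  }
  return grouped;
}
```

In a browser, the entries could be collected from the meta elements in the document head; the sketch takes plain objects so that it stays independent of the DOM.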

Three orthogonal DOM attributes

Attribute       | Answers                                   | Where used
----------------|-------------------------------------------|------------------------------
data-author     | WHO made this specific edit?              | <ins>, <del>
data-provenance | HOW was the content made?                 | <article>, <section>, <html>
data-review     | What review did it receive?               | same as data-provenance
data-status     | What's its publication state?             | same as data-provenance
data-trust      | Verification tier (computed at runtime)   | .page-info-badge only

Provenance, review, and status are author-declared claims. Trust is a runtime-computed result. Each answers exactly one question.
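
The placement rules in the table can be expressed as a small lint check. The sketch below is an assumption about how such a check might look, not part of Vanilla Breeze; it works on a plain element description (tag, classes, attrs) rather than a live DOM node, and the function name is hypothetical.

```javascript
// Elements that may carry the author-declared claims, per the table.
const CLAIM_HOSTS = ['article', 'section', 'html'];

// One placement rule per attribute, taken from the "Where used" column.
const PLACEMENT = new Map([
  ['data-author',     (el) => ['ins', 'del'].includes(el.tag)],
  ['data-provenance', (el) => CLAIM_HOSTS.includes(el.tag)],
  ['data-review',     (el) => CLAIM_HOSTS.includes(el.tag)],
  ['data-status',     (el) => CLAIM_HOSTS.includes(el.tag)],
  ['data-trust',      (el) => (el.classes || []).includes('page-info-badge')],
]);

// Return the provenance attributes that appear on an element where the
// table does not allow them. Attributes without a rule are ignored.
function misplacedAttributes(el) {
  return (el.attrs || []).filter((name) => {
    const allowed = PLACEMENT.get(name);
    return allowed !== undefined && !allowed(el);
  });
}
```

For example, a p element carrying data-provenance would be reported, while an ins element carrying data-author would not. Because data-trust is computed at runtime, finding it anywhere other than on the badge suggests that markup is asserting a result it should not be asserting.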

Lens architecture

Provenance metadata is consumed not just by <page-info> but by the entire family of view components, each of which presents the same data through a different lens.

The reader chooses the lens at read-time. Even when the author doesn't expose the controls, "the nature of the web always allows them to change it" — every lens exposes its underlying data via JSON-LD or queryable DOM, so readers, crawlers, and AI tools can all reframe the same content.

What signing does and does not prove

The integrity layer (still being built — see Stage 4 of the document-provenance plan) cryptographically signs the canonical content of a page. When complete, <page-info> verifies the signature in the reader's browser using the Web Crypto API.

Verification proves:

  • The content has not been altered since the author signed it.
  • The signature was made by whoever controls the key at the author-key URL.
  • The key is hosted on the same domain as the page (domain anchoring).

Verification does not prove:

  • That the content is accurate or factual.
  • That the author is who they claim to be.
  • That AI wasn't used (only data-provenance addresses that, and it's self-reported).
  • That the signing key hasn't been compromised.

The signing layer is a floor, not a ceiling. It defends against mid-stream tampering (CDN swap, proxy injection, post-publish modification). It does not defend against deliberate dishonesty.
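
The sketch below illustrates the kind of check described here using Web Crypto: a domain-anchoring test followed by signature verification, with a demo showing that content altered after signing fails to verify. Since the integrity layer is still being built, everything specific in it is an assumption: ECDSA P-256 with SHA-256 stands in for whatever vb:signature-algorithm ends up declaring, "same domain" is interpreted as an equal hostname, and the function names are hypothetical. It also generates a throwaway key pair in place of fetching a real author key.

```javascript
// Use the browser's Web Crypto when present; fall back to Node's copy so
// the sketch can also be run outside a browser.
const subtle = globalThis.crypto?.subtle ?? require('node:crypto').webcrypto.subtle;

// Assumption: ECDSA over P-256 with SHA-256. The real algorithm is not
// specified in this document.
const ALGORITHM = { name: 'ECDSA', hash: 'SHA-256' };

// Domain anchoring, interpreted here as "key URL and page URL share a
// hostname". Relative key hrefs are resolved against the page URL.
function isDomainAnchored(pageUrl, keyHref) {
  const page = new URL(pageUrl);
  const key = new URL(keyHref, page);
  return key.hostname === page.hostname;
}

// Verify a signature over the canonical text of a page.
async function verifyContent(publicKey, signature, canonicalText) {
  const data = new TextEncoder().encode(canonicalText);
  return subtle.verify(ALGORITHM, publicKey, signature, data);
}

// Demo: sign some text with a throwaway key, then verify the original
// and a modified copy.
async function demo() {
  const { publicKey, privateKey } = await subtle.generateKey(
    { name: 'ECDSA', namedCurve: 'P-256' },
    false,
    ['sign', 'verify']
  );
  const original = 'Canonical page content';
  const signature = await subtle.sign(
    ALGORITHM,
    privateKey,
    new TextEncoder().encode(original)
  );
  return {
    untouched: await verifyContent(publicKey, signature, original),
    tampered: await verifyContent(publicKey, signature, original + ' (injected)'),
  };
}
```

Running demo() resolves to an object whose untouched field is true and whose tampered field is false. Note what the sketch leaves out: how the canonical content is derived from the page, and how the key is fetched and imported. Those steps determine what the signature actually covers, and they are not defined in this document.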
