Part of the WebMEM Protocol
Location: /specification/sdt/yaml-in-html/provenance/
Last Updated: 2025-07-28
Overview
The ProvenanceMeta block anchors every SDT fragment to its source, license, scope, and trust lineage. While required fields ensure basic traceability, enhanced provenance declarations improve downstream retrievability, auditability, and trust scoring.
This section defines best practices for structuring a complete ProvenanceMeta block in YAML-in-HTML SDT fragments.
Scope
The Scope field clarifies the structural origin of the fragment’s content. Use controlled vocabulary values to maintain consistency and interoperability:
| Scope Value | Description |
|---|---|
| `single-dataset` | Derived directly from one source dataset (e.g., CMS MA Landscape ZIP) |
| `multi-dataset` | Harmonized from multiple distinct datasets |
| `derived` | Computed or aggregated from source values (e.g., averages, rankings) |
| `editorial` | Authored manually or guided by policy; used in curated content like FAQs or glossary entries |
Guidance: When in doubt, favor `derived` or `editorial` to clarify transformation depth for downstream agents.
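A publishing pipeline could enforce the controlled vocabulary before a fragment is exported. The following is a minimal sketch, not part of the specification: the helper name `validate_scope` is hypothetical, and trimming surrounding whitespace before comparison is an assumption.

```python
# Controlled vocabulary copied from the Scope table in this section.
ALLOWED_SCOPES = {"single-dataset", "multi-dataset", "derived", "editorial"}


def validate_scope(value: str) -> str:
    """Return the trimmed Scope value, or raise ValueError if it is not allowed.

    Hypothetical helper; the spec itself does not define a validation API.
    """
    cleaned = value.strip()  # assumption: surrounding whitespace is not significant
    if cleaned not in ALLOWED_SCOPES:
        raise ValueError(f"unknown Scope value: {value!r}")
    return cleaned
```

Rejecting unknown values outright, rather than silently mapping them to a default, keeps the vocabulary consistent across fragments.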
Format
Use clear, specific labels for the structure or archive type of the original source data.
Examples:
- ZIP (XLSX)
- CSV
- JSON
- Markdown
Avoid vague formats like spreadsheet or file. If a file is inside an archive, list the archive format (e.g., ZIP) and clarify the filename in the Description field.
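A simple lint step can catch vague Format labels early. This sketch assumes a hypothetical helper, `check_format`, and a deny-list that contains only the two vague labels named above; a real pipeline would likely extend it.

```python
# Labels this section names as too vague; the set is illustrative, not exhaustive.
VAGUE_FORMATS = {"spreadsheet", "file"}


def check_format(label: str) -> str:
    """Return the trimmed Format label, or raise ValueError if empty or vague."""
    cleaned = label.strip()
    if not cleaned:
        raise ValueError("Format must not be empty")
    if cleaned.lower() in VAGUE_FORMATS:
        raise ValueError(f"Format {label!r} is too vague; name the file or archive type")
    return cleaned
```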
Version
Use the Version field to declare a structured version for datasets or glossary files.
Examples:
- `v2025.0`: Baseline CMS release
- `v2025.1`: Patch or post-correction file
- `v1.2.3`: Internal editorial or glossary revision
Include a Version only if the dataset is expected to be referenced or revised over time.
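The example versions share a common shape: a leading `v` followed by two or three dot-separated numeric parts. The specification does not state a formal grammar, so the pattern below is an assumption inferred from those examples, and `is_valid_version` is a hypothetical helper.

```python
import re

# Assumed pattern: "v", then 2 or 3 dot-separated integers (e.g. v2025.0, v1.2.3).
VERSION_PATTERN = re.compile(r"v\d+(?:\.\d+){1,2}")


def is_valid_version(version: str) -> bool:
    """Return True if the whole string matches the assumed version shape."""
    return VERSION_PATTERN.fullmatch(version) is not None
```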
Tags
Use short, lowercase tags to categorize and cluster fragments.
Example:
Tags:
- cms
- landscape
- 2025
- pbp
Tags are flat, non-hierarchical, and non-canonical; they are intended for internal filtering and export behavior.
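Because tags are meant to be short and lowercase, a normalization pass before export can keep them consistent. This is a sketch under assumptions: `normalize_tags` is a hypothetical helper, and dropping duplicates is a design choice rather than a stated rule. Values are converted with `str()` because YAML parsers commonly load an unquoted `2025` as an integer.

```python
def normalize_tags(tags):
    """Return tags as trimmed, lowercase strings, de-duplicated in first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        cleaned = str(tag).strip().lower()  # str() handles YAML-loaded integers
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

For example, `normalize_tags(["CMS", " landscape", 2025, "cms"])` yields `["cms", "landscape", "2025"]`.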
Guidelines
When a dataset or content source is governed by regulation or editorial policy, cite the URL or document name explicitly.
Examples:
- CMS Medicare Marketing Guidelines
- HHS Data Governance Framework
- Internal Trust Publishing Editorial Policy
Note: Fragments without a declared `Guidelines:` field may be scored lower in regulated publishing domains.
Other Notes
- Use ISO 8601 format for all dates (e.g., `2025-07-28`)
- Use SHA-256 or MD5 for the `Checksum:` field when declaring raw file integrity
- The `Digest:` field must match across all fragments that belong to the same exportable set
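The three notes above can each be checked mechanically. The sketch below uses only the Python standard library. The helper names are hypothetical, and representing each fragment's provenance as a dict with a `Digest` key is an assumption about how a pipeline might hold parsed fragments.

```python
import hashlib
from datetime import date


def is_iso_date(text: str) -> bool:
    """Return True if text parses as an ISO 8601 calendar date such as 2025-07-28."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def sha256_checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def digests_match(fragments) -> bool:
    """Return True if every fragment dict declares the same Digest value.

    An empty list, or any fragment missing its Digest, counts as a mismatch.
    """
    digests = {fragment.get("Digest") for fragment in fragments}
    return len(digests) == 1 and None not in digests
```

SHA-256 is used here because MD5 is no longer considered collision-resistant. Either algorithm satisfies the note above.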