Part of the WebMEM Protocol
Location: /specification/sdt/yaml-in-html/structure/
Last Updated: 2025-07-28
Introduction to the SDT DataFragment
A DataFragment is the smallest unit of structured memory in WebMEM. Think of it as a semantic memory tile—a machine-visible statement of fact that’s scoped to a real-world entity (like a healthcare plan or record) and paired with trust-layer metadata.
DataFragments are atomic. They can be retrieved, scored, validated, and reasoned over independently by AI agents and indexing systems.
How Do You Write a DataFragment in HTML?
Start with an inert <template> element that contains YAML. Its opening tag should carry the following attributes:
<template
  data-visibility-fragment
  data-type="text/yaml"
  data-sdt-class="DataFragment"
  data-entity="plan:H5521-290-0"
  data-digest="2025-cms-ma-mapd-plan"
  data-glossary-scope="cms_landscape"
  data-fragment-scope="semantic-digest">
Inside the <template> element, add the YAML block that declares fragment metadata, trust provenance, and facts.
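To show how a consumer might locate such fragments, here is a minimal Python sketch that pulls the attributes and the raw YAML text out of a page using only the standard library's `html.parser`. The class name `FragmentExtractor` and the function `extract_fragments` are hypothetical names chosen for this illustration, and the sketch assumes fragments are not nested inside one another.

```python
from html.parser import HTMLParser


class FragmentExtractor(HTMLParser):
    """Collect attributes and inner text of <template data-visibility-fragment> tags."""

    def __init__(self):
        super().__init__()
        self.fragments = []   # list of {"attrs": dict, "yaml": str}
        self._current = None  # fragment currently being read, if any

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # A valueless attribute such as data-visibility-fragment maps to None.
        if tag == "template" and "data-visibility-fragment" in attr_map:
            self._current = {"attrs": attr_map, "yaml": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["yaml"] += data

    def handle_endtag(self, tag):
        if tag == "template" and self._current is not None:
            self._current["yaml"] = self._current["yaml"].strip()
            self.fragments.append(self._current)
            self._current = None


def extract_fragments(html):
    """Return every visibility fragment found in the given HTML string."""
    parser = FragmentExtractor()
    parser.feed(html)
    parser.close()
    return parser.fragments


page = """
<template
  data-visibility-fragment
  data-type="text/yaml"
  data-sdt-class="DataFragment"
  data-entity="plan:H5521-290-0">
entity: plan:H5521-290-0
</template>
"""

fragments = extract_fragments(page)
print(fragments[0]["attrs"]["data-entity"])
print(fragments[0]["yaml"])
```

The extracted `yaml` string would then be handed to whatever YAML parser the consumer prefers; that step is left out here to keep the sketch dependency-free.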
YAML Header: Declaring Context
The YAML block should begin with a header that mirrors the outer attributes for trust-layer indexing:
data-sdt-class: DataFragment
entity: plan:H5521-290-0
digest: 2025-cms-ma-mapd-plan
glossary_scope: cms_landscape
fragment_scope: semantic-digest
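Because the header is meant to mirror the outer attributes, a consumer may want to confirm that the two agree. The sketch below is one way to do that in Python. The attribute-to-key mapping in `ATTR_TO_HEADER` is an assumption inferred from the example values above, not a mapping stated by the specification, and `header_mismatches` is a hypothetical helper name.

```python
# Assumed mapping from outer <template> attributes to YAML header keys,
# inferred from the example; the specification may define it differently.
ATTR_TO_HEADER = {
    "data-sdt-class": "data-sdt-class",
    "data-entity": "entity",
    "data-digest": "digest",
    "data-glossary-scope": "glossary_scope",
    "data-fragment-scope": "fragment_scope",
}


def header_mismatches(attrs, header):
    """Return (attribute, attr_value, header_value) for every pair that differs."""
    mismatches = []
    for attr_name, header_key in ATTR_TO_HEADER.items():
        attr_value = attrs.get(attr_name)
        header_value = header.get(header_key)
        if attr_value != header_value:
            mismatches.append((attr_name, attr_value, header_value))
    return mismatches


attrs = {
    "data-sdt-class": "DataFragment",
    "data-entity": "plan:H5521-290-0",
    "data-digest": "2025-cms-ma-mapd-plan",
    "data-glossary-scope": "cms_landscape",
    "data-fragment-scope": "semantic-digest",
}
header = {
    "data-sdt-class": "DataFragment",
    "entity": "plan:H5521-290-0",
    "digest": "2025-cms-ma-mapd-plan",
    "glossary_scope": "cms_landscape",
    "fragment_scope": "semantic-digest",
}

print(header_mismatches(attrs, header))  # an empty list means the header mirrors the tag
```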
ProvenanceMeta Block: Declaring Source & Trust
This is the trust anchor—used for scoring, citation, and retrieval explanation:
ProvenanceMeta:
  ID: 2025-cms-ma-landscape
  Title: CMS MA Landscape File, 2025
  Description: CMS-published dataset listing Medicare Advantage plans.
  Creator: Centers for Medicare & Medicaid Services (CMS)
  Home: https://www.cms.gov/medicare-health-drug-plans-data
  License: Public Domain
  Published: 2025-06-01
  Retrieved: 2025-06-28
  Digest: 2025-cms-ma-mapd-plan
  Entity: plan:H5521-290-0
  FragmentScope: semantic-digest
  GlossaryScope: cms_landscape
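The ProvenanceMeta block repeats several identifiers from the header and carries two dates, so simple consistency checks are possible. The following Python sketch illustrates two such checks. Both rules are assumptions made for this example rather than requirements stated by the specification: that `Retrieved` should not precede `Published`, and that the keys paired in `PROVENANCE_TO_HEADER` should match. The function name `provenance_problems` is hypothetical.

```python
from datetime import date

# Assumed pairing of ProvenanceMeta keys with YAML header keys, inferred
# from the repeated values in the example; not confirmed by the specification.
PROVENANCE_TO_HEADER = {
    "Digest": "digest",
    "Entity": "entity",
    "FragmentScope": "fragment_scope",
    "GlossaryScope": "glossary_scope",
}


def provenance_problems(header, provenance):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    # str() lets this accept ISO strings as well as date objects,
    # which some YAML parsers produce for unquoted dates.
    published = date.fromisoformat(str(provenance["Published"]))
    retrieved = date.fromisoformat(str(provenance["Retrieved"]))
    if retrieved < published:
        problems.append("Retrieved is earlier than Published")
    for prov_key, header_key in PROVENANCE_TO_HEADER.items():
        if provenance.get(prov_key) != header.get(header_key):
            problems.append(f"{prov_key} does not match header key {header_key}")
    return problems


header = {
    "entity": "plan:H5521-290-0",
    "digest": "2025-cms-ma-mapd-plan",
    "glossary_scope": "cms_landscape",
    "fragment_scope": "semantic-digest",
}
provenance = {
    "ID": "2025-cms-ma-landscape",
    "Published": "2025-06-01",
    "Retrieved": "2025-06-28",
    "Digest": "2025-cms-ma-mapd-plan",
    "Entity": "plan:H5521-290-0",
    "FragmentScope": "semantic-digest",
    "GlossaryScope": "cms_landscape",
}

print(provenance_problems(header, provenance))  # an empty list means no problems found
```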
Fields Block: Expressing Facts
Each fact in a DataFragment is a Semantic Data Atom—a field-level claim that is trust-scored, glossary-aligned, and independently citeable:
Fields:
  - id: in_primary
    defined_term: Primary Care Visit
    description: Out-of-pocket cost for a PCP visit
    value: "$0"
    unit: usd
    confidence: high
    derived: false
    glossary: term-in_primary
    source: 2025-cms-pbp
    provenance_ref: "#provenance-meta"
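To illustrate what "independently citeable" can mean in practice, the Python sketch below indexes each field under the pair of its entity and field `id`, so a single fact can be looked up without touching the rest of the fragment. The key scheme and the helper name `index_atoms` are assumptions for this example. The specification may prescribe a different addressing scheme.

```python
def index_atoms(entity, fields):
    """Map (entity, field id) to the field so each fact can be looked up on its own."""
    index = {}
    for field in fields:
        key = (entity, field["id"])
        if key in index:
            # Two fields with the same id could not be cited unambiguously.
            raise ValueError(f"duplicate field id: {field['id']}")
        index[key] = field
    return index


fields = [
    {
        "id": "in_primary",
        "defined_term": "Primary Care Visit",
        "description": "Out-of-pocket cost for a PCP visit",
        "value": "$0",
        "unit": "usd",
        "confidence": "high",
        "derived": False,
        "glossary": "term-in_primary",
        "source": "2025-cms-pbp",
        "provenance_ref": "#provenance-meta",
    }
]

atoms = index_atoms("plan:H5521-290-0", fields)
atom = atoms[("plan:H5521-290-0", "in_primary")]
print(atom["defined_term"], atom["value"], atom["source"])
```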
Complex Values in Data Atoms
Most Data Atoms use simple values (strings, numbers, booleans), but more complex values are supported when semantically justified.
Supported Complex Types
| Format Type | YAML Example | Use Case |
|---|---|---|
| Array | ["H5521-290-0", "H2406-129-0"] | List of plan IDs (IndexFragment) |
| Map / Object | { "min": "$0", "max": "$50" } | Derived value ranges |
| Named Entities | [{"name": "...", "enrollment": 1234}] | Top N plans, rich metadata sets |
Examples
Plan ID List (IndexFragment):
- id: plans_indexed
  value:
    - H5521-290-0
    - H2406-129-0
    - H3931-129-0
Value Range (DerivedStatsFragment):
- id: moop_range
  value:
    min: "$2,500"
    max: "$8,300"
Top Enrolled Plans (IndexFragment):
- id: top_enrolled_plans
  value:
    - name: AARP Medicare Advantage
      enrollment: 6823
      type: PPO
    - name: Aetna Medicare Platinum
      enrollment: 3386
      type: HMO
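Once parsed, the three complex formats can be told apart by their shape. The Python sketch below shows one way to classify a parsed `value`. The labels it returns ("simple", "array", "map", "named-entities") and the function name `value_shape` are invented for this illustration and are not identifiers defined by WebMEM.

```python
def value_shape(value):
    """Classify a parsed value by shape, using labels local to this sketch."""
    if isinstance(value, dict):
        return "map"
    if isinstance(value, list):
        # A non-empty list made up entirely of maps is treated as named entities.
        if value and all(isinstance(item, dict) for item in value):
            return "named-entities"
        return "array"
    return "simple"


plans_indexed = ["H5521-290-0", "H2406-129-0", "H3931-129-0"]
moop_range = {"min": "$2,500", "max": "$8,300"}
top_enrolled_plans = [
    {"name": "AARP Medicare Advantage", "enrollment": 6823, "type": "PPO"},
    {"name": "Aetna Medicare Platinum", "enrollment": 3386, "type": "HMO"},
]

for value in ("$0", plans_indexed, moop_range, top_enrolled_plans):
    print(value_shape(value))
```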
Constraints
- Only use complex value: structures when meaningful and reproducible.
- Avoid deeply nested objects. Keep fragments explainable.
- Prefer named keys over ambiguous JSON blobs or positional lists.
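The nesting constraint can be approximated with a depth check. The Python sketch below measures how many container levels a parsed value has and compares the result with a limit. The limit `MAX_DEPTH = 2` is an assumption chosen so that the three examples above pass. The text only says to avoid deep nesting and does not give a number. The helper names are hypothetical.

```python
MAX_DEPTH = 2  # assumed threshold; the constraint above gives no explicit number


def nesting_depth(value):
    """Container nesting depth: 0 for scalars, 1 for a flat list or map."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def is_too_deep(value, limit=MAX_DEPTH):
    """True when the value nests containers more deeply than the limit allows."""
    return nesting_depth(value) > limit


flat_range = {"min": "$2,500", "max": "$8,300"}
entity_list = [{"name": "AARP Medicare Advantage", "enrollment": 6823}]
deep_blob = {"plan": {"costs": {"moop": {"min": "$2,500"}}}}

for candidate in (flat_range, entity_list, deep_blob):
    print(nesting_depth(candidate), is_too_deep(candidate))
```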
Best Practice
Include a description: and/or glossary reference when using structured values to improve interpretability.
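This practice lends itself to a small lint pass. The Python sketch below reports fields whose `value` is a list or map but that carry neither a `description` nor a `glossary` key. The function name `undocumented_structured_fields` is hypothetical, and treating an empty string as "missing" is a choice made for this example.

```python
def undocumented_structured_fields(fields):
    """Return ids of fields with a list or map value and no description or glossary."""
    missing = []
    for field in fields:
        is_structured = isinstance(field.get("value"), (list, dict))
        has_context = bool(field.get("description")) or bool(field.get("glossary"))
        if is_structured and not has_context:
            missing.append(field["id"])
    return missing


fields = [
    {"id": "in_primary", "value": "$0"},  # simple value, not checked
    {"id": "moop_range", "value": {"min": "$2,500", "max": "$8,300"}},
    {
        "id": "plans_indexed",
        "description": "Plan IDs covered by this index",
        "value": ["H5521-290-0", "H2406-129-0"],
    },
]

print(undocumented_structured_fields(fields))
```

In this sample only `moop_range` would be reported, since it has a structured value and no accompanying description or glossary reference.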
Why It Matters
- Retrievability: Each fact can be independently indexed and cited.
- Trust: All values are source-backed and traceable.
- Transparency: Glossary and provenance clarify meaning and origin.
- Modularity: Fragments can be composed into digests or reused across systems.