Part of the WebMEM Protocol
Location: /specification/sdt/yaml-in-html/fields/
Last Updated: 2025-07-28
What is a Semantic Data Atom?
A Semantic Data Atom is the smallest independently retrievable, source-backed unit of knowledge in the Semantic Data Template (SDT) system.
Each atom represents a single fact, value, or field-level assertion — typically derived from a structured dataset or glossary. These atoms are grouped inside the Fields: array of an SDT fragment, and each one may include optional metadata for trust, glossary alignment, or derivation context.
Semantic Data Atoms support:
- Fragment-level retrieval
- Export and memory indexing
- Trust scoring and provenance linkage
- Co-citation with glossary terms
Canonical Field Schema
| Field | Required | Description |
|---|---|---|
| id | ✅ | Canonical field token; must be unique per fragment and match registry if present |
| value | ✅ | The literal value associated with the field (string, number, boolean, array, or object) |
| defined_term | ❌ | Human-readable label for UI/export (e.g., “Primary Care Visit”) |
| description | ❌ | Inline explanation of the field’s meaning or calculation |
| glossary | ❌ | Canonical term ID this field aligns with (e.g., gtp:zero_premium) |
| unit | ❌ | Measurement unit (e.g., usd, percent, stars) |
| confidence | ❌ | Qualitative trust signal (high, moderate, low) |
| derived | ❌ | Boolean flag for whether the value is calculated vs sourced |
| source | ❌ | Short token identifying the dataset this atom comes from (e.g., 2025-cms-pbp) |
| provenance_ref | ❌ | Internal link to the trust anchor, typically #provenance-meta |
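The schema above can be checked mechanically once a fragment has been parsed. The following is a minimal sketch, assuming each atom arrives as a plain Python dict; the function name `validate_atom` and the message strings are hypothetical and not part of the WebMEM specification. It covers only the rules stated in the table: the two required keys, the three confidence levels, and the boolean type of derived.

```python
# Hypothetical validator sketch; names and messages are illustrative only.
REQUIRED_KEYS = {"id", "value"}
CONFIDENCE_LEVELS = {"high", "moderate", "low"}


def validate_atom(atom: dict) -> list[str]:
    """Return the problems found in one atom; an empty list means none were found."""
    problems = []
    # Report required keys that are absent, in a stable order.
    for key in sorted(REQUIRED_KEYS - set(atom)):
        problems.append(f"missing required key: {key}")
    # confidence is optional, but if present it must be one of the listed levels.
    if "confidence" in atom and atom["confidence"] not in CONFIDENCE_LEVELS:
        problems.append("confidence must be high, moderate, or low")
    # derived is optional, but if present it must be a real boolean.
    if "derived" in atom and not isinstance(atom["derived"], bool):
        problems.append("derived must be a boolean")
    return problems
```

An atom that passes returns an empty list, so a caller can treat any non-empty result as a reason to reject or flag the fragment.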
Example Field Block
Fields:
  - id: in_primary
    defined_term: Primary Care Visit
    description: Out-of-pocket cost for a PCP visit
    value: "$0"
    unit: usd
    confidence: high
    derived: false
    glossary: term-in_primary
    source: 2025-cms-pbp
    provenance_ref: "#provenance-meta"
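Because id must be unique per fragment, a parsed Fields array can be turned into a lookup table keyed by id. The sketch below assumes the example block has already been loaded into Python dicts by a YAML parser; `index_fields` is a hypothetical helper name, and raising on a duplicate is one possible policy rather than behavior the specification prescribes.

```python
def index_fields(fields: list[dict]) -> dict[str, dict]:
    """Map each atom's id to the atom, raising if an id repeats within the fragment."""
    index = {}
    for atom in fields:
        atom_id = atom["id"]
        if atom_id in index:
            raise ValueError(f"duplicate atom id in fragment: {atom_id}")
        index[atom_id] = atom
    return index


# The example field block, as it would look after parsing.
fields = [
    {
        "id": "in_primary",
        "defined_term": "Primary Care Visit",
        "description": "Out-of-pocket cost for a PCP visit",
        "value": "$0",
        "unit": "usd",
        "confidence": "high",
        "derived": False,
        "glossary": "term-in_primary",
        "source": "2025-cms-pbp",
        "provenance_ref": "#provenance-meta",
    }
]

index = index_fields(fields)
print(index["in_primary"]["value"])  # prints $0
```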
Value Type Rules
The value: field may contain the following types:
| Type | Usage |
|---|---|
| string | Most common: text, currency, plain values |
| number | Quantitative values: stars, percentages, dollar amounts |
| boolean | True/false assertions (e.g., zero premium = true) |
| array | Lists (e.g., plan IDs, co-pays) |
| object | Structured values (e.g., min/max ranges, score breakdowns) |
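To illustrate how the five types in the table might be told apart after parsing, here is a small sketch. It assumes a YAML parser that maps scalars and collections to native Python types (str, int or float, bool, list, dict); `value_type` is a hypothetical name, and rejecting any other type, including null, is an assumption of this sketch.

```python
def value_type(value) -> str:
    """Classify a parsed value: into one of the five type names from the table."""
    # bool must be tested before number because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```

Note that a quoted currency amount such as "$0" classifies as a string, while an unquoted 4.5 classifies as a number, so quoting in the YAML source determines the result.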
Example: Value Map
- id: moop_range
  value:
    min: "$2,500"
    max: "$8,300"
Example: Score Distribution
- id: star_rating_distribution
  value:
    "5.0": 2
    "4.5": 5
    "4.0": 8
Best Practices
- Always include provenance_ref: to ground each atom to a declared source
- Use glossary: to align with a DefinedTermFragment or glossary page
- Use confidence: and derived: for transparency in calculated or interpreted data
- Keep arrays flat; avoid deeply nested or ambiguous YAML structures
- If using a complex value:, add description: to explain structure
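Some of these practices lend themselves to an automated check. The following is a rough sketch of a linter for a single parsed atom, assuming the atom is a Python dict; `lint_atom` and its warning strings are hypothetical. It treats any list or dict as a "complex" value and any array containing a list or dict as "not flat", which is one reading of the guidance rather than a definition from the specification.

```python
def lint_atom(atom: dict) -> list[str]:
    """Return best-practice warnings for one atom; an empty list means none apply."""
    warnings = []
    # Every atom should point at a trust anchor.
    if "provenance_ref" not in atom:
        warnings.append("missing provenance_ref")
    value = atom.get("value")
    # Complex values (arrays or objects) should be explained by a description.
    if isinstance(value, (list, dict)) and "description" not in atom:
        warnings.append("complex value without description")
    # Arrays should be flat: no nested lists or objects as items.
    if isinstance(value, list) and any(isinstance(item, (list, dict)) for item in value):
        warnings.append("array value is not flat")
    return warnings
```

Run against the moop_range example as written, this sketch would report both a missing provenance_ref and a complex value without description.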