Semantic Data Atoms: Fields Block Specification

Part of the WebMEM Protocol
Location: /specification/sdt/yaml-in-html/fields/
Last Updated: 2025-07-28

What is a Semantic Data Atom?

A Semantic Data Atom is the smallest independently retrievable, source-backed unit of knowledge in the Semantic Data Template (SDT) system.

Each atom represents a single fact, value, or field-level assertion — typically derived from a structured dataset or glossary. These atoms are grouped inside the Fields: array of an SDT fragment, and each one may include optional metadata for trust, glossary alignment, or derivation context.

Semantic Data Atoms support:

Fragment-level retrieval
Export and memory indexing
Trust scoring and provenance linkage
Co-citation with glossary terms

Canonical Field Schema

Field	Required	Description
`id`	✅	Canonical field token; must be unique per fragment and match registry if present
`value`	✅	The literal value associated with the field (string, number, boolean, array, or object)
`defined_term`	❌	Human-readable label for UI/export (e.g., “Primary Care Visit”)
`description`	❌	Inline explanation of the field’s meaning or calculation
`glossary`	❌	Canonical term ID this field aligns with (e.g., `gtp:zero_premium`)
`unit`	❌	Measurement unit (e.g., `usd`, `percent`, `stars`)
`confidence`	❌	Qualitative trust signal (`high`, `moderate`, `low`)
`derived`	❌	Boolean flag for whether the value is calculated vs sourced
`source`	❌	Short token identifying the dataset this atom comes from (e.g., `2025-cms-pbp`)
`provenance_ref`	❌	Internal link to the trust anchor, typically `#provenance-meta`

Example Field Block

Fields:
  - id: in_primary
    defined_term: Primary Care Visit
    description: Out-of-pocket cost for a PCP visit
    value: "$0"
    unit: usd
    confidence: high
    derived: false
    glossary: term-in_primary
    source: 2025-cms-pbp
    provenance_ref: "#provenance-meta"

Value Type Rules

The value: field may contain the following types:

Type	Usage
`string`	Most common: text, currency, plain values
`number`	Quantitative values: stars, percentages, dollar amounts
`boolean`	True/false assertions (e.g., zero premium = true)
`array`	Lists (e.g., plan IDs, co-pays)
`object`	Structured values (e.g., min/max ranges, score breakdowns)

Example: Value Map

- id: moop_range
  value:
    min: "$2,500"
    max: "$8,300"

Example: Score Distribution

- id: star_rating_distribution
  value:
    "5.0": 2
    "4.5": 5
    "4.0": 8

Best Practices

Always include provenance_ref: to ground each atom to a declared source
Use glossary: to align with a DefinedTermFragment or glossary page
Use confidence: and derived: for transparency in calculated or interpreted data
Keep arrays flat; avoid deeply nested or ambiguous YAML structures
If using a complex value:, add description: to explain structure