← Default view
SPEC · OBSV-FRONTMATTER · §E.49 · ED. 2026.04

Frontmatter & tagging schemas
for extraction in Obsidian

status   STABLE
edition   2026-04-30
target   Obsidian ≥ 1.10.0
conformance   Bases · Dataview · Zod · Pydantic · Yamale
depthexpedition
citations69
read11 min
formatmd → bases
layercontract
Design for extraction first, presentation second. Bases reads only YAML — inline key:: value Dataview fields are silently invisible[5]. Use a type discriminator, flat keys (Properties has no nested objects[8]), reusable property names, and reserve hashtags for state — not topic[17]. New schemas in 2026 should target Bases. Dataview has stagnated[9].
§ 01

The reference schema

canonical / normative
YAML · book.note.frontmatter
---
type: book                          # discriminator (Bases filter axis)
title: "Foundation"
author: "Isaac Asimov"               # reusable across `type`s
created: 2026-04-30                  # Date
updated: 2026-04-30T09:00            # Date & Time
status: active                       # Text → enum
tags: [scifi, classic]                # List · plural mandatory ≥1.9
aliases: ["Asimov · Foundation"]      # reserved · list-only
related: ["[[Foundation and Empire]]"]  # quoted wikilink
source: https://en.wikipedia.org/wiki/Foundation_(novel)
rating: 9                             # Number
done: false                            # Checkbox
series_name: "Foundation"             # flat keys (no nesting)
series_num: 1
publish: false                         # selective-publish gate
---

# note body — inline state tags below the YAML
#status/active #priority/p2
typeDiscriminator. One Base per type; one template per type. The kepano default[13].
created/
updated
Date / DateTime. Bases exposes today(), now(), file.ctime[41]; Dataview exposes .year, .month[40].
statusText type, but treat as enum: validate with Zod .enum([...]), Pydantic str-Enum, Yamale enum().
tagsReserved, list-typed since 1.9[2]. Singular tag: no longer functions.
relatedWikilinks must be quoted in YAML — bare [[X]] parses as a nested list[40].
series_*Flat keys; Properties has no nested objects[8]. Namespace-prefix per type to avoid collisions.
#status/*Hashtags for state & lifecycle, never for topic. FROM #status pulls all children[20].
§ 02

Field reference table

type · cardinality · default · validator
FieldTypeCard.DefaultNotesCite
type Text (enum) required Single mandatory discriminator. One Base view filters per type. Reusable across categories. [5][13]
title Text required file.basename Free string. Useful when filename is sluggified or contains an ID prefix.
created Date required $today ISO YYYY-MM-DD. Auto-populate via Templater or Frontmatter Generator. [40][41]
updated DateTime optional $now ISO YYYY-MM-DDTHH:mm. Use file.mtime in Bases formulas if absent. [41]
status Text (enum) required "active" Lifecycle: active · archived · done. Validated as enum in all three lanes. [35][42][38]
tags List<Text> optional [] Reserved. Plural form mandatory since 1.9. lowercase + singular per item. [2][58]
aliases List<Text> optional [] Reserved. Used by Wikilink autocomplete. List-only since 1.9. [11]
cssclasses List<Text> optional [] Reserved. Per-note CSS hooks. List-only since 1.9. [4]
related List<Link> optional [] Quote wikilinks: "[[Page Name]]". Bases auto-detects; Dataview type-infers. [40][41]
rating Number optional null Integers and floats. Use Bases comparisons (, between) for ranking views. [6]
done Checkbox optional false Boolean. Backed by Properties Checkbox type. [4]
source URL optional Plain string. No native URL type — validate via Zod .url() / Pydantic HttpUrl.
publish Checkbox optional false Selective-publish gate (Quartz ExplicitPublish, opt-in pattern).
{type}_* any optional Type-namespaced flat keys: book_status, video_status, series_name. snake_case. [8][13]
§ 03

Native type system (Properties · Bases · Dataview)

primitive · example · query surface
Property typeYAMLBases surfaceDataview surface
Text status: active String filters, contains, matches Text; substring & regex matches
List tags: [a, b] Multi-value; contains-any List; .length, .contains()
Number rating: 8 Numeric ops = > < between Number; arithmetic in inline DQL
Checkbox done: true Boolean filter Boolean
Date due: 2026-05-01 today(), date("…"), range Date; .year, .month, .day
Date & Time published: 2026-04-30T09:00 now(), file.mtime Date; full YYYY-MM-DDTHH:mm:ss.nnn+ZZ
Link (reserved) related: "[[Page]]" Auto-detects wikilinks; link("…") Link; .path, .display
tags / aliases / cssclasses (reserved · list-only) Plural form mandatory since 1.9 — singular silently no-ops [2][11].

Constraint. Once a key is typed in any note, that type is vault-wide[4]. Renaming or retyping a key after the fact is the most-cited operational pain point — see § 07.

§ 04

Taxonomy patterns that survive at scale

tested in vaults > 5 years
PAT · 4A
The type discriminator
One mandatory property labels each note's kind (book, person, project, recipe). Each Base filters on it; each type gets its own template. Steph Ango's vault: 50+ category-specific templates with reusable property names[13][23].
type: book  |  type: person  |  type: recipe
PAT · 4B
Flat keys, namespaced names
Properties has no nested YAML. Instead of series.name, write series_name. Namespacing per-type (book_status vs video_status) prevents type collisions across note kinds[8].
series: { name, num }    series_name, series_num
PAT · 4C
Tags = state. MOCs/links = topic.
Reserve hashtags for cross-cutting, transient signal — lifecycle, status, priority. Topic structure goes in MOCs/links, navigation in shallow folders[17][16].
#inbox   #seedling   #evergreen   #status/active   #priority/p1
PAT · 4D
Hierarchical / namespaced tags
#project/active, #type/article — Dataview can FROM #project and pull every child; Bases can tags contains "project/"[20]. The formal version is a Library-of-Congress controlled vocabulary[64].
#project/active   #project/soon   #project/done
ApproachStrengthWeaknessExtraction-friendly
Folder-only PARA Familiar; matches productivity literature Projects/Resources collapse[19] ✗ folder is a string
Johnny.Decimal Stable addresses; pairs well with hierarchical tags[14] Hard ceiling 100 IDs (10×10)[15] ~ numeric prefix parseable, rigid
MOC (LYT) Spawn at 5+ notes on a topic[18] Discoverability depends on link discipline ~ implicit unless mirrored in a property
Hierarchical state tags Cross-cutting; queryable parent/child[20] Casing/separator hygiene[21] ✓ FROM #project pulls children
type discriminator + flat YAML One Base per type[5]; the 2026 default Up-front template work ✓ best for Bases
§ 05

Validation lanes — three implementations of the same contract

JS · Python · pipeline
JS · TS
Zod / Astro
import { defineCollection, reference, z } from 'astro:content';

const notes = defineCollection({
  schema: z.object({
    title:   z.string(),
    created: z.coerce.date(),
    tags:    z.array(z.string()),
    status:  z.enum(['active', 'archived', 'done']),
    author:  reference('authors'),
  }),
});

z.coerce.date() converts ISO → JS Date. reference() cross-validates wikilinks. Eleventy v3 mirrors via eleventyDataSchema[36]. astro-editor ⭐ 445 turns the same schema into a typed form[43].

PY
Pydantic + Yamale
⭐ 764 (Yamale)
from enum import Enum
from pydantic import BaseModel
from pydantic_yaml import parse_yaml_raw_as

class Status(str, Enum):
    active   = "active"
    archived = "archived"
    done     = "done"

class Note(BaseModel):
    title:  str
    status: Status = Status.active
    tags:   list[str] = []

pydantic_yaml round-trips via parse_yaml_raw_as()/to_yaml_str()[42]. Lighter alternative: Yamale ⭐ 764 ships str(), int(min,max), day(), enum(), regex(), include()[38].

PIPELINE
remark-lint + mdschema
⭐ 73 / 62
# mdschema · ./schema.yaml
- { name: "title",  type: string }
- { name: "created",type: date,
   format: date }
- { name: "author", optional: true,
   format: email }
- { name: "tags",   optional: true,
   type: array }

remark-lint-frontmatter-schema validates against JSON Schema in a remark pipeline; supports per-file $schema, IDE underlining, auto-fix[37]. Stack with markdownlint-cli2 via its frontMatter regex skip[44].

§ 06

Programmatic extraction tools

parser · obsidian-aware · gotcha
ToolLangStarsNotes
Generic frontmatter parsers
gray-matter JS ⭐ 4.4k De-facto JS parser; YAML/JSON/TOML/Coffee + custom engines; powers Astro, Gatsby, VitePress, TinaCMS[25]. Recent: js-yaml prototype-pollution (#178, Nov 2025); ESM/Vite SSR is not a function (#181, Feb 2026); invalid frontmatter mis-cached (#166, #174)[26].
python-frontmatter Py ⭐ 412 Pluggable handlers (YAML/JSON/TOML); v1.1.0 (Jan 2024). BOM gotcha — open files with utf-8-sig[24].
remark-frontmatter JS ⭐ 320 v5.0.0. Recognises the fence — does not parse the YAML body. Compose with vfile-matter or remark-mdx-frontmatter[27].
gray_matter (Rust) Rs Rust port of jonschlinkert's; YAML/JSON/TOML + custom engines[33].
fronma Rs ⭐ 8 Typed; YAML default; TOML/JSON behind cargo features for build-size control[34].
Obsidian-aware extractors
obsidiantools Py ⭐ 553 v0.11.0 (Jul 2025); wraps python-frontmatter; vault.get_front_matter, vault.get_tags, NetworkX graph; tag extractor strips code blocks & escaped \#; falls back to {} on invalid YAML[28][29].
obsidian-export Rs ⭐ 1.3k v25.3.0 (Mar 2025); --frontmatter=always|never; --skip-tags/--only-tags. errors on mutual [[A]]<->[[B]] embeds unless --no-recursive-embeds[30].
py-obsidianmd / pyomd Py ⭐ 312 Only library that round-trips between YAML and Dataview inline key:: value. no tagged release; explicit "back up your vault" warning[31].
The inline-fields trap. Dataview defines two metadata channels: YAML and inline key:: value[32]. A frontmatter-only extractor will silently miss the inline class. If your vault uses inline fields, use obsidiantools / pyomd or run a one-time hoist into YAML.
§ 07

Failure modes & remedies

observed in production vaults
FM-01
Tag explosion
Tagging every noun (#glue, #wood, #saw) → noise > signal.
FIX: 3–5 high-level tags per note; lowercase + singular; lean on autocomplete[58].
FM-02
Granularity ambiguity
Should this be #dog, #dogs, #pets, #mammals?
FIX: controlled vocabulary (Library-of-Congress style) + autocomplete enforcement[61][64].
FM-03
Frontmatter sprawl
200+ stale property fields from imports/plugins; no native bulk cleanup[62].
FIX: ripgrep + sd or external Python script; multi-page feature request still open[68][69].
FM-04
Metadata rot from platform
2023 Properties rollout broke existing tag-bearing frontmatter[66]; phantom tag bloat[67].
FIX: pin Obsidian version before bulk migrations; keep schema in version control.
FM-05
Breaking renames
Tag Wrangler renames are irreversible; can silently merge tags; partial-fail under sync[63].
FIX: backup vault before every rename; pause sync; spot-check after.
FM-06
Bloat-then-restart cycle
Recurring "I deleted everything and started over" posts[59].
FIX: drop influencer-driven elaborate frameworks; minimal daily journal + search outperforms[60].
FM-07
Tag dual-location collisions
Same concept tagged in YAML and inline #hashtag; Bases sees only YAML[5]; Dataview sees both[32].
FIX: pick one channel per tag-purpose; if dual, hoist inline → YAML via pyomd or a script[31].
FM-08
Vault-wide type lock-in
Once a key is typed in any note, the type is global[4]. Retyping after-the-fact is painful.
FIX: design the schema before mass adoption; namespaces (per type) prevent collisions.
§ 08

AI-assisted tagging — three converging patterns

2025 → 2026
PAT · 8A
In-vault plugins → write YAML directly
AI Tagger Universe ⭐ 84 (15+ providers; hybrid mode against existing tags; configurable depth)[45]; Auto Tag ⭐ 66 (OpenAI; whole note or selection)[56]; Metadata Auto Classifier (OpenAI; on-demand)[55]; Frontmatter Generator ⭐ 46 (deterministic JSON/JS templates underneath the LLM layer)[54].
PAT · 8B
Retrieval-augmented taxonomy DOMINANT
Constrain the LLM against an existing taxonomy. Documented 2025 workflow: Gemma 3 12B via LM Studio + AI Tagger Universe + a hand-curated Tag List note in the vault root + Auto Note Mover[46]. The canonical script: Karan Sharma's Claude-API walker that prefers existing fields over generated ones[51].
PAT · 8C
MCP-mediated agents
MarkusPfundstein/mcp-obsidian ⭐ 3.5k exposes patch_content for targeted frontmatter edits[53]; MCPVault provides AST-aware YAML preservation + list_all_tags[52]; AI Knowledge Filler (Feb 2026) — system prompt that turns Claude/GPT into a deterministic vault-ready file generator[50].
PAT · 8D
Embeddings-first plugins (read-only on schema)
Smart Connections ⭐ 4.9k (embeddings; bulk metadata in Connect Pro only)[47]; Obsidian Copilot ⭐ 6.8k (chat-with-vault; YAML is read-context only — open issue #1471 requests tag-aware QA indexing)[48][49].
§ 09

Migration order — for an existing vault

execute top → bottom
Write a master-schema note declaring every Property name + type. Version-control it[15-meta].
Pin Obsidian version before any bulk migration; tag-bearing frontmatter has been broken before[66].
Pick one channel per tag-purpose. If currently dual (YAML + inline), run a one-time hoist with pyomd[31].
Stand up Frontmatter Generator with deterministic JSON/JS templates as the validation floor[54].
Run AI Tagger Universe against a hand-curated Tag List note; hybrid mode prefers existing tags before suggesting new[45][46].
Build Bases views per type. Filter on the discriminator first; further refine with tags contains + numeric/date ops[6].
Wire CI validation via remark-lint-frontmatter-schema (JS) or Yamale (Py)[37][38].
Migrate ad-hoc tags into typed properties only once their usage pattern is stable[65].
§ 10

Conformance

the schema is the contract
queryableBases ≥ 1.10 queryableDataview validatesZod / Astro validatesPydantic validatesYamale extractsgray-matter extractspython-frontmatter extractsobsidiantools caveatno nested YAML caveatvault-wide type lock
Spec sheet · OBSV-FRONTMATTER · Edition 2026.04 · 69 cited claims · all factual lines carry [n] footnotes inline. Render: canonical (default) · atlas.
Conformance · Bases · Zod · Pydantic · Yamale