Semantic Protocol for Human and Earth Relational Environments

SPHERE is the organising protocol behind this site. It structures how heterogeneous datasets—registries that serve many domains at once—are decomposed into discoverable, purpose-specific units that both humans and AI agents can navigate.

The Problem

A registry like CVR contains 27 object classes and 361 attributes. It serves questions about industry classification, spatial footprint, ownership structure, financial health, and more. No single description captures all of these purposes. An agent asking “where are Copenhagen’s restaurants?” needs different metadata than one asking “who owns this company?”—yet both draw from the same registry.

Flat dataset catalogs force a choice: either one monolithic description per registry (too broad) or one page per entity (too low-level). Neither matches how questions are actually asked.

The Tree-Crown Metaphor

SPHERE models data as a tree crown:

ElementRoleExample
Branch (Sphere)Thematic domain—a cluster of related phenomenaSocio-Technical, Hydrosphere, Biosphere
Twig (Subsphere)Narrower domain within a sphereGovernance, Infrastructure, Socioeconomic
LeafA semantic aspect of a dataset—answers one kind of question”Business Industry Classification”
ThreadCross-domain connection where a leaf is relevant to multiple spheresIndustry codes link Socio-Technical to Biosphere (environmental impact)

The current protocol uses 7 branches: Atmosphere, Biosphere, Hydrosphere, Lithosphere, Pedosphere, Socio-Technical, and Toposphere.

How it works

  1. Spheres group phenomena by theme (→ SPHERE)
  2. Leaves decompose registries into purpose-specific semantic aspects (→ Leaves)
  3. Each leaf points down to Dataset documentation (→ Datasets by Collection) and Services (GraphQL, WFS, file download)
  4. Threads tag leaves that cross sphere boundaries, enabling lateral discovery

The Architecture Chain

Leaf (semantic aspect, answers a question type)
  → Dataset (registry info, license, temporal model)
    → Service (GraphQL endpoint, WFS, file download, API key)

An agent navigating SPHERE:

  1. Identifies the sphere matching its question domain
  2. Browses leaves within that sphere
  3. Follows the leaf’s pointers to dataset documentation for join logic and entity details
  4. Uses service references to access the actual data

Leaf Schema

Every leaf page uses this frontmatter structure:

---
title: <human-readable leaf name>
type: leaf
draft: false
sphere: <primary sphere>
subsphere: <subsphere within the primary sphere>
collection: <source data collection/registry>
concept: <semantic concept from the collection's semantic organisation>
question: <what kind of question does this leaf help answer?>
entities:
 
  - <source entity 1>
  - <source entity 2>
 
key-attributes:
 
  - <most important attributes for this aspect>
 
threads:
 
  - <cross-domain thread ids this leaf connects through>
 
tags:
 
  - collection/<collection-slug>
 
---

Threads: Cross-Domain Connections

Many phenomena sit at the boundary between spheres. Industry codes in CVR classify economic activity (Socio-Technical) but also signal environmental impact (Biosphere) and thematic human partitioning of space (Socio-Technical Perception & Thematics). Rather than duplicating leaves, SPHERE uses threads—explicit cross-domain relations that connect leaves to other twigs.

An agent exploring the Biosphere can discover CVR industry classification through thread traversal, without CVR needing to “belong” to Biosphere.

For the current graph-first conventions with multi-twig leaves and explicit threads, see SPHERE Thread Conventions.

Backward Compatibility

The existing 338+ pages in Datasets by Collection and SPHERE remain unchanged. They form the Dataset tier of the architecture chain. Leaves add a new semantic layer on top, linking down to those pages. Nothing is removed or reorganised.

Classical Metadata Standards

Three widely-used geospatial classification systems — ISO 19115, INSPIRE, and UN-GGIM — define flat lists of topic categories. SPHERE maps all 67 classical themes into the tree hierarchy, providing a Rosetta Stone between legacy catalog keywords and the sphere/subsphere/thread structure. This enables systematic harvesting from existing metadata servers.

See Classical Classifications → SPHERE Mapping for the full concordance.

Starting pointPath
I have a question about a domainSPHERE → find sphere → browse leaves
I know the registryDatasets by Collection → registry page → follow leaf backlinks
I want to browse all semantic aspects|All Leaves
I have a classical metadata keyword (ISO/INSPIRE/UN-GGIM)|Classical Classifications → find theme → follow to sphere
I want the protocol definitionYou are here

Status

SPHERE is in pilot phase. The first leaves cover CVR as a proof of concept, decomposing it into four semantic aspects: industry classification, spatial footprint, ownership structure, and financial profile.

Machine-Readable Discovery

For programmatic agent access, the full sphere/leaf/thread graph is available as a JSON index:

This file contains all spheres, leaves with their entities and key attributes, service pointers, and thread cross-references. Agents can fetch this single file to discover everything available.