Content from Introduction to Salmon Knowledge Modelling



Overview

Questions

  • What are controlled vocabularies and why are they important for data interoperability?

Objectives

This is a Carpentry-style, hands-on workshop. Each module builds on your own data and progresses from discovering terms you already use → documenting them clearly → aligning them with others.

Introduction


This workshop helps participants collaboratively develop, document, and align controlled vocabularies to improve data interoperability in salmon research and management. It emphasizes practical, community-centered steps that support long-term reusability and transparency, while remaining adaptable to other organizations or domains.

Why Controlled Vocabularies Matter

Inconsistent terminology prevents data integration and makes shared understanding difficult across agencies, researchers, and Indigenous knowledge systems. Controlled vocabularies address this by:

  • Capturing and standardizing the meaning of key terms

  • Enabling clear documentation and communication

  • Forming the foundation for ontologies and semantic integration

Key Points

By the end of the first three modules, participants will have:

  • Discovered and reused existing terms and URIs.
  • Created clear definitions and documentation for local data.
  • Built a mapping table connecting their terms to others’.

Content from Reusing Terms — Search and Integrate Existing Vocabularies



Overview

Questions

  • Are the terms I need already defined somewhere else?
  • How can I responsibly reuse existing terms and URIs?
  • What are the benefits of aligning early rather than reinventing?

Objectives

  • Learn how to discover and evaluate existing vocabularies relevant to your domain (e.g., Darwin Core, WoRMS, OBO ontologies).
  • Understand how to reuse URIs and integrate external definitions into your own data dictionary.
  • Practice linking your data elements to authoritative terms where appropriate.

Introduction


Every dataset — whether from your lab, your agency, or another research group — uses terms to describe its contents. Column headers, variable names, and codes all hold meaning, but often those meanings are assumed rather than shared.

When everyone invents their own terms for the same concept (e.g., SmoltCond, ConditionFactor, CF), it becomes difficult to integrate or compare data across projects.

Reusing existing terms — with clear definitions and persistent identifiers (URIs) — makes your data:

  • Easier to share and integrate

  • More interoperable and transparent

  • Aligned with others in your community

  • Future-proof for modeling and ontology building

This session helps you learn where to find existing vocabularies, how to decide what to reuse, and how to incorporate those terms into your own data dictionary.

Callout

🧩 Core Ideas

Term reuse means adopting existing, well-defined concepts instead of inventing new ones.

Each reused term has a URI (Uniform Resource Identifier) that makes it globally recognizable.

Reusing does not mean losing your local context — you can still describe how your project uses a term, while referencing a shared definition.

This is a key first step in making your data “semantic” — meaning it can be understood by both humans and machines.

Discussion

Challenge 1: Find and reuse (30 min)

Goal: Identify existing vocabulary terms that match your own dataset.

Steps:

  1. Select 3–5 column names from your dataset.

  2. Search for equivalent terms in one or more repositories.

  3. Record matches in the Data Dictionary Template:

  • Your local term
  • External URI
  • Source vocabulary name
  • Notes on whether it’s an exact or close match
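
For example, one completed row might look like this (the local column name is hypothetical; the Darwin Core URI is a real, published term):

  • Your local term: CaptureDate
  • External URI: http://rs.tdwg.org/dwc/terms/eventDate
  • Source vocabulary: Darwin Core
  • Notes: close match; dwc:eventDate covers the date of any sampling event, not only physical capture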

Expected outcomes:

  • An updated data dictionary with at least three reused terms and their URIs.
  • Learners understand how to find, evaluate, and record external vocabularies.

Key Points
  • Controlled vocabularies capture shared meaning of terms.
  • Reusing existing URIs improves interoperability and credibility.
  • Reuse saves time, avoids duplication, and makes future integration easier.

Content from Documenting Terms — Write Clear, Useful Definitions



Overview

Questions

  • How can I make sure others understand and correctly use my terms?
  • What makes a good definition or label?
  • How should I record units, examples, and relationships between terms?

Objectives

  • Extract and describe terms from your dataset.
  • Write unambiguous, well-structured definitions.
  • Record associated metadata (units, codes, examples).

Introduction


You’ve identified the key terms used in your datasets — and maybe even found some existing ones to reuse. Now comes the part that makes your work understandable, trustworthy, and reusable: clear documentation.

Inconsistent or missing definitions are one of the biggest barriers to data reuse. For example:

  • What does “sample date” really mean — collection date, processing date, or submission date?
  • Does “juvenile” refer to an age class, a length range, or a life stage?
  • What are the units? Are they consistent across datasets?

This session will help you document your terms precisely, so anyone — whether a collaborator, data manager, or future researcher — can understand exactly what you meant.

Callout

🧩 Core Ideas

Documentation is data. It’s the layer that makes data understandable and reusable.

A well-documented term includes:

  • Preferred label: the human-readable name.
  • Definition: what the term means and how it’s used.
  • Units or scale: how it’s measured.
  • Example values: what typical data look like.

  • Notes: clarifications, special cases, or links to other terms.

Think of your data dictionary as a user manual for your dataset.

Example

| Term | Definition | Units | Example | Notes |
|------|------------|-------|---------|-------|
| Condition factor | A measure of fish body condition, typically calculated as weight/length³. | dimensionless | 1.05 | Used as an indicator of energy reserves at smolt stage. |
| Smolt age | The age (in years) of a salmon when it migrates from freshwater to the ocean. | years | 2 | Derived from scale analysis. |
| Capture date | The date when a specimen was physically collected in the field. | ISO 8601 (YYYY-MM-DD) | 2023-05-14 | Not to be confused with processing or tagging date. |

The more clearly you describe your terms now, the easier it becomes to share, integrate, and align your data later — especially when mapping to vocabularies or building ontologies.

Discussion

Challenge 1: Extract and define (40 min)

Goal: Create clear, consistent documentation for your own dataset terms.

Review your dataset and list 10–15 column names. Record in a shared data dictionary template (CSV):

  • Label (term name)

  • Definition (clear, context-rich description)

  • Units or codes used

  • Example value(s)

  • Notes on ambiguity or uncertainty

Expected outcomes:

  • A draft data dictionary covering at least 10 key terms.
  • Peer-reviewed feedback on definition clarity.
  • Improved awareness of semantic gaps in existing data.

Key Points
  • A data dictionary is the bridge between raw data and understanding.
  • Good definitions reduce misinterpretation and support machine processing.
  • Documentation is both a social and technical task.

Content from Concept Decomposition



Overview

Questions

  • What are the components that make up a concept?
  • How do I tell when two terms are the same, related, or overlapping?
  • What patterns or relationships exist among my documented terms?
  • How can I show these relationships clearly?

Objectives

  • Decompose complex concepts into simpler, more explicit parts.
  • Identify relationships (e.g., broader, narrower, related) among terms.
  • Use visual mapping to show how concepts connect.
  • Prepare a set of refined concepts that can be formalized in a schema.

Introduction


Now that your terms are well-defined and documented, the next step is to look beneath the surface — to unpack how those terms relate to one another.

This process, called concept decomposition, helps you:

  • See what each concept really means.
  • Identify overlaps or hidden distinctions between terms.
  • Prepare for formal modeling (where meaning becomes machine-readable).

For example:
The term “juvenile salmon” might seem simple — until you realize it includes age, size, and life stage. By decomposing it into parts (“life stage: juvenile”, “species: salmon”, “habitat: freshwater”), you make the meaning explicit and ready for alignment with other datasets or vocabularies.

Callout

🧩 Core Ideas

  • Concept decomposition means breaking a term down into its essential pieces of meaning.

  • It helps you move from words → structure.

  • Relationships matter: knowing how one term connects to another is as important as defining it.

  • Visualizing your terms helps spot patterns and inconsistencies.

Example

| Term | Broader Concept | Narrower Concept | Related Concept |
|------|-----------------|------------------|-----------------|
| Juvenile salmon | Salmon | Parr | Smolt |
| Smolt | Juvenile salmon | | Ocean migrant |
| Spawning habitat | Habitat | | Redd site |

From this, we can see that Smolt is a narrower stage within Juvenile salmon, and that Spawning habitat relates to but is distinct from Redd site — these are building blocks for the next module, where we’ll start expressing these ideas formally.
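
As a preview of that formal step, the first row of the table could eventually be written as SKOS-style statements in Turtle. This is only a sketch: the ex: identifiers are placeholders, not published terms.

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/salmon/> .

ex:JuvenileSalmon a skos:Concept ;
    skos:broader  ex:Salmon ;
    skos:narrower ex:Parr ;
    skos:related  ex:Smolt .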

Discussion

Challenge 1: Build a Mapping Table (40 min)

  1. Pick 3–5 documented terms from your Module 2 work.

  2. Break each term down into its essential pieces of meaning.

  3. Identify any broader/narrower/related concepts.

  4. Sketch a mini concept map (e.g., on whiteboard, MS Paint, or sticky notes).

Key Points
  • Relationships reveal meaning.
  • Decomposing terms uncovers hidden assumptions.
  • Mapping across datasets helps identify where vocabularies can be aligned.
  • Concept decomposition prepares you for formalization in SKOS and ontology modeling (coming next!).

Content from From Concepts to Semantics — Introducing SKOS



Overview

Questions

  • How do we move from lists of terms and definitions to formal, machine-readable vocabularies?
  • What does it mean to give a term a URI and define its relationships to others?
  • How can SKOS help represent our concepts and mappings in a structured, shareable way?
  • How do hierarchical relationships (“broader”, “narrower”, “related”) clarify meaning and enable interoperability?

Objectives

  • Explain the purpose of SKOS in representing controlled vocabularies.
  • Map existing terms and definitions from a data dictionary into basic SKOS structure (Concept, prefLabel, definition, broader, narrower, related).
  • Understand how SKOS differs from an ontology but connects to it (conceptual bridge).
  • Create a simple schema diagram showing relationships among terms, using SKOS-like semantics.

Introduction


Learners have already identified and documented terms (Modules 1–3), and developed competency questions (Module 4). This module introduces semantic structure: how to move from “terms and mappings” to “concepts and relationships” that can be shared, reused, and machine-readable.

This is the first dip into ontology thinking, using SKOS because it’s lightweight, visual, and flexible.

SKOS (Simple Knowledge Organization System) provides a lightweight, flexible way to express controlled vocabularies and their relationships using Semantic Web standards.

| SKOS Term | Meaning | Example (Salmon Context) |
|-----------|---------|--------------------------|
| skos:Concept | A unique concept or term | “Smolt condition factor” |
| skos:prefLabel | The preferred human-readable label | “Condition factor” |
| skos:definition | Text definition of the concept | “A measure of body condition calculated as weight/length³” |
| skos:broader | More general concept | “Smolt condition factor” broader: “Condition metric” |
| skos:narrower | More specific concept | “Smolt condition factor” narrower: “Fork length condition factor” |
| skos:related | Related but not hierarchical concept | “Condition factor” related to: “Smolt age” |
| skos:exactMatch, skos:closeMatch | Crosswalk to another vocabulary | “Condition factor” exactMatch: https://vocab.nerc.ac.uk/condition_factor/ |

SKOS helps structure your data terms before you build an ontology — it’s a bridge between documentation and formal reasoning.
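
As a concrete (but hypothetical) illustration, the first concept in the table could be written in Turtle syntax like this; the skos: properties are the real SKOS vocabulary, while the ex: URIs are placeholders:

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/salmon/> .

ex:SmoltConditionFactor a skos:Concept ;
    skos:prefLabel  "Condition factor"@en ;
    skos:definition "A measure of body condition calculated as weight/length³."@en ;
    skos:broader    ex:ConditionMetric ;
    skos:narrower   ex:ForkLengthConditionFactor ;
    skos:related    ex:SmoltAge .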

Discussion

Challenge 1: From Data Dictionary to SKOS (25 min)

Purpose: Practice turning natural-language data terms into formal SKOS concepts.

Instructions:

  1. Take 3–5 terms from your data dictionary (Modules 1–3).
  2. For each term, fill in:
  • Preferred label
  • Definition (or short description)
  • Broader / narrower / related concepts (if applicable)
  • Equivalent or similar terms in another dataset or vocabulary
  3. Assign a temporary URI (e.g., https://example.org/salmon/condition_factor).

  4. Note which relationships are uncertain or need discussion.

🧠 Tip: You don’t need RDF syntax yet — the goal is concept structure, not code.

| Concept | PrefLabel | Definition | Broader | Related | URI |
|---------|-----------|------------|---------|---------|-----|
| Smolt condition factor | Condition factor | Weight/length³, used as an indicator of fish health | Condition metric | Smolt length | https://vocab.salmon.org/SmoltConditionFactor |
Discussion

Challenge 2: Build a Simple Schema Diagram (20 min)

Purpose: Visualize how your SKOS concepts relate to one another.

Instructions:

  1. On a whiteboard or digital diagram tool (e.g., MS PowerPoint, Google Slides, MS Paint, paper):
  • Draw boxes for each concept.
  • Connect them with arrows labeled broader, narrower, or related.
  2. Check:
  • Is the hierarchy logical (no circular relationships)?
  • Are broader/narrower concepts consistent in scope?
  • Where could you reuse existing concepts from other vocabularies?
  3. Optional: Add color or icons to distinguish reused vs. new concepts.
Key Points
  • SKOS helps bridge informal definitions and formal semantics.
  • It supports controlled vocabularies that can later evolve into ontologies.
  • Creating a schema diagram helps visualize and communicate conceptual structure.
  • Reusing terms and clearly defining relationships builds semantic interoperability.

Content from From Terms to Meaning - Framing Knowledge with Competency Questions



Overview

Questions

  • What is a Competency Question (CQ) and how does it help in ontology development?

Objectives

  • Explain what a Competency Question (CQ) is and why it’s useful in ontology development.
  • Formulate domain-relevant CQs that reveal how concepts connect and what data relationships matter.
  • Use CQs to guide vocabulary refinement and early ontology design.
  • Understand how CQs validate whether a knowledge model meets its intended purpose.

Introduction


Why Competency Questions?

Think of CQs as the “user stories” of ontology design — they describe what users (researchers, managers, etc.) need to know or compare, and ensure your data terms and structures can support those needs.

They help you:

  • Focus on purpose-driven vocabulary development
  • Identify data gaps early
  • Build alignment between scientists, data managers, and modelers

Example: Salmon Data Integration Context

Imagine you have multiple datasets on sockeye salmon:

  • Fraser River dataset: smolt length, weight, and ocean entry date
  • Bristol Bay dataset: similar metrics, but uses different column names and sampling protocols

Possible Competency Questions might be:

  • “Is the average smolt condition at ocean entry higher in one population than another?”
  • “Do differences in smolt condition explain variation in adult return rates?”

From these questions, you can see what concepts need alignment: condition factor, smolt stage, population, region, and return abundance.

Discussion

Challenge 1: Identify decision points

Goal: Draft and refine CQs that reflect the research or management needs represented by your data.

Steps:

  1. Review your vocabulary terms or data dictionary from earlier modules.

  2. In small groups, brainstorm 3–5 natural-language questions that:

  • Are answerable using your data (or could be if integrated).
  • Require multiple terms or relationships to answer.
  • Reflect real research or management scenarios.
  3. Write each question on a sticky note or digital card.

  4. Group similar questions and discuss:

  • Which terms appear most often?
  • What relationships are implied?
Discussion

Challenge 2: Write your own competency questions

Goal: Write specific, realistic CQs that your data integration or modeling efforts should be able to answer.

Instructions:

In small groups or pairs, write 2–3 CQs that your data integration or modeling efforts should be able to answer.

Focus on specific, realistic, and answerable questions — avoid vague ones like “What is salmon health?”

Check your questions:

  • Are key concepts clearly defined?
  • Do you know what data source could answer it?
  • What relationships would your ontology need to represent?

🧩 Example Revision:

Too broad: “What affects salmon survival?”

Better: “Does smolt condition at ocean entry affect adult return rates by region?”

Discussion

Challenge 3: Connect CQs to terms

Using your data dictionary from Modules 1–3:

  • Highlight which terms appear in your CQs.
  • Identify any missing terms or unclear definitions.
  • Note which terms might need alignment across datasets (e.g., “region,” “population,” “condition”).
Key Points
  • Competency Questions express the intended use of an ontology in natural language.
  • They help translate real-world research and management questions into conceptual structures.
  • CQs are iterative, evolving as you refine your vocabulary and build your ontology.
  • Good CQs are specific, testable, and connected to real data needs.

Content from Bonus Session



Ontology Game Workshop

Making Sense of Salmon Research Data

Overview

Questions

  • What is an ontology, and how does it differ from a data dictionary?
  • Why does salmon research data need clearer semantics?
  • What challenges arise when different people organize the same vocabulary?

Objectives

  • Define “ontology” in the context of data integration.
  • Recognize data ambiguity problems common in salmon science.
  • Connect these problems to ontology-based solutions.
  • Recognize why implicit structures must become explicit in an ontology.
  • Reflect on the need for controlled vocabularies.

Introduction

An ontology acts as the database schema or rule book for your knowledge model. It defines:

  • Types of entities (nodes): “Person,” “Film,” “Genre.”
  • Types of relationships (edges): A “Person” can “direct” or “star in” a “Film.”
  • Properties for entities and relationships: A “Person” has a “name” and “birth year.”

Ontologies provide foundational rules and structure, ensuring consistent interpretation by both humans and machines.
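
As a rough sketch of what such definitions look like in practice, here are the same statements written as RDFS/OWL axioms in Turtle; the film-themed identifiers are illustrative only:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/demo/> .

# Types of entities (nodes)
ex:Person a owl:Class .
ex:Film   a owl:Class .

# Types of relationships (edges)
ex:directs a owl:ObjectProperty ;
    rdfs:domain ex:Person ;
    rdfs:range  ex:Film .

# Properties for entities
ex:birthYear a owl:DatatypeProperty ;
    rdfs:domain ex:Person ;
    rdfs:range  xsd:gYear .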

Understanding transitive properties

A transitive property is a relationship where if A relates to B, and B relates to C, then A relates to C.

Here is an example of how different life stages of salmon relate to spawning events, using a few clear classes and one transitive property.

We often record salmon data at different points in their life cycle — for example, a Smolt Migration Event one year and a Spawning Event several years later.

By using a transitive property like hasLifeStageEvent, we can reason that the same Stock is connected across all those events.

Transitive properties help us:

  • Represent hierarchies (e.g., Species → Population → Individual).

  • Capture temporal or process chains (e.g., Smolt → Adult → Spawner).

  • Enable reasoning that connects related concepts without manually writing every link.

Callout
Stock_A hasLifeStageEvent SmoltMigration_2022 .

SmoltMigration_2022 hasLifeStageEvent Spawning_2025 .

thus

Stock_A hasLifeStageEvent Spawning_2025 .
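
The inference in the callout only works if the property itself is declared transitive. A minimal sketch of that declaration in Turtle, with placeholder identifiers:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex:  <https://example.org/salmon/> .

# Declaring the property transitive is what licenses the inference
ex:hasLifeStageEvent a owl:ObjectProperty , owl:TransitiveProperty .

ex:Stock_A ex:hasLifeStageEvent ex:SmoltMigration_2022 .
ex:SmoltMigration_2022 ex:hasLifeStageEvent ex:Spawning_2025 .

# A reasoner can then infer:
# ex:Stock_A ex:hasLifeStageEvent ex:Spawning_2025 .
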
Discussion

Sorting the Vocabulary Soup (20 min)

Goal: Experience how people intuitively categorize domain concepts — and how different those categories can be.

Materials

  • Card sets with one term per card (20–30 total per table)
  • Example terms:
    water temperature, age, length, weight, life stage, spawn date, smolt, tag ID, river reach, capture event, habitat type, species, sex, growth rate, migration timing
  • Timer (10–15 minutes)
  • Table space or wall for grouping

Instructions

  1. Each group gets a shuffled deck of term cards.
  2. Ask them to organize terms into groups that make sense to them.
    No rules — they can group by theme, data type, biological scale, etc.
  3. Once grouped, have each group name their categories.
  4. Optional: groups walk around and view each other’s arrangements.

Implicit meanings of compound terms

We often see compound terms in dataset column names, and these terms frequently embed multiple concepts. For example, “Mark-recapture escapement estimate” includes:

  • Entity: population
  • characteristic: escapement
  • Measurement Method: mark-recapture

We cannot express these implicit meanings inside the table without first decomposing the term. Understanding these components helps clarify meaning and supports data integration.
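
One hypothetical sketch of that decomposition in Turtle, where every identifier and predicate is invented for illustration:

@prefix ex: <https://example.org/salmon/> .

ex:EscapementEstimate_2024
    ex:aboutEntity            ex:SockeyePopulation_A ;   # Entity: population
    ex:measuredCharacteristic ex:Escapement ;            # Characteristic: escapement
    ex:usedMethod             ex:MarkRecapture .         # Measurement method: mark-recapture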

Discussion

Dissecting Salmon Terms (20 min)

Goal: Reveal the hidden components and embedded meanings in compound terms.

Materials

  • Each group receives 4–6 “compound” term cards.
    Examples:
    • Life stage
    • Spawner abundance
    • Tag detection event
    • Smolt-to-adult return rate (SAR)
    • Migration success
    • Egg-to-fry survival
  • Blank mini-cards or sticky notes
  • Pens or markers
  • Labels for concepts: Entity, Property, Process, Event, Assertion, etc.

Instructions

  1. Each group breaks down each compound term into its atomic concepts.
    • Example:
      Life stage → organism + developmental phase + (implied) habitat + age range
      Tag detection event → tagged individual + receiver + location + time
  2. Write each sub-concept on separate sticky notes.
  3. Label the type of each component (e.g. property, process).
  4. Optional challenge: Which team can identify the most distinct sub-concepts in 5 minutes?
Discussion

Build a Mini-Ontology (20 min)

Goal: Experience how explicit relationships can organize knowledge.

Materials

  • Blank cards or sticky notes (reuse from previous task)
  • Printed relationship arrows or connectors labeled:
    • is a
    • has property
    • occurs at
    • involves
    • measured in
    • related to
  • Optional: string or tape to connect items on a wall or table

Instructions

  1. Using the decomposed concepts from Task 2, groups now connect them into a network using relationship arrows.
    • Example:
      Tag detection event involves tagged individual
      Tag detection event occurs at location
      location has property river reach
  2. Groups are encouraged to build small hierarchies (e.g., smolt is a juvenile is a life stage).
  3. Optional: introduce a “data integration twist” — merge two groups’ mini-ontologies and reconcile overlapping terms.
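
For reference, the example connections in step 1 could be written as triples along these lines; all identifiers are placeholders, and the “is a” arrows are rendered as rdfs:subClassOf:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/salmon/> .

ex:TagDetectionEvent_17 ex:involves ex:TaggedIndividual_042 ;
    ex:occursAt ex:Location_X .

ex:Location_X ex:hasProperty ex:RiverReach_5 .

# “smolt is a juvenile is a life stage”
ex:Smolt          rdfs:subClassOf ex:JuvenileSalmon .
ex:JuvenileSalmon rdfs:subClassOf ex:LifeStage .
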
Key Points
  • Ontologies go beyond vocabulary—they structure meaning.

  • Shared semantics make integration and reuse possible.

  • Even small conceptual differences can block interoperability.