Content from Introduction to Salmon Knowledge Modelling



Overview

Questions

  • What are controlled vocabularies and why are they important for data interoperability?

Objectives

This is a Carpentry-style, hands-on workshop. Each module builds on your own data and progresses from discovering terms you already use → documenting them clearly → aligning them with others.

Introduction


This workshop helps participants collaboratively develop, document, and align controlled vocabularies to improve data interoperability in salmon research and management. It emphasizes practical, community-centered steps that support long-term reusability and transparency, while remaining adaptable to other organizations or domains.

Why Controlled Vocabularies Matter

Inconsistent terminology prevents data integration and makes shared understanding difficult across agencies, researchers, and Indigenous knowledge systems. Controlled vocabularies address this by:

  • Capturing and standardizing the meaning of key terms

  • Enabling clear documentation and communication

  • Forming the foundation for ontologies and semantic integration

Key Points

By the end of the first three modules, participants will have:

  • Discovered and reused existing terms and URIs.
  • Created clear definitions and documentation for local data.
  • Built a mapping table connecting their terms to others’.

Content from Reusing Terms — Search and Integrate Existing Vocabularies



Overview

Questions

  • Are the terms I need already defined somewhere else?
  • How can I responsibly reuse existing terms and URIs?
  • What are the benefits of aligning early rather than reinventing?

Objectives

  • Learn how to discover and evaluate existing vocabularies relevant to your domain (e.g., Darwin Core, WoRMS, OBO ontologies).
  • Understand how to reuse URIs and integrate external definitions into your own data dictionary.
  • Practice linking your data elements to authoritative terms where appropriate.

Introduction


Every dataset — whether from your lab, your agency, or another research group — uses terms to describe its contents. Column headers, variable names, and codes all hold meaning, but often those meanings are assumed rather than shared.

When everyone invents their own terms for the same concept (e.g., SmoltCond, ConditionFactor, CF), it becomes difficult to integrate or compare data across projects.

Reusing existing terms — with clear definitions and persistent identifiers (URIs) — makes your data:

  • Easier to share and integrate

  • More interoperable and transparent

  • Aligned with others in your community

  • Future-proof for modeling and ontology building

This session helps you learn where to find existing vocabularies, how to decide what to reuse, and how to incorporate those terms into your own data dictionary.

Callout

🧩 Core Ideas

Term reuse means adopting existing, well-defined concepts instead of inventing new ones.

Each reused term has a URI (Uniform Resource Identifier) that makes it globally recognizable.

Reusing does not mean losing your local context — you can still describe how your project uses a term, while referencing a shared definition.

This is a key first step in making your data “semantic” — meaning it can be understood by both humans and machines.

Discussion

Challenge 1: Find and reuse (30 min)

Goal: Identify existing vocabulary terms that match your own dataset.

Steps:

  1. Select 3–5 column names from your dataset.

  2. Search for equivalent terms in one or more repositories.

  3. Record matches in the Data Dictionary Template:

  • Your local term
  • External URI
  • Source vocabulary name
  • Notes on whether it’s an exact or close match
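
For example, one completed row might look like this (the local column name is hypothetical; the Darwin Core URI is a real, published term):

  • Your local term: CaptureDate
  • External URI: http://rs.tdwg.org/dwc/terms/eventDate
  • Source vocabulary: Darwin Core
  • Notes: close match; dwc:eventDate covers the date of any sampling event, not only physical capture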

Expected outcomes:

  • An updated data dictionary with at least three reused terms and their URIs.
  • Learners understand how to find, evaluate, and record external vocabularies.

Key Points
  • Controlled vocabularies capture shared meaning of terms.
  • Reusing existing URIs improves interoperability and credibility.
  • Reuse saves time, avoids duplication, and makes future integration easier.

Content from Documenting Terms — Write Clear, Useful Definitions



Overview

Questions

  • How can I make sure others understand and correctly use my terms?
  • What makes a good definition or label?
  • How should I record units, examples, and relationships between terms?

Objectives

  • Extract and describe terms from your dataset.
  • Write unambiguous, well-structured definitions.
  • Record associated metadata (units, codes, examples).

Introduction


You’ve identified the key terms used in your datasets — and maybe even found some existing ones to reuse. Now comes the part that makes your work understandable, trustworthy, and reusable: clear documentation.

Inconsistent or missing definitions are one of the biggest barriers to data reuse. For example:

  • What does “sample date” really mean — collection date, processing date, or submission date?
  • Does “juvenile” refer to an age class, a length range, or a life stage?
  • What are the units? Are they consistent across datasets?

This session will help you document your terms precisely, so anyone — whether a collaborator, data manager, or future researcher — can understand exactly what you meant.

Callout

🧩 Core Ideas

Documentation is data. It’s the layer that makes data understandable and reusable.

A well-documented term includes:

  • Preferred label: the human-readable name.
  • Definition: what the term means and how it’s used.
  • Units or scale: how it’s measured.
  • Example values: what typical data look like.

  • Notes: clarifications, special cases, or links to other terms.

Think of your data dictionary as a user manual for your dataset.

Example

| Term | Definition | Units | Example | Notes |
|------|------------|-------|---------|-------|
| Condition factor | A measure of fish body condition, typically calculated as weight/length³. | dimensionless | 1.05 | Used as an indicator of energy reserves at smolt stage. |
| Smolt age | The age (in years) of a salmon when it migrates from freshwater to the ocean. | years | 2 | Derived from scale analysis. |
| Capture date | The date when a specimen was physically collected in the field. | ISO 8601 (YYYY-MM-DD) | 2023-05-14 | Not to be confused with processing or tagging date. |

The more clearly you describe your terms now, the easier it becomes to share, integrate, and align your data later — especially when mapping to vocabularies or building ontologies.

Discussion

Challenge 1: Extract and define (40 min)

Goal: Create clear, consistent documentation for your own dataset terms.

Review your dataset and list 10–15 column names. Record in a shared data dictionary template (CSV):

  • Label (term name)

  • Definition (clear, context-rich description)

  • Units or codes used

  • Example value(s)

  • Notes on ambiguity or uncertainty

Expected outcomes:

  • A draft data dictionary covering at least 10 key terms.
  • Peer-reviewed feedback on definition clarity.
  • Improved awareness of semantic gaps in existing data.

Key Points
  • A data dictionary is the bridge between raw data and understanding.
  • Good definitions reduce misinterpretation and support machine processing.
  • Documentation is both a social and technical task.

Content from Concept Decomposition



Overview

Questions

  • What are the components that make up a concept?
  • How do I tell when two terms are the same, related, or overlapping?
  • What patterns or relationships exist among my documented terms?
  • How can I show these relationships clearly?

Objectives

  • Decompose complex concepts into simpler, more explicit parts.
  • Identify relationships (e.g., broader, narrower, related) among terms.
  • Use visual mapping to show how concepts connect.
  • Prepare a set of refined concepts that can be formalized in a schema.

Introduction


Now that your terms are well-defined and documented, the next step is to look beneath the surface — to unpack how those terms relate to one another.

This process, called concept decomposition, helps you:

  • See what each concept really means.
  • Identify overlaps or hidden distinctions between terms.
  • Prepare for formal modeling (where meaning becomes machine-readable).

For example:
The term “juvenile salmon” might seem simple — until you realize it includes age, size, and life stage. By decomposing it into parts (“life stage: juvenile”, “species: salmon”, “habitat: freshwater”), you make the meaning explicit and ready for alignment with other datasets or vocabularies.

Callout

🧩 Core Ideas

  • Concept decomposition means breaking a term down into its essential pieces of meaning.

  • It helps you move from words → structure.

  • Relationships matter: knowing how one term connects to another is as important as defining it.

  • Visualizing your terms helps spot patterns and inconsistencies.

Example

| Term | Broader Concept | Narrower Concept | Related Concept |
|------|-----------------|------------------|-----------------|
| Juvenile salmon | Salmon | Parr | Smolt |
| Smolt | Juvenile salmon | | Ocean migrant |
| Spawning habitat | Habitat | | Redd site |

From this, we can see that Smolt is a narrower stage within Juvenile salmon, and that Spawning habitat relates to but is distinct from Redd site — these are building blocks for the next module, where we’ll start expressing these ideas formally.
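
As a preview of that formal step, the first row of the table could eventually be written as SKOS-style statements in Turtle. This is only a sketch: the ex: identifiers are placeholders, not published terms.

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/salmon/> .

ex:JuvenileSalmon a skos:Concept ;
    skos:broader  ex:Salmon ;
    skos:narrower ex:Parr ;
    skos:related  ex:Smolt .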

Discussion

Challenge 1: Build a Mapping Table (40 min)

  1. Pick 3–5 documented terms from your Module 2 work.

  2. Break each term down into its essential pieces of meaning.

  3. Identify any broader/narrower/related concepts.

  4. Sketch a mini concept map (e.g., on whiteboard, MS Paint, or sticky notes).

Key Points
  • Relationships reveal meaning.
  • Decomposing terms uncovers hidden assumptions.
  • Mapping across datasets helps identify where vocabularies can be aligned.
  • Concept decomposition prepares you for formalization in SKOS and ontology modeling (coming next!).

Content from From Concepts to Semantics — Introducing SKOS



Overview

Questions

  • How do we move from lists of terms and definitions to formal, machine-readable vocabularies?
  • What does it mean to give a term a URI and define its relationships to others?
  • How can SKOS help represent our concepts and mappings in a structured, shareable way?
  • How do hierarchical relationships (“broader”, “narrower”, “related”) clarify meaning and enable interoperability?

Objectives

  • Explain the purpose of SKOS in representing controlled vocabularies.
  • Map existing terms and definitions from a data dictionary into basic SKOS structure (Concept, prefLabel, definition, broader, narrower, related).
  • Understand how SKOS differs from an ontology but connects to it (conceptual bridge).
  • Create a simple schema diagram showing relationships among terms, using SKOS-like semantics.

Introduction


Learners have already identified and documented terms (Modules 1–3), and developed competency questions (Module 4). This module introduces semantic structure: how to move from “terms and mappings” to “concepts and relationships” that can be shared, reused, and machine-readable.

This is the first dip into ontology thinking, using SKOS because it’s lightweight, visual, and flexible.

SKOS (Simple Knowledge Organization System) provides a lightweight, flexible way to express controlled vocabularies and their relationships using Semantic Web standards.

| SKOS Term | Meaning | Example (Salmon Context) |
|-----------|---------|--------------------------|
| skos:Concept | A unique concept or term | “Smolt condition factor” |
| skos:prefLabel | The preferred human-readable label | “Condition factor” |
| skos:definition | Text definition of the concept | “A measure of body condition calculated as weight/length³” |
| skos:broader | More general concept | “Smolt condition factor” broader: “Condition metric” |
| skos:narrower | More specific concept | “Smolt condition factor” narrower: “Fork length condition factor” |
| skos:related | Related but not hierarchical concept | “Condition factor” related to: “Smolt age” |
| skos:exactMatch, skos:closeMatch | Crosswalk to another vocabulary | “Condition factor” exactMatch: https://vocab.nerc.ac.uk/condition_factor/ |

SKOS helps structure your data terms before you build an ontology — it’s a bridge between documentation and formal reasoning.
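
As a concrete (but hypothetical) illustration, the first concept in the table could be written in Turtle syntax like this; the skos: properties are the real SKOS vocabulary, while the ex: URIs are placeholders:

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/salmon/> .

ex:SmoltConditionFactor a skos:Concept ;
    skos:prefLabel  "Condition factor"@en ;
    skos:definition "A measure of body condition calculated as weight/length³."@en ;
    skos:broader    ex:ConditionMetric ;
    skos:narrower   ex:ForkLengthConditionFactor ;
    skos:related    ex:SmoltAge .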

Discussion

Challenge 1: From Data Dictionary to SKOS (25 min)

Purpose: Practice turning natural-language data terms into formal SKOS concepts.

Instructions:

  1. Take 3–5 terms from your data dictionary (Modules 1–3).
  2. For each term, fill in:
  • Preferred label
  • Definition (or short description)
  • Broader / narrower / related concepts (if applicable)
  • Equivalent or similar terms in another dataset or vocabulary
  3. Assign a temporary URI (e.g., https://example.org/salmon/condition_factor).

  4. Note which relationships are uncertain or need discussion.

🧠 Tip: You don’t need RDF syntax yet — the goal is concept structure, not code.

| Concept | PrefLabel | Definition | Broader | Related | URI |
|---------|-----------|------------|---------|---------|-----|
| Smolt condition factor | Condition factor | Weight/length³, used as an indicator of fish health | Condition metric | Smolt length | https://vocab.salmon.org/SmoltConditionFactor |
Discussion

Challenge 2: Build a Simple Schema Diagram (20 min)

Purpose: Visualize how your SKOS concepts relate to one another.

Instructions:

  1. On a whiteboard or digital diagram tool (e.g., MS PowerPoint, Google Slides, MS Paint, paper):
  • Draw boxes for each concept.
  • Connect them with arrows labeled broader, narrower, or related.
  2. Check:
  • Is the hierarchy logical (no circular relationships)?
  • Are broader/narrower concepts consistent in scope?
  • Where could you reuse existing concepts from other vocabularies?
  3. Optional: Add color or icons to distinguish reused vs. new concepts.
Key Points
  • SKOS helps bridge informal definitions and formal semantics.
  • It supports controlled vocabularies that can later evolve into ontologies.
  • Creating a schema diagram helps visualize and communicate conceptual structure.
  • Reusing terms and clearly defining relationships builds semantic interoperability.

Content from From Terms to Meaning - Framing Knowledge with Competency Questions



Overview

Questions

  • What is a Competency Question (CQ) and how does it help in ontology development?

Objectives

  • Explain what a Competency Question (CQ) is and why it’s useful in ontology development.
  • Formulate domain-relevant CQs that reveal how concepts connect and what data relationships matter.
  • Use CQs to guide vocabulary refinement and early ontology design.
  • Understand how CQs validate whether a knowledge model meets its intended purpose.

Introduction


Why Competency Questions?

Think of CQs as the “user stories” of ontology design — they describe what users (researchers, managers, etc.) need to know or compare, and ensure your data terms and structures can support those needs.

They help you:

  • Focus on purpose-driven vocabulary development
  • Identify data gaps early
  • Build alignment between scientists, data managers, and modelers

Example: Salmon Data Integration Context

Imagine you have multiple datasets on sockeye salmon:

  • Fraser River dataset: smolt length, weight, and ocean entry date
  • Bristol Bay dataset: similar metrics, but uses different column names and sampling protocols

Possible Competency Questions might be:

  • “Is the average smolt condition at ocean entry higher in one population than another?”
  • “Do differences in smolt condition explain variation in adult return rates?”

From these questions, you can see what concepts need alignment: condition factor, smolt stage, population, region, and return abundance.

Discussion

Challenge 1: Identify decision points

Goal: Draft and refine CQs that reflect the research or management needs represented by your data.

Steps:

  1. Review your vocabulary terms or data dictionary from earlier modules.

  2. In small groups, brainstorm 3–5 natural-language questions that:

  • Are answerable using your data (or could be if integrated).
  • Require multiple terms or relationships to answer.
  • Reflect real research or management scenarios.
  3. Write each question on a sticky note or digital card.

  4. Group similar questions and discuss:

  • Which terms appear most often?
  • What relationships are implied?
Discussion

Challenge 2: Write your own competency questions

Goal: Write specific, realistic CQs that your data integration or modeling efforts should be able to answer.

Instructions:

In small groups or pairs, write 2–3 CQs that your data integration or modeling efforts should be able to answer.

Focus on specific, realistic, and answerable questions — avoid vague ones like “What is salmon health?”

Check your questions:

  • Are key concepts clearly defined?
  • Do you know what data source could answer it?
  • What relationships would your ontology need to represent?

🧩 Example Revision:

Too broad: “What affects salmon survival?”

Better: “Does smolt condition at ocean entry affect adult return rates by region?”

Discussion

Challenge 3: Connect CQs to terms

Using your data dictionary from Modules 1–3:

  • Highlight which terms appear in your CQs.
  • Identify any missing terms or unclear definitions.
  • Note which terms might need alignment across datasets (e.g., “region,” “population,” “condition”).
Key Points
  • Competency Questions express the intended use of an ontology in natural language.
  • They help translate real-world research and management questions into conceptual structures.
  • CQs are iterative, evolving as you refine your vocabulary and build your ontology.
  • Good CQs are specific, testable, and connected to real data needs.

Content from Bonus Session



Ontology Game Workshop

Making Sense of Salmon Research Data

Overview

Questions

  • What is an ontology, and how does it differ from a data dictionary?
  • Why does salmon research data need clearer semantics?
  • What challenges arise when different people organize the same vocabulary?

Objectives

  • Define “ontology” in the context of data integration.
  • Recognize data ambiguity problems common in salmon science.
  • Connect these problems to ontology-based solutions.
  • Recognize why implicit structures must become explicit in an ontology.
  • Reflect on the need for controlled vocabularies.

Introduction

An ontology acts as the database schema or rule book for your knowledge model. It defines:

  • Types of entities (nodes): “Person,” “Film,” “Genre.”
  • Types of relationships (edges): A “Person” can “direct” or “star in” a “Film.”
  • Properties for entities and relationships: A “Person” has a “name” and “birth year.”

Ontologies provide foundational rules and structure, ensuring consistent interpretation by both humans and machines.
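
As a rough sketch of what such definitions look like in practice, here are the same statements written as RDFS/OWL axioms in Turtle; the film-themed identifiers are illustrative only:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/demo/> .

# Types of entities (nodes)
ex:Person a owl:Class .
ex:Film   a owl:Class .

# Types of relationships (edges)
ex:directs a owl:ObjectProperty ;
    rdfs:domain ex:Person ;
    rdfs:range  ex:Film .

# Properties for entities
ex:birthYear a owl:DatatypeProperty ;
    rdfs:domain ex:Person ;
    rdfs:range  xsd:gYear .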

Understanding transitive properties

A transitive property is a relationship where if A relates to B, and B relates to C, then A relates to C.

Here is an example of how different life stages of salmon relate to spawning events, using a few clear classes and one transitive property.

We often record salmon data at different points in their life cycle — for example, a Smolt Migration Event one year and a Spawning Event several years later.

By using a transitive property like hasLifeStageEvent, we can reason that the same Stock is connected across all those events.

Transitive properties help us:

  • Represent hierarchies (e.g., Species → Population → Individual).

  • Capture temporal or process chains (e.g., Smolt → Adult → Spawner).

  • Enable reasoning that connects related concepts without manually writing every link.

Callout
Stock_A hasLifeStageEvent SmoltMigration_2022 .

SmoltMigration_2022 hasLifeStageEvent Spawning_2025 .

thus

Stock_A hasLifeStageEvent Spawning_2025 .
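
The inference in the callout only works if the property itself is declared transitive. A minimal sketch of that declaration in Turtle, with placeholder identifiers:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex:  <https://example.org/salmon/> .

# Declaring the property transitive is what licenses the inference
ex:hasLifeStageEvent a owl:ObjectProperty , owl:TransitiveProperty .

ex:Stock_A ex:hasLifeStageEvent ex:SmoltMigration_2022 .
ex:SmoltMigration_2022 ex:hasLifeStageEvent ex:Spawning_2025 .

# A reasoner can then infer:
# ex:Stock_A ex:hasLifeStageEvent ex:Spawning_2025 .
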
Discussion

Sorting the Vocabulary Soup (20 min)

Goal: Experience how people intuitively categorize domain concepts — and how different those categories can be.

Materials

  • Card sets with one term per card (20–30 total per table)
  • Example terms:
    water temperature, age, length, weight, life stage, spawn date, smolt, tag ID, river reach, capture event, habitat type, species, sex, growth rate, migration timing
  • Timer (10–15 minutes)
  • Table space or wall for grouping

Instructions

  1. Each group gets a shuffled deck of term cards.
  2. Ask them to organize terms into groups that make sense to them.
    No rules — they can group by theme, data type, biological scale, etc.
  3. Once grouped, have each group name their categories.
  4. Optional: groups walk around and view each other’s arrangements.

Implicit meanings of compound terms

We often see compound terms in dataset column names, and these terms frequently embed multiple concepts. For example, “Mark-recapture escapement estimate” includes:

  • Entity: population
  • characteristic: escapement
  • Measurement Method: mark-recapture

We cannot express these implicit meanings inside the table without first decomposing the term. Understanding these components helps clarify meaning and supports data integration.
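
One hypothetical sketch of that decomposition in Turtle, where every identifier and predicate is invented for illustration:

@prefix ex: <https://example.org/salmon/> .

ex:EscapementEstimate_2024
    ex:aboutEntity            ex:SockeyePopulation_A ;   # Entity: population
    ex:measuredCharacteristic ex:Escapement ;            # Characteristic: escapement
    ex:usedMethod             ex:MarkRecapture .         # Measurement method: mark-recapture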

Discussion

Dissecting Salmon Terms (20 min)

Goal: Reveal the hidden components and embedded meanings in compound terms.

Materials

  • Each group receives 4–6 “compound” term cards.
    Examples:
    • Life stage
    • Spawner abundance
    • Tag detection event
    • Smolt-to-adult return rate (SAR)
    • Migration success
    • Egg-to-fry survival
  • Blank mini-cards or sticky notes
  • Pens or markers
  • Labels for concepts: Entity, Property, Process, Event, Assertion, etc.

Instructions

  1. Each group breaks down each compound term into its atomic concepts.
    • Example:
      Life stage → organism + developmental phase + (implied) habitat + age range
      Tag detection event → tagged individual + receiver + location + time
  2. Write each sub-concept on separate sticky notes.
  3. Label the type of each component (e.g. property, process).
  4. Optional challenge: Which team can identify the most distinct sub-concepts in 5 minutes?
Discussion

Build a Mini-Ontology (20 min)

Goal: Experience how explicit relationships can organize knowledge.

Materials

  • Blank cards or sticky notes (reuse from previous task)
  • Printed relationship arrows or connectors labeled:
    • is a
    • has property
    • occurs at
    • involves
    • measured in
    • related to
  • Optional: string or tape to connect items on a wall or table

Instructions

  1. Using the decomposed concepts from Task 2, groups now connect them into a network using relationship arrows.
    • Example:
      Tag detection event involves tagged individual
      Tag detection event occurs at location
      location has property river reach
  2. Groups are encouraged to build small hierarchies (e.g., smolt is a juvenile is a life stage).
  3. Optional: introduce a “data integration twist” — merge two groups’ mini-ontologies and reconcile overlapping terms.
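
For reference, the example connections in step 1 could be written as triples along these lines; all identifiers are placeholders, and the “is a” arrows are rendered as rdfs:subClassOf:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/salmon/> .

ex:TagDetectionEvent_17 ex:involves ex:TaggedIndividual_042 ;
    ex:occursAt ex:Location_X .

ex:Location_X ex:hasProperty ex:RiverReach_5 .

# “smolt is a juvenile is a life stage”
ex:Smolt          rdfs:subClassOf ex:JuvenileSalmon .
ex:JuvenileSalmon rdfs:subClassOf ex:LifeStage .
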
Key Points
  • Ontologies go beyond vocabulary—they structure meaning.

  • Shared semantics make integration and reuse possible.

  • Even small conceptual differences can block interoperability.