Documenting Terms — Write Clear, Useful Definitions
Last updated on 2025-11-11
Overview
Questions
- How can I make sure others understand and correctly use my terms?
- What makes a good definition or label?
- How should I record units, examples, and relationships between terms?
Objectives
- Extract and describe terms from your dataset.
- Write unambiguous, well-structured definitions.
- Record associated metadata (units, codes, examples).
Introduction
You’ve identified the key terms used in your datasets — and maybe even found some existing ones to reuse. Now comes the part that makes your work understandable, trustworthy, and reusable: clear documentation.
Inconsistent or missing definitions are one of the biggest barriers to data reuse. For example:
- What does “sample date” really mean: collection date, processing date, or submission date?
- Does “juvenile” refer to an age class, a length range, or a life stage?
- What are the units, and are they consistent across datasets?
This session will help you document your terms precisely, so anyone — whether a collaborator, data manager, or future researcher — can understand exactly what you meant.
🧩 Core Ideas
Documentation is data. It’s the layer that makes data understandable and reusable.
A well-documented term includes:
- Preferred label: the human-readable name.
- Definition: what the term means and how it’s used.
- Units or scale: how it’s measured.
- Example values: what typical data look like.
- Notes: clarifications, special cases, or links to other terms.
Think of your data dictionary as a user manual for your dataset.
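The fields listed above can be sketched as a small record type. This is only an illustration: the class name `TermEntry` and its attribute names are assumptions made for this example, not part of any standard or of the workshop template.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class TermEntry:
    """One data dictionary entry (hypothetical field names)."""

    label: str                    # preferred, human-readable name
    definition: str               # what the term means and how it is used
    units: str                    # unit, scale, or format of the values
    examples: list[str] = field(default_factory=list)  # typical values
    notes: str = ""               # clarifications, special cases, links


capture_date = TermEntry(
    label="Capture date",
    definition="The date when a specimen was physically collected in the field.",
    units="ISO 8601 (YYYY-MM-DD)",
    examples=["2023-05-14"],
    notes="Not to be confused with processing or tagging date.",
)

# asdict() turns the entry into a plain dict, ready to export as a table row.
print(asdict(capture_date))
```

Making `label`, `definition`, and `units` required while leaving `examples` and `notes` optional is one possible design choice; a project may reasonably require all five.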
Example
| Term | Definition | Units | Example | Notes |
|---|---|---|---|---|
| Condition factor | A measure of fish body condition, typically calculated as weight/length³ (often scaled by a constant, as in Fulton's K, so that values fall near 1). | dimensionless | 1.05 | Used as an indicator of energy reserves at smolt stage. |
| Smolt age | The age (in years) of a salmon when it migrates from freshwater to the ocean. | years | 2 | Derived from scale analysis. |
| Capture date | The date when a specimen was physically collected in the field. | ISO 8601 (YYYY-MM-DD) | 2023-05-14 | Not to be confused with processing or tagging date. |
The more clearly you describe your terms now, the easier it becomes to share, integrate, and align your data later — especially when mapping to vocabularies or building ontologies.
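The condition factor entry in the example table shows why units and scaling belong in the documentation. The sketch below assumes Fulton's condition factor with weight in grams, length in centimetres, and a scaling constant of 100. The table does not state these details, so treat them as assumptions chosen to make the example value of 1.05 reproducible; the function name is hypothetical.

```python
def fulton_condition_factor(weight_g: float, length_cm: float) -> float:
    """Return K = 100 * W / L^3 (assumed units: W in g, L in cm)."""
    if weight_g <= 0 or length_cm <= 0:
        raise ValueError("weight and length must be positive")
    return 100.0 * weight_g / length_cm ** 3


# A 10.5 g fish that is 10 cm long gives the example value from the table.
k = fulton_condition_factor(weight_g=10.5, length_cm=10.0)
print(round(k, 2))  # 1.05
```

With length recorded in millimetres instead, the same fish would produce a value a thousand times smaller, which is exactly the kind of ambiguity a documented unit removes.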
Challenge 1: Extract and define (40 min)
Goal: Create clear, consistent documentation for your own dataset terms.
Review your dataset and list 10–15 column names. For each one, record the following in a shared data dictionary template (CSV):
- Label (term name)
- Definition (clear, context-rich description)
- Units or codes used
- Example value(s)
- Notes on ambiguity or uncertainty
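If you do not have a template to hand, a CSV with these five columns can be produced with Python's standard `csv` module. This is a minimal sketch: the column names in `FIELDS` and the helper `write_dictionary` are assumptions for illustration, not the workshop's official template.

```python
import csv
import io

# Hypothetical column names mirroring the five fields listed above.
FIELDS = ["label", "definition", "units", "example", "notes"]


def write_dictionary(rows, handle):
    """Write data dictionary rows (a list of dicts) to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "label": "Smolt age",
        "definition": "The age (in years) of a salmon when it migrates "
                      "from freshwater to the ocean.",
        "units": "years",
        "example": "2",
        "notes": "Derived from scale analysis.",
    },
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_dictionary(rows, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with `open("data_dictionary.csv", "w", newline="", encoding="utf-8")` instead of the buffer; the file name here is only a placeholder.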
Expected outcomes:
- A draft data dictionary covering at least 10 key terms.
- Peer feedback on definition clarity.
- Improved awareness of semantic gaps in existing data.
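One way to surface such gaps is to compare the dataset's column names against the dictionary and flag anything undocumented or incomplete. The sketch below assumes each dictionary row is a dict with `label`, `definition`, `units`, and `example` keys; the function name, the required-field list, and the sample column names are all hypothetical.

```python
# Fields that every entry is expected to fill in (an assumption for this sketch).
REQUIRED = ("definition", "units", "example")


def find_gaps(dataset_columns, dictionary_rows):
    """Return (undocumented columns, {label: [missing fields]})."""
    documented = {row["label"]: row for row in dictionary_rows}
    undocumented = [c for c in dataset_columns if c not in documented]
    incomplete = {}
    for label, row in documented.items():
        # Treat absent, None, or whitespace-only values as missing.
        missing = [f for f in REQUIRED if not (row.get(f) or "").strip()]
        if missing:
            incomplete[label] = missing
    return undocumented, incomplete


columns = ["capture_date", "smolt_age", "cond_factor"]
entries = [
    {"label": "capture_date",
     "definition": "Date the specimen was collected in the field.",
     "units": "ISO 8601 (YYYY-MM-DD)", "example": "2023-05-14"},
    {"label": "smolt_age", "definition": "", "units": "years", "example": "2"},
]

undocumented, incomplete = find_gaps(columns, entries)
print(undocumented)  # ['cond_factor']
print(incomplete)    # {'smolt_age': ['definition']}
```

A check like this only finds missing text. Whether a definition is actually unambiguous still needs a human reviewer.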
Key Points
- A data dictionary is the bridge between raw data and understanding.
- Good definitions reduce misinterpretation and support machine processing.
- Documentation is both a social and technical task.