Summary and Setup

This is a new lesson built with The Carpentries Workbench.

Pre-Workshop Reading List


To get the most out of this workshop, we recommend reviewing the following materials before attending. The readings are organized by priority and will help you understand the foundational concepts we’ll be building upon.

Required Readings (2)

These essential readings provide the core foundation for understanding data mobilization and standards:

  1. The FAIR Guiding Principles for scientific data management and stewardship

    • Wilkinson, M.D. et al. (2016). Scientific Data 3, 160018
    • https://www.nature.com/articles/sdata201618
    • Why it’s important: This foundational paper introduced the FAIR principles (Findable, Accessible, Interoperable, Reusable), the cornerstone of modern data mobilization
  2. Practical Data Stewardship for Salmon Biologists–A Blueprint for Domain-Specific Best Practices in Fisheries

    • Johnson, B. et al. (2024). DRAFT manuscript
    • https://br-johnson.github.io/sdm-paper/
    • Why it’s important: This pre-print presents seven best practices for salmon data stewardship, with real-world examples and case studies from the salmon research community

Recommended Readings (8)

These readings will deepen your understanding of key concepts:

  1. Data Mobilization Through the International Year of the Salmon Ocean Observing System

    • Johnson, B.T. and T.C.A. van der Stap (2024). N. Pac. Anadr. Fish Comm. Bull. 7: 51–60
    • https://doi.org/10.23849/npafcb7/6a4ddpde4
    • Why it’s important: Demonstrates large-scale, cross-jurisdictional data integration efforts in salmon science through the International Year of the Salmon program
  2. Salmon Data Mobilization

    • Diack, G., T. Bird, S.A. Akenhead, J. Bayer, D. Brophy, C. Bull, E. de Eyto, B.T. Johnson, M.B. Jones, A. Knight, M. Nevoux, T. van der Stap, and A. Walker (2024). N. Pac. Anadr. Fish Comm. Bull. 7: 61–76
    • https://doi.org/10.23849/npafcb7/x3rlpo23a
    • Why it’s important: Provides a comprehensive strategy for salmon data mobilization across three spheres of agencies and practitioners, with practical guidance for the salmon research community
  3. Darwin Core: A Biodiversity Data Standard

    • TDWG (Biodiversity Information Standards)
    • https://dwc.tdwg.org/
    • Why it’s important: Darwin Core is one of the most widely used biological data standards and provides a concrete example of how controlled vocabularies work in practice
  4. Climate and Forecast (CF) Metadata Conventions

    • CF Conventions Committee
    • http://cfconventions.org/
    • Why it’s important: Shows how climate data are standardized; such data are crucial for understanding the environmental drivers of salmon populations
  5. Controlled Vocabularies: A Guide to Terminology and Usage

  6. Data Standards: A Crash Course

  7. Linked Data Vocabulary Management

  8. Towards a Shared Framework: A Classificatory Matrix for Teaching Data Standards

Optional Readings (7)

For those who want to dive deeper into specific topics:

  1. Making Metadata Machine-Readable as the First Step to Providing Findable, Accessible, Interoperable, and Reusable Population Health Data

  2. A Guide to Developing Harmonized Research Workflows in a Team Science Context

  3. Building a Unified Medical Vocabulary Framework Aligned with OMOP CDM

  4. Ontology-Enriched Specifications Enabling Findable, Accessible, Interoperable, and Reusable Marine Metagenomic Datasets

  5. What is an Ontology?

  6. Principles of Data Interoperability

  7. The Environment Ontology: Contextualising Biological and Biomedical Entities

Data Sets


Download the data zip file and unzip it to your Desktop.
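
If you would rather script this step than use your file manager, the following is a minimal Python sketch of one way to do it. The file name data.zip and the Downloads and Desktop locations are assumptions for illustration only; substitute the actual name and location of the file you downloaded.

```python
import zipfile
from pathlib import Path


def unzip_to(zip_path: Path, dest_dir: Path) -> list[str]:
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


if __name__ == "__main__":
    # Hypothetical file name and folders; adjust these to match your download.
    zip_file = Path.home() / "Downloads" / "data.zip"
    desktop = Path.home() / "Desktop"
    if zip_file.exists():
        for name in unzip_to(zip_file, desktop):
            print(name)
    else:
        print(f"Zip file not found at {zip_file}")
```

Running the script lists each extracted file, which is a quick way to confirm that the data landed where the workshop expects it.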

Software Setup


This workshop will use several tools for data mobilization, controlled vocabularies, and knowledge modeling. Please install the following software before attending:

Required Software