Fundamentals in Data Management for Qualitative and Quantitative Arctic Research
13 Data Visualization II
Return to data visualization with ggplot