Tools and practices for collaborative, reproducible data science
Dates: May 8, 2020
Location: Remote
This module introduces the data science support NCEAS provides to LTER and SNAPP working groups, followed by a discussion of best practices for data management in a distributed team. Participants will have the opportunity to brainstorm about their data and computing needs. The second part of the workshop introduces the NCEAS analytical server and demonstrates collaborative coding as a distributed team, so that participants can develop their analytical workflows remotely.
Data science support (1.5 hours)
Goal: Introduce the data science support and discuss data and analytical needs
Format: Presentation and discussion
Audience: Whole working group
- Data science support available to working groups (15 min + 15 min)
  - Presentation of support and goals
  - Data preservation requirements
- Data management tips (30 min)
  - How to track your data sources
  - How to best organize your data
- Discussion about data and computational needs (30 min; one breakout group per WG)
Reproducible analytical workflows for distributed teams (1.5 hours)
Goal: Empower working groups to collaboratively develop analytical workflows
Format: Demo + questions
Audience: Analysts + interested participants
- Collaborative and reproducible workflows:
  - Workflows to collaborate on a remote server
  - How to code collaboratively using git and GitHub
    - GitHub web interface
    - RStudio projects and GitHub
- Instructions on how to get an account on the NCEAS analytical server
Instructors
- Julien Brun
- Julie Lowndes
- Carrie Kappel