11 Group Project: Data Processing
11.1 Setup
In your fork of the scalable-computing-examples
repository, open the Jupyter notebook in the group-project
directory called session-11.ipynb
. This workbook will serve as a skeleton for you to work in. It will load in all the libraries you need, including a few helper functions we wrote for the course, show an example for how to use the method on one file, and then lays out blocks for you and your group to fill in with code that will run that method in parallel.
In your small groups, work together to write the solution, but everyone should aim to have a working solution on their own fork of the repository. In other words, everyone should type out the solution themselves as part of the group effort. Writing the code out yourself (even if others are contributing to the content) is a great way to get “mileage” as you develop these skills.
Only one person in the group should run the parallel code.
11.2 Rasterizing GeoPackages
In the last lession, we staged the input files into GeoPackages. Now we will import those .gpkg
files and write each as a raster (.tif
). The resulting rasters will have 2 bands, one for each statistic we calculate based on the vector geometries in each pixel.
We need to create the highest zoom level rasters before we create the lower zoom levels in the next lesson. This brings us one step closer to visualizing the ice wedge polygons on a basemap, with the ability to zoom in and out!
11.2.1 Packages, Libraries, and Modules
- os 3.11.2
- parsl 2023.3.20
- pdgstaging
- package developed by Permafrost Discovery Gateway software developer and designer Robyn Thiessen-Bock
- pdgraster
- package developed by Permafrost Discovery Gateway software developer and designer Robyn Thiessen-Bock
- geopandas0.11
- random
- matplotlib 3.5