Final project

The project consists of two tasks:

  • Task 1: add a blog post to your personal website based on one of your previous assignments and discussion sections. This task is 35% of the final project grade.
  • Task 2: create a presentation-ready GitHub repository featuring geospatial analysis in Python. This task is 65% of the final project grade.

Submission instructions

Due dates:

  • Task 1 (blog post) is due by 11:59 pm on Wednesday, December 4.

  • Task 2 (GitHub repo) is due by 11:59 pm on Saturday, December 7.

Feedback for both tasks will be given to students by Tuesday, December 10, and the final resubmission for both tasks is due Friday, December 13 (last day of the quarter). Both an initial submission and a revised submission addressing all the feedback on the initial submission are required to receive a final grade for these two tasks. An assignment without a resubmission will receive a grade of 0.

All tasks for this assignment should be submitted via Gradescope. File formatting and uploading:

If you have any questions about assignment logistics, please reach out to the TA or instructor by 5 pm on the day before each due date.

Task 1: Thomas Fire analysis blog post and accompanying analyses

In this task, you will transform your presentation-ready repository from Homework 4 into a polished blog post for your personal website and compile and streamline all its supporting notebooks.

Purpose

Your blog post should showcase two out of the three data analysis exercises we did related to the Thomas Fire:

  • Land cover statistics (November 27 exercises)
  • Air Quality Index analysis (Hwk 2 - Task 3)
  • False color image analysis (Hwk 4 - Task 2)

If you decide to include the land cover statistics (greatly encouraged!), you should start your analysis from the terrestrial ecosystems data clipped to the Thomas Fire perimeter. You can find this ready-to-use raster and the terrestrial ecosystem labels in this directory within workbench-1:

/courses/EDS220/data/USGS_National_Terrestrial_Ecosystems_Over_Thomas_Fire_Perimeter/

Your audience is a fellow Python programmer with beginner-to-intermediate geospatial wrangling experience. This is an opportunity to highlight your skills and communicate your analysis in a clear and engaging way!

To help you understand the desired format and text-to-code ratio, review these examples:

Take note of how they balance explanations, code, and visualizations to communicate their analyses effectively.

In addition to the blog post, the notebooks used for the data analysis should be part of your presentation-ready GitHub repository. These notebooks contain work you have already completed and probably revised, so the effort should go into cleaning and streamlining them.

Setup

Updated on Dec 1 to include information about how to render the post.

Blog post

A Quarto personal website will be needed to submit the blog post. If you previously enrolled in EDS 296-1F (Data Science Tools for Building Professional Online Portfolios) then you’re all set! If you don’t have a website, this tutorial will smoothly guide you through creating one:

To add your blog post as a Quarto (.qmd) file you can follow this tutorial:

Alternatively, your blog post can be rendered directly from a Jupyter notebook (.ipynb). Converting a Jupyter notebook into a Quarto file and vice versa is straightforward using the quarto convert command. The Quarto documentation has more information about rendering Jupyter notebooks using Quarto.

Customizing your website is optional. This is a tutorial to help you add customizations:

Notebooks and repository

In addition to the blog post, in this task you will expand the GitHub repository you created for Homework 4 by adding the notebook for your other Thomas Fire data analysis. Follow the repository structure below:

repo-name  # Consider renaming, e.g., "thomas-fire-analysis"
│   README.md
│   notebooks  # One per analysis, clean and streamlined
│   .gitignore
│
└── data  # If needed
    │   data-files

Important notes:

  • Ensure no .DS_Store or .ipynb_checkpoints files are present in your repository.

  • All notebooks should be streamlined and will be checked using the “notebooks organization” rubric!
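One way to check the first note before you push is a small script that walks your local copy of the repository. The sketch below is only an illustration: the function name find_unwanted and the constant UNWANTED_NAMES are made up for this example, and it inspects the files on disk, not what Git is tracking.

```python
from pathlib import Path

# Names the assignment says must not appear in the repository.
UNWANTED_NAMES = {".DS_Store", ".ipynb_checkpoints"}


def find_unwanted(repo_root):
    """Return sorted paths under repo_root whose name is in UNWANTED_NAMES."""
    root = Path(repo_root)
    hits = []
    for path in root.rglob("*"):
        # Skip Git's internal folder; it is not part of the repository content.
        if ".git" in path.relative_to(root).parts:
            continue
        if path.name in UNWANTED_NAMES:
            hits.append(path)
    return sorted(hits)


# Example usage from the top folder of your repository:
# for hit in find_unwanted("."):
#     print(hit)
```

If the script reports anything, listing .DS_Store and .ipynb_checkpoints in your .gitignore keeps them from being committed again. Files that were already committed still need to be removed from the repository.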

📋 Don’t forget to read the rubric for this task before you start working on it.

💡 For your first blog post submission, it’s normal to receive feedback and a “needs revision” rating. Implementing the requested changes during resubmission can earn you 100% of the points. However, if no revised blog post is submitted, this task will count as 0%. Revising will be a crucial part of this task!

Guidelines

This task builds on your previous work. Your first step should be to carefully review the feedback you received on your notebooks and Homework 4 repository and make the necessary updates.

💡 If you have questions or need clarification on the feedback, don’t hesitate to ask!

Your post should be streamlined, focusing on the key steps and results. Avoid lengthy print checks and intermediate outputs, favoring concise, well-explained code chunks.

Your post should include (at least) the following:

  • About section:
    • Provide context for the analysis. Introduce the Thomas Fire, its significance, and the relevance of your analyses.
    • Include an image related to the analysis.
  • Highlights of analysis:
    • List of highlights of analysis (3 or 4 highlights). What do you consider to be the most important techniques used or insights from the analysis?
  • Dataset descriptions:
    • Describe the datasets you used and provide proper citations or references.
  • Link to GitHub repository:
    • Include a link to your GitHub repository at the top of the post. The repository should be presentation-ready and contain the full, detailed analysis.
  • Organized data analysis sections:
    • Break the analysis into logical sections.
    • Include only the most relevant code.
    • Aim for a clean, professional layout with properly formatted code snippets and images.
  • Final visualizations:
    • Showcase your main visualizations with clear captions. Follow each visualization with a short description explaining its insights and relevance.

Task 2: Biodiversity Intactness Index change in Phoenix, AZ

In 2021, Maricopa County, home to the Phoenix metropolitan area, was identified as the U.S. county with the most significant increase in developed land since 2001 [1]. This rapid urban sprawl has profound implications for biodiversity and the health of surrounding natural ecosystems.

In this assignment, you will investigate the impacts of urban expansion by analyzing a dataset that captures values for the Biodiversity Intactness Index (BII) [2]. Your task is to examine changes in BII in the Phoenix county subdivision area between 2017 and 2020, shedding light on how urban growth affects biodiversity over time.

About the data

You will work with two datasets for this task:

  1. Biodiversity Intactness Index (BII) Time Series: Access the io-biodiversity collection from the Microsoft Planetary Computer STAC catalog. Use the 2017 and 2020 rasters covering the Phoenix subdivision. For the bounding box, use the following coordinates:
     [-112.826843, 32.974108, -111.184387, 33.863574]
  2. Phoenix Subdivision Shapefile: Download the Phoenix subdivision polygon from the Census County Subdivision shapefiles for Arizona.
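As a starting point for the first dataset, here is a minimal sketch of a catalog search using the pystac_client and planetary_computer packages. The helper names search_kwargs and find_bii_items are hypothetical, the endpoint URL is assumed to be the public Planetary Computer STAC API, and filtering by passing the year as the datetime argument is an assumption you should verify against the items returned.

```python
# Bounding box given in the assignment: [min lon, min lat, max lon, max lat].
PHOENIX_BBOX = [-112.826843, 32.974108, -111.184387, 33.863574]
COLLECTION = "io-biodiversity"
# Assumption: the public Planetary Computer STAC API endpoint.
STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def search_kwargs(year, bbox=PHOENIX_BBOX):
    """Build the keyword arguments for a one-year search of the BII collection."""
    return {"collections": [COLLECTION], "bbox": list(bbox), "datetime": str(year)}


def find_bii_items(year):
    """Return the STAC items for `year` that intersect the bounding box."""
    # Imported here so the helper above can be used without these packages.
    import planetary_computer
    import pystac_client

    catalog = pystac_client.Client.open(
        STAC_URL, modifier=planetary_computer.sign_inplace
    )
    search = catalog.search(**search_kwargs(year))
    return list(search.item_collection())


# Example usage (requires an internet connection):
# for year in (2017, 2020):
#     items = find_bii_items(year)
#     print(year, [item.id for item in items])
```

Inspect the assets of each returned item to find the raster link before opening it. The asset names are not listed in this assignment, so check them in the catalog and do not assume a key.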

To enhance your data exploration, you may use additional shapefiles or rasters to create a map situating the Phoenix subdivision within its broader geographic context.

Setup

First, read all the instructions and rubric for this task. Then, create a new GitHub repository to house the work for this task. Choose an adequate, informative title for your repository (not EDS220-final-project). Your repository should have the following structure:

repo-name
│   README.md
|   notebooks
|   .gitignore

└── data  # If needed
    │   data-files  

In particular, there should be no .DS_Store or .ipynb_checkpoints files in your repository.

📋 Don’t forget to read the rubric for this task before you start working on this task!

Instructions

1. Data analysis

  1. Explore the data and write a brief summary of the information you obtained from the preliminary exploration.

  2. Create a map showing the Phoenix subdivision within its broader geographic context. You may use any vector or raster datasets to create your map. Be sure to include citations or descriptions for these datasets at the top of your notebook too. You may also want to check out the contextily package to add a base map.

  3. Calculate the percentage of area of the Phoenix subdivision with a BII of at least 0.75 in 2017. Obtain the same calculation for 2020. Before you start coding, take a moment to write step-by-step instructions for yourself about how to get this result. You don’t need to include these in your notebook, but you should have a plan before starting your code.

  • Let x be an xarray.DataArray. The expression x>n returns a boolean DataArray that is True wherever the value is greater than n.
  • Make sure you are calculating the percentage over the Phoenix area, and not the complete raster extent.
  4. Create a visualization showing the area with BII>=0.75 in 2017 that was lost by 2020. Here’s an example:

  • To find which pixels changed value from 2017 to 2020, think about the following example. Which values in R3 represent areas that had BII>=0.75 in 2017 but not in 2020?

  • You can plot multiple rasters in the same figure. NaN values will be transparent.
  5. Under your BII visualization, write a brief description of the results you obtained in this task.
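The logic behind steps 3 and 4 can be sketched on a tiny made-up grid. This is not the expected solution. The arrays below are invented stand-ins for the two rasters after clipping to the Phoenix polygon (NaN marks pixels outside it), the function names are hypothetical, and counting pixels assumes every pixel covers roughly the same area. Plain NumPy arrays are used to keep the example self-contained; the same comparison operators work on an xarray.DataArray.

```python
import numpy as np

THRESHOLD = 0.75


def percent_at_or_above(bii, threshold=THRESHOLD):
    """Percentage of valid (non-NaN) pixels with BII >= threshold."""
    valid = ~np.isnan(bii)
    # Comparisons with NaN are False, so pixels outside the polygon never count.
    high = bii >= threshold
    return 100 * high.sum() / valid.sum()


def lost_mask(bii_start, bii_end, threshold=THRESHOLD):
    """1.0 where BII >= threshold at the start but below it at the end, NaN elsewhere."""
    lost = (bii_start >= threshold) & (bii_end < threshold)
    return np.where(lost, 1.0, np.nan)


# Invented values standing in for the clipped 2017 and 2020 rasters.
bii_2017 = np.array([[np.nan, 0.80, 0.90],
                     [0.60, 0.76, 0.70]])
bii_2020 = np.array([[np.nan, 0.80, 0.70],
                     [0.60, 0.74, 0.70]])

print(percent_at_or_above(bii_2017))  # 60.0
print(percent_at_or_above(bii_2020))  # 20.0
print(lost_mask(bii_2017, bii_2020))
```

Dividing by the number of valid pixels, not by the size of the whole array, is what keeps the percentage relative to the Phoenix area. Because the mask is NaN everywhere except the lost pixels, it can be drawn on top of another raster without hiding it.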

2. Clean notebook

The target audience for your notebook is a fellow EDS 220 student who is just learning geospatial wrangling using Python.

  • Add enough and appropriate comments to explain your code.
  • First cell in the notebook must be a markdown cell including:
  • At the top of the notebook, include an “About” section with the following subsections:
  • The rest of your notebook should have:

3. Clean repository

Same as the instructions for assignment 4. Update your repository’s README with (at least) the following (based on EDS 296):

Ready to submit your answers? Make sure your submission follows the checklist at the top of the assignment!

References

[1]
Z. Levitt and J. Eng, “Where America’s developed areas are growing: ‘Way off into the horizon’,” The Washington Post, Aug. 2021. Available: https://www.washingtonpost.com/nation/interactive/2021/land-development-urban-growth-maps/. [Accessed: Nov. 22, 2024]
[2]
F. Gassert, J. Mazzarello, and S. Hyde, “Global 100m Projections of Biodiversity Intactness for the years 2017-2020 [Technical Whitepaper].” Aug. 2022. Available: https://ai4edatasetspublicassets.blob.core.windows.net/assets/pdfs/io-biodiversity/Biodiversity_Intactness_whitepaper.pdf