16 Land cover statistics

About the data

In this lesson we will use two datasets.

The first one is GAP/LANDFIRE National Terrestrial Ecosystems data for 2011 [1], from the US Geological Survey (USGS). This is a categorical raster with a 30 m x 30 m pixel resolution representing highly thematically detailed land cover map of the U.S. We will access this data through the Microsoft Planetary Computer (MPC) data catalog. The class names and corresponding codes have been saved to a separete CSV to simplify access in this lesson. Further information on how to access the classes directly from the data are available in the MPC catalog.

The second dataset is a shapefile with the perimeters for 2017 California fires. This data was extracted from the CALFIRE’s Historical Wildland Fire Perimeters.

Fire perimeter preparation

Let’s start by importing the necessary libraries:

import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import geopandas as gpd
import rioxarray as rioxr
from shapely import box

import pystac_client
import planetary_computer

from IPython.display import Image 

Then import and select the Thomas Fire within the fire perimeters data:

fire_perimeters = gpd.read_file(os.path.join('data',
                                             'California_Fire_Perimeters_2017',
                                             'California_Fire_Perimeters_2017.shp')
                                             )
thomas_fire = fire_perimeters[fire_perimeters['FIRE_NAME']=='THOMAS']

# Examin fire perimeter data
thomas_fire.crs
thomas_fire.plot()

Explore raster

Next, we can go ahead an open the raster:

# Access raster data from item
lulc = rioxr.open_rasterio(item.assets['data'].href)
lulc
<xarray.DataArray (band: 1, y: 10000, x: 10000)>
[100000000 values with dtype=uint16]
Coordinates:
  * band         (band) int64 1
  * x            (x) float64 -2.361e+06 -2.361e+06 ... -2.061e+06 -2.061e+06
  * y            (y) float64 1.762e+06 1.762e+06 ... 1.462e+06 1.462e+06
    spatial_ref  int64 0
Attributes:
    AREA_OR_POINT:           Area
    TIFFTAG_RESOLUTIONUNIT:  2 (pixels/inch)
    TIFFTAG_SOFTWARE:        ERDAS IMAGINE
    TIFFTAG_XRESOLUTION:     1
    TIFFTAG_YRESOLUTION:     1
    scale_factor:            1.0
    add_offset:              0.0

Notice that band is a dimension of length 1. We can go ahead and “squeeze” the raster to simplify it:

# Remove length 1 dimension (band)
lulc = lulc.squeeze().drop_vars('band')
print("Sizes of dimensions:", dict(lulc.sizes))
Sizes of dimensions: {'y': 10000, 'x': 10000}

Next, let’s look at how the raster is locatd with respect to the Thomas Fire perimeter and the CA state boundary:

# Create GeoDataFrame from raster bounding box
lulc_bbox = gpd.GeoDataFrame(geometry = [box(*lulc.rio.bounds())],
                             crs = lulc.rio.crs)

ca = gpd.read_file(os.path.join('data',
                                'ca_state_boundary',   
                                'ca_state_boundary.shp'))

# ------------------------------------------------------------------
# Plot raster boundary, fire perimeter, and CA boundary
fig, ax = plt.subplots()
ca.plot(ax=ax, color='white', edgecolor ='black')

# Reproject lulc_bbox and fire perimeter to match CA crs
lulc_bbox.to_crs(ca.crs).plot(ax=ax, alpha=0.3)  
thomas_fire.to_crs(ca.crs).plot(ax=ax, color='red')

plt.show()

We can see the raster covers a big area relative to the fire perimeter. Since we want to calculate the land coverage statistics within the fire perimeter, we will have to clip the raster to this area.

Clip raster to geometry

In our first lesson about rasters we saw how to clip a raster to a rectangular region. In our case, we want to clip the raster exactly to the fire perimeter. Clipping can be a costly operation for such a big raster relative to a detailed geometry. So we will perform the clipping in two steps:

  1. Clip the raster using the fire perimeter bounding box using rio.clip_box() and then
  2. Clip the simplified raster to the fire perimeter using rio.clip().
# Match CRSs and verify update
thomas_fire_match = thomas_fire.to_crs(lulc.rio.crs)
assert thomas_fire_match.crs == lulc.rio.crs

# Clip large raster to detailed geometry in two steps
lulc_step1 = lulc.rio.clip_box(*thomas_fire_match.total_bounds)
lulc_step2 = lulc_step1.rio.clip(thomas_fire_match.geometry)  # Produces RuntimeWarning

# ------------------------------------------------------
fig, ax = plt.subplots(1, 2, figsize=(8, 4))

# Plot the first clipped raster
lulc_step1.plot(ax=ax[0])
ax[0].set_title("Step 1: Clip to Bounding Box")
ax[0].axis("off")

# Plot the second clipped raster
lulc_step2.plot(ax=ax[1])
ax[1].set_title("Step 2: Clip to Detailed Geometry")
ax[1].axis("off")

plt.show()
/Users/galaz-garcia/opt/anaconda3/envs/mpc-env/lib/python3.11/site-packages/xarray/core/duck_array_ops.py:201: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)

Raster no-data values

Notice a warning appeared when we clipped the raster. After some investigation, we will find that this RuntimeWarning occurs because the rio.clip() function tries to replace values outside the fire perimeter geometry with np.nan. However, as we previously saw, our raster’s data type is uint16 (16 bits unsigned integer). The cast issue appears since np.nan is a float (decimal number) and it cannot be casted as a uint16. To make sure the clipping operator fills in any pixels with the adequate no-data value, let’s manually set it:

print('Original no-data value: ', lulc.rio.nodata)

# Update raster's no-data value
lulc = lulc.rio.write_nodata(0)
print('Updated no-data value: ', lulc.rio.nodata)
Original no-data value:  None
Updated no-data value:  0

This way, the rio.clip() function will know what values to assign to pixels outside the fire perimeter. Let’s try clipping again, this time using method chaining:

lulc_clip = (lulc.rio.clip_box(*thomas_fire_match.total_bounds)
                 .rio.clip(thomas_fire_match.geometry)
                 )

# Examine results
lulc_clip.plot()                                  

Notice no warning came up during the clipping!

Always pay attention to warnings!

Warnings are your program’s way of saying, “Something might go wrong here, take a look!” They may indicate silent failures, package compatibility issues, or potential bugs, amont other issues. Do not ignore warnings! Addressing warnings is part of writing clean, maintainable code and reflects a professional approach.

These are some steps to handling warnings effectively:

  • Read and understand them: Don’t dismiss warnings without understanding their cause.
  • Fix or address them: Modify your code to resolve the warning if possible.
  • Suppress only when necessary: Use tools to suppress warnings only when you’re sure they are irrelevant or benign.

Land cover statistics

In the rest of this lesson we will compute land cover statistics within the Thomas Fire perimeter. The following exercises will guide you through this process:

Exercises
  1. Use the numpy function np.unique() to get the number of pixels per class in lulc_clip. HINT: check the np.unique() documentation to see what the return_counts parameter does and read the last example.

  2. Create a data frame pix_counts with two columns: column one must be the code numbers for the pixels in lulc_clip and column two must be the number of pixels corresponding to each code. HINT: check our class notes on pandas.DataFrames

  1. Use the labels data frame to add the class names to the codes in the pix_counts data frame. Store the resulting data frame as classes.
  1. What area within the fire perimeter (in km^2) was estimated to be developed? HINT: what is the raster’s resolution?
  1. Store the total number of pixels within the fire perimeter as a variable total_pixels.

  2. Add the percentage of area covered by each class as a new column percentage to the classes data frame. Sort the data frame by percentage coverage in descending order.

  1. Create a horizontal bar plot showing the classes with more than 1% land cover in decreasing order. For example:

References

[1]
A. Davidson and A. McKerrow, GAP/LANDFIRE National Terrestrial Ecosystems 2011.” U.S. Geological Survey, 2016. doi: 10.5066/F7ZS2TM0. Available: https://www.sciencebase.gov/catalog/item/573cc51be4b0dae0d5e4b0c5. [Accessed: Nov. 26, 2024]