R zonal statistics for classes

Question

I am trying to extract zonal statistics for different classes in R.
I have a raster with two classes (0,1) and need the area (or percentage) of each class under the area of each polygon. I tried  exactextractr::exact_extract  but I don't manage to extract a frequency of each class. I can get the sum of class 1, but that wont tell me the proportions.

Jeffrey Evans · Answer

You can write a simple function using prop.table and table to return proportions of multiple classes. The catch is that you have to know what all of the classes are before hand so you can fix the number of expected classes.
Here is an example of what is going on.
Here we set our "known" classes and then set up a loop that randomly samples a vector of 1:10 (some values may be missing in a given iteration). We can take the know classes and create an empty factor in x and then calculate our class proportions. If a value is missing then the resulting freq is 0.
classes <- 1:10

p <- list()
  for(i in 1:10) {
    x <- sample(1:10, 10, replace=TRUE)
    p[[i]] <- as.data.frame(prop.table(table(factor(x, levels = classes))))
  }
p

Now we can expand this idea to zonal statistics using exact_extract.
Add libraries and create some example data
library(raster)
library(sp)  
library(sf)                                                                          
library(rgeos)
library(exactextractr)

r <- raster(nrows=180, ncols=360, xmn=571823.6, xmx=616763.6, ymn=4423540, 
            ymx=4453690, resolution=270, crs = CRS("+proj=utm +zone=12 +datum=NAD83 
            +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))

r[] <- rpois(ncell(r), lambda=1)

x <- gBuffer(sampleRandom(r, 10, na.rm = TRUE, sp = TRUE),  
             byid = TRUE, width = 1000) 
x@data <- data.frame(x@data, ID=paste0("poly", 1:nrow(x)))

plot(r)
  plot(x, add=TRUE)

Now we extract the data and use lapply to apply a function to the resulting list object. We create the known classes by using unique on the raster object. Because you have to read the raster into memory, this could be a real processing bottleneck though.
( e <- exact_extract(r, as(x, "sf")) )

classes <- sort(unique(r[]))
cp <- lapply(e, FUN=function(x) { as.data.frame(prop.table(table(factor(x[,1], 
                          levels = classes))))} ) 
names(cp) <- x$ID   
cp

You can perform some fancy data wrangling to get a data.frame that relates back to your polygons using a simple for loop with transpose. I set up an empty data.frame first so I can populate it using a simple assignment.
props <- data.frame(matrix(vector(), length(cp), length(classes)+1,
                    dimnames=list(c(), c("ID", paste0("class_",classes)))))
  props$ID <- names(cp)  
for(i in 1:length(cp)){ props[i,][2:ncol(props)] <- t(cp[[i]][,2]) }
props

dbaston · Answer

You can get the frequency of a single class by passing a custom summary function to exact_extract. For example, to get the fraction of pixels that have a value of 1, you could run:
exact_extract(rast, polys, function(value, fraction) {
     sum(fraction[value == 1]) / sum(fraction)
  })

If you have an arbitrary number of classes, here's a solution that will provide the frequencies of each class in each polygon. It does not require knowing the classes in advance, nor does it require loading all intersected pixels into memory.
library(dplyr)
library(tidyr)

freqs <- exact_extract(r, g, function(value, coverage_fraction) {
  data.frame(value = value,
             frac = coverage_fraction / sum(coverage_fraction)) %>%
    group_by(value) %>%
    summarize(freq = sum(frac), .groups = 'drop') %>%
    pivot_wider(names_from = 'value',
                names_prefix = 'freq_',
                values_from = 'freq')
}) %>% 
  mutate(across(starts_with('freq'), replace_na, 0))

Basically, we provide a function to exact_extract that returns a one-row data frame for each polygon, with a column containing the frequency of each class found in that polygon. Doing this with a callback (i.e., specifying the fun argument) is important, because otherwise R must store every pixel that intersects every polygon in memory at the same time. With the callback, these pixels are reduced to a frequency table as each polygon is processed. Internally, exact_extract uses dplyr::bind_rows to merge the data frames for each polygon, which handles the fact that not all classes are present in each data frame. However, it fills in NA for the frequency of missing classes, so we use a final mutate call to replace these with zero.

R zonal statistics for classes

2 Answers

Add your own answers!

Ask a Question