Geographic Information Systems Asked on November 12, 2021
I am trying to extract zonal statistics for different classes in R.
I have a raster with two classes (0,1) and need the area (or percentage) of each class under the area of each polygon. I tried exactextractr::exact_extract
but I don’t manage to extract a frequency of each class. I can get the sum of class 1, but that wont tell me the proportions.
You can write a simple function using prop.table
and table
to return proportions of multiple classes. The catch is that you have to know what all of the classes are before hand so you can fix the number of expected classes.
Here is an example of what is going on.
Here we set our "known" classes and then set up a loop that randomly samples a vector of 1:10 (some values may be missing in a given iteration). We can take the know classes and create an empty factor in x and then calculate our class proportions. If a value is missing then the resulting freq is 0.
classes <- 1:10
p <- list()
for(i in 1:10) {
x <- sample(1:10, 10, replace=TRUE)
p[[i]] <- as.data.frame(prop.table(table(factor(x, levels = classes))))
}
p
Now we can expand this idea to zonal statistics using exact_extract
.
Add libraries and create some example data
library(raster)
library(sp)
library(sf)
library(rgeos)
library(exactextractr)
r <- raster(nrows=180, ncols=360, xmn=571823.6, xmx=616763.6, ymn=4423540,
ymx=4453690, resolution=270, crs = CRS("+proj=utm +zone=12 +datum=NAD83
+units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))
r[] <- rpois(ncell(r), lambda=1)
x <- gBuffer(sampleRandom(r, 10, na.rm = TRUE, sp = TRUE),
byid = TRUE, width = 1000)
x@data <- data.frame(x@data, ID=paste0("poly", 1:nrow(x)))
plot(r)
plot(x, add=TRUE)
Now we extract the data and use lapply
to apply a function to the resulting list object. We create the known classes by using unique on the raster object. Because you have to read the raster into memory, this could be a real processing bottleneck though.
( e <- exact_extract(r, as(x, "sf")) )
classes <- sort(unique(r[]))
cp <- lapply(e, FUN=function(x) { as.data.frame(prop.table(table(factor(x[,1],
levels = classes))))} )
names(cp) <- x$ID
cp
You can perform some fancy data wrangling to get a data.frame that relates back to your polygons using a simple for loop with transpose. I set up an empty data.frame first so I can populate it using a simple assignment.
props <- data.frame(matrix(vector(), length(cp), length(classes)+1,
dimnames=list(c(), c("ID", paste0("class_",classes)))))
props$ID <- names(cp)
for(i in 1:length(cp)){ props[i,][2:ncol(props)] <- t(cp[[i]][,2]) }
props
Answered by Jeffrey Evans on November 12, 2021
You can get the frequency of a single class by passing a custom summary function to exact_extract
. For example, to get the fraction of pixels that have a value of 1
, you could run:
exact_extract(rast, polys, function(value, fraction) {
sum(fraction[value == 1]) / sum(fraction)
})
If you have an arbitrary number of classes, here's a solution that will provide the frequencies of each class in each polygon. It does not require knowing the classes in advance, nor does it require loading all intersected pixels into memory.
library(dplyr)
library(tidyr)
freqs <- exact_extract(r, g, function(value, coverage_fraction) {
data.frame(value = value,
frac = coverage_fraction / sum(coverage_fraction)) %>%
group_by(value) %>%
summarize(freq = sum(frac), .groups = 'drop') %>%
pivot_wider(names_from = 'value',
names_prefix = 'freq_',
values_from = 'freq')
}) %>%
mutate(across(starts_with('freq'), replace_na, 0))
Basically, we provide a function to exact_extract
that returns a one-row data frame for each polygon, with a column containing the frequency of each class found in that polygon. Doing this with a callback (i.e., specifying the fun
argument) is important, because otherwise R must store every pixel that intersects every polygon in memory at the same time. With the callback, these pixels are reduced to a frequency table as each polygon is processed. Internally, exact_extract
uses dplyr::bind_rows
to merge the data frames for each polygon, which handles the fact that not all classes are present in each data frame. However, it fills in NA
for the frequency of missing classes, so we use a final mutate
call to replace these with zero.
Answered by dbaston on November 12, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP