Asked by Neal Barsch on April 9, 2021
I’m wondering whether I have maximized the speed at which the mean of a raster over a buffered area around a point can be extracted.
The fastest way I have found to extract from a raster, using the pre-cropping suggestion by @dbaston: if you already have the velox raster (even if you have to buffer the shape dynamically), this is lightning fast:
test7_LIGHTNING <- system.time(mclapply(seq_along(x1), function(x){
  q <- vras$copy()  # crop a copy so the shared velox raster is not modified in place
  q$crop(poly)
  q$extract(poly, fun=function(t) mean(t, na.rm=TRUE))
}))
> test7_LIGHTNING
user system elapsed
0.001 0.005 0.355
These are still quick if you need to load the velox raster dynamically (simulating loading different rasters across a set; here the same raster is loaded repeatedly from the reproducible sample data below):
test8 <- system.time(mclapply(seq_along(x1), function(x){
  ras <- velox("testras_so.tif")
  ras$crop(poly)
  ras$extract(poly, fun=function(t) mean(t, na.rm=TRUE))
}))
test9 <- system.time(mclapply(seq_along(x1), function(x){
  ras <- velox("testras_so.tif")
  buf <- st_buffer(st_sfc(st_point(c(x1[x], y1[x]), dim="XY"), crs=32651), 200000)
  ras$crop(buf)
  ras$extract(buf, fun=function(t) mean(t, na.rm=TRUE))  # extract the same buffer that was cropped to
}))
> test8
user system elapsed
0.011 0.016 4.450
> test9
user system elapsed
0.006 0.012 4.333
ORIGINAL QUESTION:
Can performance be improved any further on these LOCALLY? I already use parallel mclapply, and I know I could get further gains by setting up and running this on a cluster ("use a cluster" or "get more CPUs" is not the answer I'm looking for).
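One aside worth flagging (my addition, not from the original post): mclapply only uses getOption("mc.cores", 2L) workers unless told otherwise, so it may be worth setting mc.cores explicitly. A minimal sketch, assuming the x1, poly, and vras objects defined below:
library(parallel)
n_cores <- max(1L, detectCores() - 1L)   # leave one core free for the OS
res <- mclapply(seq_along(x1), function(x) {
  vras$extract(poly, fun = function(t) mean(t, na.rm = TRUE))
}, mc.cores = n_cores)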
Replicate some data:
library(raster)
library(parallel)
library(truncnorm)
library(gdalUtils)
library(velox)
library(sf)
ras <- raster(ncol=1000, nrow=1000, xmn=2001476, xmx=11519096, ymn=9087279, ymx=17080719)
ras[] <- rtruncnorm(n = ncell(ras), a = 0, b = 10, mean = 5, sd = 2)
crs(ras) <- "+proj=utm +zone=51 +ellps=WGS84"
writeRaster(ras,"testras_so.tif", format="GTiff")
gdalbuildvrt(gdalfile = "testras_so.tif",
output.vrt = "testvrt_so.vrt")
x1 <- runif(100,2001476,11519096)
y1 <- runif(100, 9087279,17080719)
poly <- st_buffer(st_sfc(st_point(c(x1[1],y1[1]), dim="XY"),crs=32651),200000)
vras <- velox("testvrt_so.vrt")
###########
Tests:
- test1: have poly and the velox raster already
- test2: have to generate the buffer, but have the velox raster
- test3: have to generate the velox raster from the VRT (simulating having different rasters), but have the buffer
- test4: have to generate both (from the VRT)
- test5: generate the velox raster from the TIF, but have the buffer
- test6: generate both (TIF version)
#Test time if have poly and velox raster
test1 <- system.time(mclapply(seq_along(x1), function(x){
vras$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate buffer but have velox raster
test2 <- system.time(mclapply(seq_along(x1), function(x){
vras$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from the VRT (simulating having different rasters) but having the buffer
test3 <- system.time(mclapply(seq_along(x1), function(x){
velox("testvrt_so.vrt")$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from the VRT AND generate buffer (simulating a list of rasters with different buffers each)
test4 <- system.time(mclapply(seq_along(x1), function(x){
velox("testvrt_so.vrt")$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from TIF (simulating having different rasters) but having the buffer
test5 <- system.time(mclapply(seq_along(x1), function(x){
velox("testras_so.tif")$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from TIF AND generate buffer (simulating a list of rasters with different buffers each)
test6 <- system.time(mclapply(seq_along(x1), function(x){
velox("testras_so.tif")$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
My results (yours will vary with the number of cores, since mclapply runs in parallel):
> test1
user system elapsed
0.007 0.022 3.501
> test2
user system elapsed
0.008 0.025 3.851
> test3
user system elapsed
0.018 0.036 10.546
> test4
user system elapsed
0.020 0.042 10.754
> test5
user system elapsed
0.015 0.034 9.143
> test6
user system elapsed
0.015 0.033 9.364
Can anybody make any suggestions to make this faster or have I maxed local speed here?
I get much faster results with velox if I crop the raster before running extract, e.g.:
r <- velox("testras_so.tif")
r$crop(poly)
r$extract(poly)
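One caveat (my addition, based on the $copy() call in the benchmark below): $crop() modifies the velox object in place, so if the same velox raster is reused across many polygons, crop a copy instead. A minimal sketch:
q <- r$copy()   # keep the original velox raster intact
q$crop(poly)
q$extract(poly, fun = function(t) mean(t, na.rm = TRUE))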
I've also been working on a package with an optimized extract function that may be of interest:
library(exactextractr)
exact_extract(ras, poly) # get a matrix with weights and values,
# as with raster::extract(, weights=TRUE)
exact_extract(ras, poly, weighted.mean) # pass weights and values to an R function
exact_extract(ras, poly, 'mean') # use a built-in stat
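As an illustration (my addition), exact_extract passes the cell values and their coverage fractions to a user-supplied R function, so an NA-safe coverage-weighted mean can be written as:
exact_extract(ras, poly, function(vals, cov) weighted.mean(vals, cov, na.rm = TRUE))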
For this case, runtimes look comparable to velox, but exact_extract takes partially covered cells into account (comparable to weights=TRUE with raster::extract).
library(microbenchmark)
v <- velox(ras)
microbenchmark(
v$extract(poly),
{q <- v$copy(); q$crop(poly); q$extract(poly)},
exact_extract(ras, poly)
)
# Unit: milliseconds
# expr min lq mean median uq max neval
# v$extract(poly) 38.937180 39.749301 41.703686 40.247412 41.217893 56.372611 100
# { q <- v$copy()... 2.510894 2.639588 2.769510 2.725196 2.844109 4.850091 100
# exact_extract(ras, poly) 1.818466 2.039748 3.193071 2.255309 2.382927 35.236612 100
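A closing sketch (my addition, not from the original answer): exact_extract also accepts an sf object with many features, so all 100 buffers from the question can be summarized in one call instead of one mclapply iteration per point:
pts   <- st_as_sf(data.frame(x = x1, y = y1), coords = c("x", "y"), crs = 32651)
polys <- st_buffer(pts, 200000)             # one 200 km buffer per point
means <- exact_extract(ras, polys, 'mean')  # numeric vector, one mean per buffer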
Correct answer by dbaston on April 9, 2021