Geographic Information Systems Asked by Danple on August 8, 2020
I’m working with a dataset of islands in which I need to calculate the distance between each island and the nearest landmass with an area greater than that island. I can do this successfully with a for loop, but this takes too long and I want to rewrite the code so that it can eventually work with parallel processing and apply functions.
Here is example code for the for loop:
library(rgdal)
library(tidyverse)
library(sf)
library(spData)
alaska <- alaska
#this dissolves all features to remove internal divisions and
#then converts multi-part polygons to single part. This
#effectively creates a separate feature for each isolated
#geometry (i.e. island)
alaska <- alaska %>%
st_buffer(0) %>%
summarise(area = AREA) %>%
st_cast("POLYGON")
#recalculate area for each island
alaska$area <- alaska %>%
st_area() %>%
units::set_units(value = km^2) %>%
as.numeric()
#spatial object with Alaska landmass still included
alaska_all <- alaska
#spatial object with Alaska landmass removed, leaving only islands
alaska_islands <- alaska[!alaska$area == sort(alaska$area, decreasing = T)[1],]
for (i in 1:nrow(alaska_islands)){
d <- st_distance(alaska_islands[i,],
alaska_all[alaska_all$area > alaska_islands$area[i],],
by_element = T)
alaska_islands$nn[i] <- sort(unique(d))[1]
}
What I’m having trouble figuring out is how to keep the alaska_islands$area[i],]
iteration as part of the function. I’ve tried apply, lapply, and mapply and have had limited success. The most successful thus far has been:
apply(st_distance(alaska_islands, alaska_all[alaska_all$area > alaska_islands$area[i],]), 1, min)
but it seems to ignore the iterator. Some islands return the correct values (i.e. same as from the for loop) while others simply return 0.
The usual trick to turning a loop over i
into an apply
style loop is to apply over i
and write a function that does the job for argument i
. For example:
sapply(1:nrow(alaska_islands),
function(i){
min(
st_distance(
alaska_islands[i,],
alaska_all[alaska_all$area > alaska_islands$area[i],])
)
}
)
gives:
[1] 59657.8767 25262.5830 5546.5902 51010.3289 18465.6949 4590.8588
[7] 5137.2093 5379.8038 3629.5807 7366.8446 190.1415 5194.3717
[13] 4258.3425 7284.6550 18060.4840 43035.8838 7747.8683 28012.9886
[19] 8728.0644 176939.7554 33350.3792 51818.7938 2178.2563 343380.1486
[25] 253544.2871 2653.4898 18492.8536 14964.9450 8390.1484 15273.4045
[31] 250720.1601 7470.6006 8685.0207 3328.7401 11072.0120 16976.9256
[37] 7945.7566 6188.7005 38892.5618 158952.4960 12986.4244 3583.9433
[43] 4802.9676 54793.4105 3607.2833 90103.9748 14677.1710 21871.5170
[49] 564263.9674
which is the same as your $nn
values from the loop.
That said, it might be easier to make his parallel using foreach
from the foreach
package. But if you are happy with the parallel versions of apply
, then this should get you on your way.
Correct answer by Spacedman on August 8, 2020
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP