Geographic Information Systems Asked on November 26, 2021
I have a DataFrame like this:
import numpy as np
import pandas as pd
import descartes
from shapely.geometry import Point, Polygon
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.DataFrame({'Address': ['280 Broadway', '1 Liberty Island', '141 John Street'],
                   'Latitude': [40.71, 40.69, 40.71],
                   'Longitude': [-74.01, -74.05, -74.00]
                   })
# build a point geometry from each (longitude, latitude) pair
geometry = [Point(xy) for xy in zip(df["Longitude"], df["Latitude"])]
crs = 'EPSG:4326'  # WGS 84; the {'init': 'epsg:4326'} form is deprecated
df = gpd.GeoDataFrame(df,
                      crs=crs,
                      geometry=geometry)
df.head()
I converted the latitude and longitude to point geometries, and I am trying to find the closest point(s) to each address. For example, for 280 Broadway I want all the closest points that lie next to it within one block. There could be more than one such point if several adjacent points are contained in the same polygon.
This was my approach, but it didn't give me what I wanted:
df.insert(4, 'nearest_geometry', None)
from shapely.geometry import Point, MultiPoint
from shapely.ops import nearest_points
for index, row in df.iterrows():
    point = row.geometry
    # all other points, merged into a single multipoint geometry
    multipoint = df.drop(index, axis=0).geometry.unary_union
    queried_geom, nearest_geom = nearest_points(point, multipoint)
    df.loc[index, 'nearest_geometry'] = nearest_geom
Desired Output:
Address       Lat    Lon     geometry                    nearest_points
280 Broadway  40.71  -74.01  POINT (-74.01000 40.71000)  POINT(NEAREST GEOMETRIC POINT)
If you are working with latitudes and longitudes, I'd suggest you work with the haversine formula, which gives the great-circle distance between two points on a sphere. To return the k nearest neighbors, you could go for something like this:
import numpy as np
from sklearn.neighbors import BallTree
k = 2  # number of neighbours wanted per address (example value; must be smaller than the number of rows)
# the haversine metric requires radians instead of degrees, in (lat, lon) order
dataframe[["lat_rad", "lon_rad"]] = np.deg2rad(dataframe[["Latitude", "Longitude"]])
ball_tree = BallTree(dataframe[["lat_rad", "lon_rad"]], metric="haversine")
neighbors = ball_tree.query(
    dataframe[["lat_rad", "lon_rad"]],
    k=k + 1,  # k + 1 because each point is its own nearest neighbour and is removed below
    return_distance=False,  # set to True to also get the distances (in radians)
    sort_results=True,
)
# remove the address/point itself from the array because it is its own nearest neighbour
neighbors = neighbors[:, 1:]
# select the nearest addresses by position index
dataframe["nearest_addresses"] = [
    dataframe["Address"].iloc[n].to_list() for n in neighbors
]
dataframe.explode("nearest_addresses")[["Address", "nearest_addresses"]]
with dataframe being a pandas DataFrame.
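For reference, here is a minimal plain-Python sketch of the haversine distance itself, applied by brute force to the three addresses from the question. It is only an illustration of the formula, not what BallTree does internally; the names `haversine_km`, `nearest_address` and the Earth radius constant are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (address, latitude, longitude) taken from the question
addresses = [
    ("280 Broadway", 40.71, -74.01),
    ("1 Liberty Island", 40.69, -74.05),
    ("141 John Street", 40.71, -74.00),
]


def nearest_address(i):
    """Name of the closest other address to addresses[i] (brute force)."""
    _, lat, lon = addresses[i]
    others = [a for j, a in enumerate(addresses) if j != i]
    return min(others, key=lambda a: haversine_km(lat, lon, a[1], a[2]))[0]


nearest = {name: nearest_address(i) for i, (name, _, _) in enumerate(addresses)}
print(nearest)
```

Brute force compares every pair, which is fine for a handful of points; for larger tables a tree structure such as the BallTree above avoids that cost.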
Answered by 00schneider on November 26, 2021
Here is a method using scipy.spatial's KDTree to find the k nearest neighbors of each point. I have set k=2 because each point's nearest neighbor is itself. The result, neighs, is an array of indexes; for example, neighs[0] = [0, j], where j is the index of the nearest neighbor of the point at index 0. I then slice this array to keep only the nearest neighbor, look up the corresponding points, and add them to the df as a new column.
from scipy import spatial
# get list of points as [x, y] = [longitude, latitude]
points = df['geometry'].apply(
    lambda g: [g.x, g.y]).tolist()
# spatially organise the points in a tree for quick nearest-neighbor lookups
kdtree = spatial.KDTree(points)
# find the two nearest neighbors of each point (the first one is the point itself)
_, neighs = kdtree.query(points, k=2)
# remove itself as neighbor
neighs = neighs[:, 1]
# add column to df with the geometry of each point's nearest neighbor
df['nearest_points'] = df['geometry'].iloc[neighs].tolist()
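One caveat: KDTree measures straight-line distance in whatever units the coordinates use, and here those are degrees. At New York's latitude a degree of longitude is noticeably shorter than a degree of latitude, so a rough correction is to scale longitude by the cosine of the mean latitude before building the tree. The sketch below shows that scaling in plain Python with a brute-force search; `scaled` and `nearest_index` are names made up for this illustration, and the scaling is an approximation that only suits small areas.

```python
import math

# (longitude, latitude) in degrees, same order as the points list above
points_deg = [(-74.01, 40.71), (-74.05, 40.69), (-74.00, 40.71)]

# mean latitude of the data set, in radians
mean_lat = math.radians(sum(lat for _, lat in points_deg) / len(points_deg))

# shrink longitude by cos(latitude) so both axes are roughly comparable
scaled = [(lon * math.cos(mean_lat), lat) for lon, lat in points_deg]


def nearest_index(i, pts):
    """Index of the point closest to pts[i], excluding i itself."""
    candidates = (j for j in range(len(pts)) if j != i)
    return min(candidates, key=lambda j: math.dist(pts[i], pts[j]))


nearest_scaled = [nearest_index(i, scaled) for i in range(len(scaled))]
print(nearest_scaled)
```

A list like `scaled` could be passed to spatial.KDTree in place of the raw degree values. For these three addresses the neighbors come out the same either way, but that will not hold for every data set.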
Answered by Dom McEwen on November 26, 2021
I don't know about geopandas specifically, but I would use shapely's STRtree for this task. It has a nearest method:
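from shapely.geometry import Point
from shapely.strtree import STRtree
points = [
    Point(1, 1),
    Point(2, 2),
    Point(3, 3)
]
tree = STRtree(points)
# Note: since Shapely 2.0, nearest() returns the index of the nearest geometry
# rather than the geometry itself, so use points[tree.nearest(...)].wkt there.
print(tree.nearest(Point(0, 0)).wkt)
print(tree.nearest(Point(5, 5)).wkt)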
This will yield
POINT (1 1)
POINT (3 3)
Answered by carusot42 on November 26, 2021