Geographic Information Systems Asked by Gregor D on March 31, 2021
I have written a little script to raster to GeoPandas dataframe:
with rasterio.open("INPUT FILE", 'r') as raster:
df = pd.DataFrame()
puntos_calados_combinados=[]
puntos=[]
xmin,ymax=raster.transform* (0, 0)
arr = raster.read(1) # read all raster values
for row in range(0,raster.width ):
for col in range(0,raster.height):
x=xmin+ (0.5*raster.transform[0] + row*raster.transform[0])
y=ymax+ (0.5*raster.transform[4] + col*raster.transform[4])
valor= arr[col,row]
if arr[col,row]>0:
x_y=[x,y]
puntos_calados_combinados.append(x_y)
But it’s extremely slow, any idea to improve it?
Rather than looping through each pixel, you could try creating numpy arrays of X & Y coordinates then dumping those (plus the array of the raster values) into a DataFrame, then converting to a GeoDataFrame.
Below is a runnable example (Note, you do not want to do this with large rasters, you can run out of memory...):
import geopandas as gpd
import numpy as np
import pandas as pd
import rasterio as rio
with rio.Env():
with rio.open('https://github.com/OSGeo/gdal/raw/master/autotest/gdrivers/data/float32.tif') as src:
crs = src.crs
# create 1D coordinate arrays (coordinates of the pixel center)
xmin, ymax = np.around(src.xy(0.00, 0.00), 9) # src.xy(0, 0)
xmax, ymin = np.around(src.xy(src.height-1, src.width-1), 9) # src.xy(src.width-1, src.height-1)
x = np.linspace(xmin, xmax, src.width)
y = np.linspace(ymax, ymin, src.height) # max -> min so coords are top -> bottom
# create 2D arrays
xs, ys = np.meshgrid(x, y)
zs = src.read(1)
# Apply NoData mask
mask = src.read_masks(1) > 0
xs, ys, zs = xs[mask], ys[mask], zs[mask]
data = {"X": pd.Series(xs.ravel()),
"Y": pd.Series(ys.ravel()),
"Z": pd.Series(zs.ravel())}
df = pd.DataFrame(data=data)
geometry = gpd.points_from_xy(df.X, df.Y)
gdf = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)
print(gdf.head())
Output:
X Y Z geometry
0 440750.0 3751290.0 107.0 POINT (440750.000 3751290.000)
1 440810.0 3751290.0 123.0 POINT (440810.000 3751290.000)
2 440870.0 3751290.0 132.0 POINT (440870.000 3751290.000)
3 440930.0 3751290.0 115.0 POINT (440930.000 3751290.000)
4 440990.0 3751290.0 132.0 POINT (440990.000 3751290.000)
Correct answer by user2856 on March 31, 2021
Based on: https://gis.stackexchange.com/a/358057/144357
rds = rioxarray.open_rasterio("input.tif")
rds.name = "data"
df = rds.squeeze().to_dataframe().reset_index()
geometry = gpd.points_from_xy(df.x, df.y)
gdf = gpd.GeoDataFrame(df, crs=rds.rio.crs, geometry=geometry)
Answered by snowman2 on March 31, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP