TransWikia.com

CSV containing GeoJSON format geometries to DataFrame

Geographic Information Systems Asked by Basile on February 17, 2021

I have a CSV file that I open as a DataFrame.
One of its columns is named geom 1:

df.loc[:, 'geom 1']
>>>
0       {'type': 'Polygon', 'coordinates': [[[2.826589...
1       {'type': 'Polygon', 'coordinates': [[[2.225689...
2       {'type': 'Polygon', 'coordinates': [[[2.225689...
3       {'type': 'MultiPolygon', 'coordinates': [[[[5....
4       {'type': 'Polygon', 'coordinates': [[[3.933055...
                              ...                        
2998                                                 None
2999                                                  NaN
3000    {'type': 'Polygon', 'coordinates': [[[5.014937...
3001    {'type': 'Polygon', 'coordinates': [[[4.912995...
3002    {'type': 'Polygon', 'coordinates': [[[4.739631...
Name: geom 1, Length: 3003, dtype: object

# Conditioning the DataFrame:
df.replace('None', np.nan, inplace=True)

I would like to convert these strings to geometries in a GeoDataFrame.

After some research I did not find a working solution:

TypeError: Input geometry column must contain valid geometry objects:

df.loc[:, 'geom 1'] = df.loc[:, 'geom 1'].apply(wkt.loads) 
>>> ParseException: Unknown type: '{'TYPE':'
WKTReadingError: Could not create geometry because of errors while reading input.

Convert GeoJSON to GeoPandas GeoDataFrame:

df.replace('None', np.nan, inplace=True)
geom = [shape(i) for i in df.loc[:, 'geom 1'].dropna()]
>>> AttributeError: 'str' object has no attribute 'get'

What did work was sending the DataFrame to PostGIS, converting the column geom 1 from text to geometry, and using the function ST_GeomFromGeoJSON to access the geometry.
Since I have many geom columns, I want to do this in Python so that I can merge all of them into a single column named all_geoms.

One Answer

The shape function from shapely needs a dictionary, so you need to parse the strings to dictionaries before calling shape. If you have these imports

import geopandas as gpd
import json
import pandas as pd
from shapely.geometry import shape

and you define this function

def parse_geom(geom_str):
    try:
        return shape(json.loads(geom_str))
    except (TypeError, AttributeError):  # Handle NaN/None (TypeError) and JSON null (AttributeError)
        return None

then this should give you a GeoDataFrame:

df["geom 1"] = df["geom 1"].apply(parse_geom)
gdf = gpd.GeoDataFrame(df, geometry="geom 1")
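
The sample output in the question shows single-quoted keys, which looks like a Python dict repr rather than strict JSON, and json.loads rejects such text with a JSONDecodeError. If your column really holds strings of that kind, a more tolerant parser could look like the minimal sketch below. It uses only the standard library; parse_geojson_str is a hypothetical helper name, not part of shapely or geopandas.

```python
import ast
import json


def parse_geojson_str(value):
    """Return a GeoJSON-like dict parsed from a string, or None if that fails."""
    if not isinstance(value, str):
        return None  # NaN (a float) and None end up here
    text = value.strip()
    if not text or text == "None":
        return None
    try:
        parsed = json.loads(text)  # strict JSON: double-quoted keys
    except json.JSONDecodeError:
        try:
            parsed = ast.literal_eval(text)  # Python dict repr: single quotes
        except (ValueError, SyntaxError):
            return None
    # Reject anything that parsed but is not a mapping (e.g. JSON null).
    return parsed if isinstance(parsed, dict) else None


print(parse_geojson_str("{'type': 'Point', 'coordinates': [2.8, 48.9]}"))
print(parse_geojson_str('{"type": "Point", "coordinates": [2.8, 48.9]}'))
print(parse_geojson_str(float("nan")))
```

The dict it returns can be passed to shape, for example by calling parse_geojson_str(geom_str) where parse_geom calls json.loads(geom_str); a None result then falls into the AttributeError branch and yields None.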

Worked example:

import geopandas as gpd
import pandas as pd
import json
from shapely.geometry import shape

def parse_geom(geom_str):
    try:
        return shape(json.loads(geom_str))
    except (TypeError, AttributeError):  # Handle NaN and empty strings
        return None

df = pd.read_csv('path/to/your.csv')
df["geom 1"] = df["geom 1"].apply(parse_geom)
gdf = gpd.GeoDataFrame(df, geometry="geom 1")
print(gdf.head())
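
For the goal stated in the question, merging several geom columns into a single all_geoms column, one possible interpretation is to collect the non-missing geometries of each row into a GeoJSON GeometryCollection before calling shape. The sketch below works on plain dicts that have already been parsed; merge_geoms and the second column name geom 2 are hypothetical.

```python
def merge_geoms(geoms):
    """Combine GeoJSON-like dicts into one dict, skipping missing values."""
    present = [g for g in geoms if isinstance(g, dict)]
    if not present:
        return None  # every geom column was empty for this row
    if len(present) == 1:
        return present[0]  # nothing to merge
    return {"type": "GeometryCollection", "geometries": present}


# Two illustrative rows; "geom 2" is an assumed column name.
rows = [
    {"geom 1": {"type": "Point", "coordinates": [2.8, 48.9]},
     "geom 2": {"type": "Point", "coordinates": [5.0, 43.3]}},
    {"geom 1": {"type": "Point", "coordinates": [3.9, 43.6]}, "geom 2": None},
]
all_geoms = [merge_geoms([row["geom 1"], row["geom 2"]]) for row in rows]
print(all_geoms)
```

With a DataFrame of parsed dicts, something like df[geom_cols].apply(lambda row: merge_geoms(row.tolist()), axis=1) should apply it per row (geom_cols being your list of geom column names), and shape accepts the resulting GeometryCollection mapping. If you need a dissolved geometry rather than a collection, that is a different operation and would have to be done on the shapely objects instead.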

Correct answer by Dataform on February 17, 2021
