Geographic Information Systems Asked by Aleksey Bilogur on March 8, 2021
Is it possible to read raw data into a geopandas GeoDataFrame, a la a pandas DataFrame?
For example, the following works:
import io
import pandas as pd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
pd.read_json(io.BytesIO(data.content))
The following does not:
import geopandas as gpd
import io
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.read_file(io.BytesIO(data.content))
In other words, is it possible to read geospatial data that’s in memory without saving that data to disk first?
You can pass the json directly to the GeoDataFrame constructor:
import geopandas as gpd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gdf = gpd.GeoDataFrame(data.json())
gdf.head()
Outputs:
features type
0 {'type': 'Feature', 'geometry': {'type': 'Poin... FeatureCollection
1 {'type': 'Feature', 'geometry': {'type': 'Poin... FeatureCollection
2 {'type': 'Feature', 'geometry': {'type': 'Poin... FeatureCollection
3 {'type': 'Feature', 'geometry': {'type': 'Poin... FeatureCollection
4 {'type': 'Feature', 'geometry': {'type': 'Poin... FeatureCollection
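The two columns above are the top-level keys of the GeoJSON document ("type" and "features"), so each row holds a whole nested Feature rather than its attributes. Here is a minimal sketch, using only the standard library, of how those nested features could be flattened into one record per feature. The two-feature collection and the helper name flatten_features are made up for illustration and are not taken from the NYC dataset.

```python
import json

# Hypothetical stand-in for the parsed response (data.json() in the snippet above).
raw = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-73.99, 40.73]},
     "properties": {"name": "Station A", "line": "4-6"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-74.00, 40.72]},
     "properties": {"name": "Station B", "line": "1-2"}}
  ]
}
"""
collection = json.loads(raw)


def flatten_features(collection):
    """Return one flat dict per feature: its properties plus the raw geometry dict."""
    rows = []
    for feature in collection["features"]:
        row = dict(feature.get("properties") or {})
        row["geometry"] = feature.get("geometry")
        rows.append(row)
    return rows


rows = flatten_features(collection)
print(sorted(collection))  # the only top-level keys: 'features' and 'type'
print(rows[0]["name"], rows[0]["geometry"]["type"])
```

The geometry values in these records are still plain GeoJSON dicts. GeoDataFrame.from_features performs the same kind of unnesting and additionally converts each geometry into a geometry object.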
For supported single-file formats or zipped shapefiles, you can use fiona.BytesCollection and GeoDataFrame.from_features:
import requests
import fiona
import geopandas as gpd
url = 'http://www.geopackage.org/data/gdal_sample.gpkg'
request = requests.get(url)
b = bytes(request.content)
with fiona.BytesCollection(b) as f:
    crs = f.crs
    gdf = gpd.GeoDataFrame.from_features(f, crs=crs)
    print(gdf.head())
And for zipped shapefiles (supported as of fiona 1.7.2):
url = 'https://www2.census.gov/geo/tiger/TIGER2010/STATE/2010/tl_2010_31_state10.zip'
request = requests.get(url)
b = bytes(request.content)
with fiona.BytesCollection(b) as f:
    crs = f.crs
    gdf = gpd.GeoDataFrame.from_features(f, crs=crs)
    print(gdf.head())
You can find out what formats Fiona supports using something like:
import fiona
for name, access in fiona.supported_drivers.items():
    print('{}: {}'.format(name, access))
And a hacky workaround for reading in-memory zipped data in fiona 1.7.1 or earlier:
import requests
import uuid
import fiona
import geopandas as gpd
from osgeo import gdal
request = requests.get('https://github.com/OSGeo/gdal/blob/trunk/autotest/ogr/data/poly.zip?raw=true')
vsiz = '/vsimem/{}.zip'.format(uuid.uuid4().hex)  # gdal/ogr requires a .zip extension
gdal.FileFromMemBuffer(vsiz, bytes(request.content))
with fiona.Collection(vsiz, vsi='zip', layer='poly') as f:
    gdf = gpd.GeoDataFrame.from_features(f, crs=f.crs)
    print(gdf.head())
Correct answer by user2856 on March 8, 2021
Yes, it is possible now with Fiona (see https://github.com/Toblerity/Fiona/issues/409). I'm not sure if this feature is exposed yet in Geopandas.
Answered by sgillies on March 8, 2021
Since fiona.BytesCollection doesn't seem to work for TopoJSON, here is a solution that works for all formats without the need for gdal:
import fiona
import geopandas as gpd
import requests
# parse the topojson file into memory
request = requests.get('https://vega.github.io/vega-datasets/data/us-10m.json')
visz = fiona.ogrext.buffer_to_virtual_file(bytes(request.content))
# read the features from a fiona collection into a GeoDataFrame
with fiona.Collection(visz, driver='TopoJSON') as f:
    gdf = gpd.GeoDataFrame.from_features(f, crs=f.crs)
Answered by Mattijn on March 8, 2021
The easiest way is inputting the GeoJSON URL directly into the gpd.read_file() function. I'd tried extracting a shapefile from a zip before this using BytesIO & zipfile and had issues with gpd (specifically Fiona) accepting file-like objects.
import geopandas as gpd
import David.SQL_pull_by_placename as sql
import os
os.environ['PROJ_LIB'] = r'C:\Users\littlexsparkee\Anaconda3\Library\share\proj'
geojson_url = f'https://github.com/loganpowell/census-geojson/blob/master/GeoJSON/500k/2018/{sql.state}/block-group.json?raw=true'
census_tracts_gdf = gpd.read_file(geojson_url)
Answered by littlexsparkee on March 8, 2021
When using Fiona 1.8, this can (must?) be done using that project's MemoryFile or ZipMemoryFile.
For example:
import fiona.io
import geopandas as gpd
import requests
response = requests.get('http://example.com/Some_shapefile.zip')
data_bytes = response.content
with fiona.io.ZipMemoryFile(data_bytes) as zip_memory_file:
    with zip_memory_file.open('Some_shapefile.shp') as collection:
        geodf = gpd.GeoDataFrame.from_features(collection, crs=collection.crs)
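The name passed to zip_memory_file.open() presumably has to match a member of the archive, which is not always known in advance. As a sketch, the standard-library zipfile module can list the members of the downloaded bytes without touching disk. The archive below is built in memory purely for illustration (its members are empty placeholders, not real shapefile data), and the helper name find_shapefiles is hypothetical.

```python
import io
import zipfile


def find_shapefiles(data_bytes):
    """List the .shp members of a zip archive held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(data_bytes)) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".shp")]


# Build a tiny placeholder archive in memory; in practice data_bytes
# would be response.content from the download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for member in ("Some_shapefile.shp", "Some_shapefile.shx", "Some_shapefile.dbf"):
        archive.writestr(member, b"")
data_bytes = buffer.getvalue()

shapefiles = find_shapefiles(data_bytes)
print(shapefiles)  # ['Some_shapefile.shp']
```

The first entry of that list could then be used as the member name when opening the archive with ZipMemoryFile.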
Answered by esmail on March 8, 2021
I prefer the result obtained by using the undocumented GeoDataFrame.from_features() rather than passing the GeoJSON to the GDF constructor directly:
import geopandas as gpd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.GeoDataFrame.from_features(data.json())
Output
   geometry                    name              url                                line           objectid  notes
0  POINT (-73.99107 40.73005)  Astor Pl          http://web.mta.info/nyct/service/  4-6-6 Express  1         4 nights, 6-all times, 6 Express-weekdays AM s...
1  POINT (-74.00019 40.71880)  Canal St          http://web.mta.info/nyct/service/  4-6-6 Express  2         4 nights, 6-all times, 6 Express-weekdays AM s...
2  POINT (-73.98385 40.76173)  50th St           http://web.mta.info/nyct/service/  1-2            3         1-all times, 2-nights
3  POINT (-73.97500 40.68086)  Bergen St         http://web.mta.info/nyct/service/  2-3-4          4         4-nights, 3-all other times, 2-all times
4  POINT (-73.89489 40.66471)  Pennsylvania Ave  http://web.mta.info/nyct/service/  3-4            5         4-nights, 3-all other times
The resulting GeoDataFrame has the geometry column set correctly and all the columns as I would expect, without needing to unnest any FeatureCollections
Answered by dericke on March 8, 2021
As indicated by @littlexsparkee, geopandas can now read known file formats directly from URLs (this has been possible since version 0.4), e.g.:
import geopandas as gpd
geojson_url = "https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON"
gdf1 = gpd.read_file(geojson_url)
gpkg_url = 'http://www.geopackage.org/data/gdal_sample.gpkg'
gdf2 = gpd.read_file(gpkg_url)
zip_url = 'https://www2.census.gov/geo/tiger/TIGER2010/STATE/2010/tl_2010_31_state10.zip'
gdf3 = gpd.read_file(zip_url)
Since geopandas 0.8, it is also possible to read file-like objects directly. For instance, the example in the question now works:
import geopandas as gpd
import io
import requests
request = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.read_file(io.BytesIO(request.content))
or, similarly, for a geopackage
request = requests.get('http://www.geopackage.org/data/gdal_sample.gpkg')
gpd.read_file(io.BytesIO(request.content))
(I have not managed to reproduce this for shapefiles or zip-files however.)
See the geopandas docs for some more examples.
Answered by onietosi on March 8, 2021