Geographic Information Systems Asked on December 15, 2021
Is it possible to read raw data into a geopandas GeoDataFrame, a la a pandas DataFrame?
For example, the following works:
import io
import pandas as pd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
pd.read_json(io.BytesIO(data.content))
The following does not:
import geopandas as gpd
import io
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.read_file(io.BytesIO(data.content))
In other words, is it possible to read geospatial data that’s in memory without saving that data to disk first?
As indicated by @littlexsparkee, geopandas can now read known file formats directly from URLs (this has been possible since version 0.4), e.g.:
import geopandas as gpd
geojson_url = "https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON"
gdf1 = gpd.read_file(geojson_url)
gpkg_url = 'http://www.geopackage.org/data/gdal_sample.gpkg'
gdf2 = gpd.read_file(gpkg_url)
zip_url = 'https://www2.census.gov/geo/tiger/TIGER2010/STATE/2010/tl_2010_31_state10.zip'
gdf3 = gpd.read_file(zip_url)
Since geopandas 0.8 it is also possible to read file-like objects directly, so the example in the question now works:
import geopandas as gpd
import io
import requests
request = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.read_file(io.BytesIO(request.content))
or, similarly, for a GeoPackage:
request = requests.get('http://www.geopackage.org/data/gdal_sample.gpkg')
gpd.read_file(io.BytesIO(request.content))
(I have not managed to reproduce this for shapefiles or zip-files however.)
See the geopandas docs for some more examples.
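For a quick check that does not depend on the network, the file-like route can be tried with a GeoJSON document built in memory. This is only a minimal sketch, assuming geopandas 0.8 or later with a working I/O backend installed; the two sample features and the variable names are made up for illustration.

```python
import io
import json

import geopandas as gpd

# Hypothetical two-point FeatureCollection standing in for a downloaded payload.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Astor Pl"},
            "geometry": {"type": "Point", "coordinates": [-73.99107, 40.73005]},
        },
        {
            "type": "Feature",
            "properties": {"name": "Canal St"},
            "geometry": {"type": "Point", "coordinates": [-74.00019, 40.71880]},
        },
    ],
}

# Encode to bytes, which is what response.content from requests would hold.
raw_bytes = json.dumps(feature_collection).encode("utf-8")

# Wrap the bytes in a file-like object and read them without touching disk.
gdf = gpd.read_file(io.BytesIO(raw_bytes))
print(gdf[["name", "geometry"]])
```

If this runs in your environment, swapping raw_bytes for the content of a real response should behave the same way for GeoJSON.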
Answered by onietosi on December 15, 2021
I prefer the result obtained by using the undocumented GeoDataFrame.from_features() rather than passing the GeoJSON to the GDF constructor directly:
import geopandas as gpd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gpd.GeoDataFrame.from_features(data.json())
Output:
   geometry                    name              url                                line           objectid  notes
0  POINT (-73.99107 40.73005)  Astor Pl          http://web.mta.info/nyct/service/  4-6-6 Express  1         4 nights, 6-all times, 6 Express-weekdays AM s...
1  POINT (-74.00019 40.71880)  Canal St          http://web.mta.info/nyct/service/  4-6-6 Express  2         4 nights, 6-all times, 6 Express-weekdays AM s...
2  POINT (-73.98385 40.76173)  50th St           http://web.mta.info/nyct/service/  1-2            3         1-all times, 2-nights
3  POINT (-73.97500 40.68086)  Bergen St         http://web.mta.info/nyct/service/  2-3-4          4         4-nights, 3-all other times, 2-all times
4  POINT (-73.89489 40.66471)  Pennsylvania Ave  http://web.mta.info/nyct/service/  3-4            5         4-nights, 3-all other times
The resulting GeoDataFrame has the geometry column set correctly and all the columns as I would expect, without needing to unnest any FeatureCollections.
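One thing to watch with this route: from_features() does not attach a coordinate reference system unless you pass one. The sketch below builds a small FeatureCollection by hand (the sample data and variable names are invented for illustration) and passes crs="EPSG:4326" on the assumption that the payload follows the usual GeoJSON convention of WGS 84 longitude/latitude.

```python
import geopandas as gpd

# Hypothetical FeatureCollection, shaped like what data.json() would return.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Astor Pl", "objectid": "1"},
            "geometry": {"type": "Point", "coordinates": [-73.99107, 40.73005]},
        },
        {
            "type": "Feature",
            "properties": {"name": "Canal St", "objectid": "2"},
            "geometry": {"type": "Point", "coordinates": [-74.00019, 40.71880]},
        },
    ],
}

# Each feature becomes one row: properties turn into columns and the
# geometry dict is converted to a shapely geometry.
gdf = gpd.GeoDataFrame.from_features(feature_collection, crs="EPSG:4326")

print(gdf)
print(gdf.crs)
```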
Answered by dericke on December 15, 2021
When using Fiona 1.8, this can (must?) be done using that project's MemoryFile or ZipMemoryFile.
For example:
import fiona.io
import geopandas as gpd
import requests
response = requests.get('http://example.com/Some_shapefile.zip')
data_bytes = response.content
with fiona.io.ZipMemoryFile(data_bytes) as zip_memory_file:
    with zip_memory_file.open('Some_shapefile.shp') as collection:
        geodf = gpd.GeoDataFrame.from_features(collection, crs=collection.crs)
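The name passed to zip_memory_file.open() has to match a member of the archive, which you may not know in advance for a downloaded zip. As a rough sketch using only the standard library, the members can be listed from the same bytes first. The helper name list_shapefile_members and the stand-in archive are hypothetical, and the archive holds empty placeholders rather than real shapefile data.

```python
import io
import zipfile


def list_shapefile_members(zip_bytes):
    """Return the names of all .shp members in an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".shp")]


# Build a tiny stand-in archive in memory; the member contents are empty
# placeholders, not valid shapefile data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for member in ("Some_shapefile.shp", "Some_shapefile.dbf", "Some_shapefile.shx"):
        archive.writestr(member, b"")

shp_names = list_shapefile_members(buffer.getvalue())
print(shp_names)
```

With a real download, response.content would take the place of buffer.getvalue(), and one of the returned names could then be tried as the argument to zip_memory_file.open().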
Answered by esmail on December 15, 2021
The easiest way is inputting the GeoJSON URL directly into the gpd.read_file() function. I'd tried extracting a shapefile from a zip before this using BytesIO & zipfile and had issues with gpd (specifically Fiona) accepting file-like objects.
import geopandas as gpd
import David.SQL_pull_by_placename as sql
import os
os.environ['PROJ_LIB'] = r'C:\Users\littlexsparkee\Anaconda3\Library\share\proj'
geojson_url = f'https://github.com/loganpowell/census-geojson/blob/master/GeoJSON/500k/2018/{sql.state}/block-group.json?raw=true'
census_tracts_gdf = gpd.read_file(geojson_url)
Answered by littlexsparkee on December 15, 2021
Since fiona.BytesCollection doesn't seem to work for TopoJSON, here is a solution that works for all formats without the need for gdal:
import fiona
import geopandas as gpd
import requests
# parse the topojson file into memory
request = requests.get('https://vega.github.io/vega-datasets/data/us-10m.json')
visz = fiona.ogrext.buffer_to_virtual_file(bytes(request.content))
# read the features from a fiona collection into a GeoDataFrame
with fiona.Collection(visz, driver='TopoJSON') as f:
    gdf = gpd.GeoDataFrame.from_features(f, crs=f.crs)
Answered by Mattijn on December 15, 2021
Yes, it is possible now with Fiona (see https://github.com/Toblerity/Fiona/issues/409). I'm not sure if this feature is exposed yet in Geopandas.
Answered by sgillies on December 15, 2021
You can pass the JSON directly to the GeoDataFrame constructor, although the features stay nested in a single column:
import geopandas as gpd
import requests
data = requests.get("https://data.cityofnewyork.us/api/geospatial/arq3-7z49?method=export&format=GeoJSON")
gdf = gpd.GeoDataFrame(data.json())
gdf.head()
Outputs:
   features                                           type
0  {'type': 'Feature', 'geometry': {'type': 'Poin...  FeatureCollection
1  {'type': 'Feature', 'geometry': {'type': 'Poin...  FeatureCollection
2  {'type': 'Feature', 'geometry': {'type': 'Poin...  FeatureCollection
3  {'type': 'Feature', 'geometry': {'type': 'Poin...  FeatureCollection
4  {'type': 'Feature', 'geometry': {'type': 'Poin...  FeatureCollection
For supported single-file formats or zipped shapefiles, you can use fiona.BytesCollection and GeoDataFrame.from_features:
import requests
import fiona
import geopandas as gpd
url = 'http://www.geopackage.org/data/gdal_sample.gpkg'
request = requests.get(url)
b = bytes(request.content)
with fiona.BytesCollection(b) as f:
    crs = f.crs
    gdf = gpd.GeoDataFrame.from_features(f, crs=crs)
print(gdf.head())
And for zipped shapefiles (supported as of Fiona 1.7.2):
url = 'https://www2.census.gov/geo/tiger/TIGER2010/STATE/2010/tl_2010_31_state10.zip'
request = requests.get(url)
b = bytes(request.content)
with fiona.BytesCollection(b) as f:
    crs = f.crs
    gdf = gpd.GeoDataFrame.from_features(f, crs=crs)
print(gdf.head())
You can find out what formats Fiona supports using something like:
import fiona
for name, access in fiona.supported_drivers.items():
    print('{}: {}'.format(name, access))
And a hacky workaround for reading in-memory zipped data in fiona 1.7.1 or earlier:
import requests
import uuid
import fiona
import geopandas as gpd
from osgeo import gdal
request = requests.get('https://github.com/OSGeo/gdal/blob/trunk/autotest/ogr/data/poly.zip?raw=true')
vsiz = '/vsimem/{}.zip'.format(uuid.uuid4().hex)  # GDAL/OGR requires a .zip extension
gdal.FileFromMemBuffer(vsiz, bytes(request.content))
with fiona.Collection(vsiz, vsi='zip', layer='poly') as f:
    gdf = gpd.GeoDataFrame.from_features(f, crs=f.crs)
print(gdf.head())
Answered by user2856 on December 15, 2021