Stack Overflow Asked by zelusp on February 23, 2021
I’m trying to combine every block file from the 2010 census into a single master block file for the US. I’m currently doing this in Google Colab, and even on their Pro subscription, which gives you about 25 GB of RAM, I’m maxing out all available memory on the 45th file (I just have 5 more to go!). Code-wise, I’m just building a list of dataframes that need to be concatenated together and ultimately written to disk:
import os

import geopandas as gpd
import pandas as pd

gdfs = []
census_blocks_basepath = r'/content/drive/My Drive/Census/blocks/'
census_block_filenames = [f for f in os.listdir(census_blocks_basepath) if f.endswith('.shp')]

# Read every block file into memory; this growing list is what exhausts RAM
for index, block_filename in enumerate(census_block_filenames):
    file_name = os.path.join(census_blocks_basepath, block_filename)
    gdfs.append(gpd.read_file(file_name))
    print('Appended file %s, %s' % (index, block_filename))

# Concatenate everything, keeping the CRS of the first file
gdf = gpd.GeoDataFrame(pd.concat(gdfs, ignore_index=True), crs=gdfs[0].crs)
# gdf.reset_index(inplace=True, drop=True)
gdf.head(3)
Instead, I think I should:
1. Read in one geodataframe
2. Append it to a master file on disk
3. Delete it from memory (to avoid memory accrual)
4. Repeat 1–3 for all geodataframes remaining in the source directory

I don’t see documentation on whether geopandas supports disk-based appends; it only seems able to overwrite previous files via GeoDataFrame.to_file. That said, I see that geopandas has a GeoDataFrame.to_postgis method with a chunksize argument, which makes me think that it’s possible to append data onto a geofile on disk (or I’m wrong and that’s just a feature of postgis).
Any ideas?
Yes, any file format that supports appending (and is supported by fiona) can be appended to. You just have to specify mode="a":
df.to_file(filename, mode="a")
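The read, append, release loop described in the question could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names `list_shapefiles` and `append_all` and the example paths are hypothetical, every input file is assumed to share the same schema and CRS, and the chosen output driver is assumed to support append mode.

```python
import os


def list_shapefiles(directory):
    """Return the sorted full paths of all .shp files in `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".shp")
    )


def append_all(paths, out_path, driver="ESRI Shapefile"):
    """Write each file in `paths` to `out_path`, one file at a time.

    Hypothetical helper: assumes all inputs share one schema and CRS.
    """
    import geopandas as gpd  # imported here so the path helper works without it

    for index, path in enumerate(paths):
        gdf = gpd.read_file(path)
        # Create (or overwrite) the output with the first file, then append.
        mode = "w" if index == 0 else "a"
        gdf.to_file(out_path, driver=driver, mode=mode)
        print("Appended file %s, %s" % (index, os.path.basename(path)))
        del gdf  # drop the reference so only one file is held in memory


# Example usage (both paths are placeholders):
# append_all(list_shapefiles("/path/to/blocks"), "/path/to/output/us_blocks.shp")
```

Writing the first file with mode="w" and the rest with mode="a" avoids depending on whether the driver can append to a file that does not exist yet. Keep the output file outside the input directory so it is not picked up as an input on a later run.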
You can check which modes a driver supports using:
import fiona
fiona.supported_drivers
This is the current result (r = read, a = append, w = write):
{'AeronavFAA': 'r', 'ARCGEN': 'r', 'BNA': 'raw', 'DXF': 'raw', 'CSV': 'raw', 'OpenFileGDB': 'r', 'ESRIJSON': 'r', 'ESRI Shapefile': 'raw', 'GeoJSON': 'rw', 'GeoJSONSeq': 'rw', 'GPKG': 'rw', 'GML': 'raw', 'GPX': 'raw', 'GPSTrackMaker': 'raw', 'Idrisi': 'r', 'MapInfo File': 'raw', 'DGN': 'raw', 'PCIDSK': 'r', 'S57': 'r', 'SEGY': 'r', 'SUA': 'r', 'TopoJSON': 'r'}
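To pick out only the drivers that can be appended to, the mapping can be filtered for the 'a' flag. The snippet below is a small sketch: `drivers_supporting` is a hypothetical helper, and `sample` is a subset copied from the listing above, so the real `fiona.supported_drivers` on your installation may differ.

```python
def drivers_supporting(flag, supported):
    """Return the sorted driver names whose mode string contains `flag`."""
    return sorted(name for name, modes in supported.items() if flag in modes)


# Subset of the mapping shown above; with fiona installed you could pass
# fiona.supported_drivers instead of this sample dict.
sample = {
    "ESRI Shapefile": "raw",
    "GeoJSON": "rw",
    "GPKG": "rw",
    "CSV": "raw",
    "OpenFileGDB": "r",
}

print(drivers_supporting("a", sample))  # ['CSV', 'ESRI Shapefile']
```

In this listing, 'ESRI Shapefile' is marked 'raw', so the shapefiles from the question can be appended to with mode="a".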
Correct answer by zelusp on February 23, 2021