Geographic Information Systems Asked by Olive on December 22, 2020
I am using a Python script and ArcGIS Desktop 10.8.1 to synchronize two datasets. There are thousands of duplicate features that I would like to drop from the output. When a feature is identical across both datasets, how can I specify that I want to keep the data from dmanfile and delete the duplicate from cadfile? I am totally new to Python, but here are the relevant parts of the code I have so far:
#input files from user console
dmanfile = input("./DSchemaFix1/D_Man_OG/D_Man_Fields_Complete.shp")
cadfile = input("./DSchemaFix1/CurrentCAD/CurrentCADFiles.shp")
gdf = gpd.read_file(dmanfile)
cad = gpd.read_file(cadfile)
gdf_appended = cad.append(gdf)
gdf_dupdropped = gdf_appended.drop_duplicates(keep='first', subset=['StreetName','Address', 'Apartment','ZipCode'])
Add a source column to each dataset, sort by it so that the dman rows come first, and then drop duplicates:
import geopandas as gpd
import pandas as pd

dman = gpd.read_file('/home/bera/Desktop/tempgis/dman.shp')
dman['source'] = 'dman'
cad = gpd.read_file('/home/bera/Desktop/tempgis/cadfile.shp')
cad['source'] = 'cad'
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
both = pd.concat([dman, cad], ignore_index=True)
# Descending sort puts 'dman' before 'cad', so keep='first' retains the dman rows
no_dups = both.sort_values(by='source', ascending=False).drop_duplicates(subset=['StreetName','Address', 'Apartment','ZipCode'], keep='first')
no_dups.to_file('/home/bera/Desktop/tempgis/nodups.shp')
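The same idea can be tried without touching any files. What follows is only a minimal sketch, using plain pandas and made-up rows in place of the two shapefiles; the helper name merge_prefer_first is hypothetical. Instead of sorting on the source column, it relies on concatenation order: the preferred table goes first, so keep='first' retains its rows. Since a GeoDataFrame is a subclass of the pandas DataFrame, the same calls should apply to the real data.

```python
import pandas as pd

# Columns that define a duplicate (taken from the question)
KEY_COLUMNS = ["StreetName", "Address", "Apartment", "ZipCode"]


def merge_prefer_first(preferred, other, key_columns=KEY_COLUMNS):
    """Stack two tables; where key columns match, keep the row from `preferred`.

    Hypothetical helper for illustration only.
    """
    # Rows from `preferred` come first, so keep='first' retains them
    stacked = pd.concat([preferred, other], ignore_index=True)
    deduped = stacked.drop_duplicates(subset=key_columns, keep="first")
    return deduped.reset_index(drop=True)


# Made-up rows standing in for the dman and cad shapefiles
dman = pd.DataFrame({
    "StreetName": ["Main St", "Oak Ave"],
    "Address": ["10", "22"],
    "Apartment": ["A", "B"],
    "ZipCode": ["12345", "12345"],
    "source": ["dman", "dman"],
})
cad = pd.DataFrame({
    "StreetName": ["Main St", "Elm Rd"],
    "Address": ["10", "5"],
    "Apartment": ["A", "C"],
    "ZipCode": ["12345", "54321"],
    "source": ["cad", "cad"],
})

result = merge_prefer_first(dman, cad)
print(result[["StreetName", "source"]])
```

Here the "Main St" record exists in both tables, and only the dman copy survives. The source column is not needed for the deduplication itself in this variant; it is kept only so the origin of each remaining row can be checked.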
Answered by BERA on December 22, 2020