Geographic Information Systems Asked on January 8, 2021
I have a layer with a classification of buildings. I want to add an "Area" field to its attribute table using PyQGIS. On another layer with 50 records this works, but my layer is huge (6.8 million records) and the script runs far too slowly. For a different problem I used GeoPandas, which was very fast. My code is below; I would appreciate advice. Can I do this with GeoPandas?
from qgis.core import *
from qgis.utils import *
from qgis.analysis import QgsNativeAlgorithms
from PyQt5.QtCore import QVariant
from qgis.core import QgsApplication, QgsProcessingFeedback, QgsRasterLayer
import os
import sys
import geopandas
sys.path.append('/usr/lib/qgis')
sys.path.append('/usr/share/qgis/python/plugins')
os.environ["QT_QPA_PLATFORM"] = "offscreen"
QgsApplication.setPrefixPath(r'/usr/bin/qgis', True)
qgs = QgsApplication([], False)
qgs.initQgis()
import processing
from processing.core.Processing import Processing
Processing.initialize()
QgsApplication.processingRegistry().addProvider(QgsNativeAlgorithms())
feedback = QgsProcessingFeedback()
#adding the Area field
layer = QgsVectorLayer(r'/home/gis/polskagisencoding.shp', "polskagisencoding", "ogr")
provider = layer.dataProvider()
area_field = QgsField("Area", QVariant.Int)
provider.addAttributes([area_field])
layer.updateFields()
#updating the Area field for each feature
idx = provider.fieldNameIndex('Area')
for feature in layer.getFeatures():
    attrs = {idx : int(feature.geometry().area())}
    layer.dataProvider().changeAttributeValues({feature.id() : attrs})
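One likely contributor to the slowness is that the loop calls changeAttributeValues once per feature, so the provider is asked to write 6.8 million separate times. A minimal sketch of batching those changes into a single mapping is shown here; build_area_updates is a hypothetical helper name, and it assumes only that each feature exposes .id() and .geometry().area(), as QgsFeature does. Whether this alone makes the script fast enough has not been measured.

```python
from typing import Any, Dict, Iterable


def build_area_updates(features: Iterable[Any], field_index: int) -> Dict[int, Dict[int, int]]:
    """Collect {feature_id: {field_index: area}} for every feature.

    Hypothetical helper: each feature only needs .id() and .geometry().area().
    """
    updates: Dict[int, Dict[int, int]] = {}
    for feature in features:
        # Same truncation to int as in the original loop.
        updates[feature.id()] = {field_index: int(feature.geometry().area())}
    return updates


# Intended use inside the script above (not executed here, it needs QGIS):
#   updates = build_area_updates(layer.getFeatures(), idx)
#   provider.changeAttributeValues(updates)  # one provider call instead of one per feature
```

For a layer this large the mapping itself takes a fair amount of memory, so building and submitting it in chunks of, say, a few hundred thousand features may be a reasonable compromise.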
With GeoPandas:
import geopandas as gpd
gdf = gpd.read_file('/home/gis/polskagisencoding.shp')
gdf["Area"] = gdf.geometry.area
But I'm not sure it's faster with larger shapefiles.
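A slightly fuller sketch of the same idea, run on toy data so it works without the shapefile. The two rectangles and the EPSG:2180 CRS are assumptions for illustration only, not taken from the real layer. It also checks that the CRS is projected, since .area is measured in CRS units and is not meaningful in degrees.

```python
import geopandas as gpd
from shapely.geometry import box

# Two toy rectangular "buildings"; EPSG:2180 is an assumed metric CRS
# standing in for whatever CRS the real shapefile uses.
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[box(0, 0, 10, 10), box(0, 0, 20, 20)],
    crs="EPSG:2180",
)

# .area is computed in CRS units, so it only makes sense for a projected CRS.
if gdf.crs is None or not gdf.crs.is_projected:
    raise ValueError("reproject with gdf.to_crs(...) before measuring area")

# Vectorised over the whole column; the integer cast mirrors the
# QVariant.Int field used in the question.
gdf["Area"] = gdf.geometry.area.round().astype("int64")

# For the real data the result could then be written back out, e.g.:
#   gdf.to_file("/path/to/output.shp")  # hypothetical output path
```

Note that this rounds to the nearest integer, whereas int() in the PyQGIS loop truncates; drop the cast to keep the full floating-point area.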
Correct answer by gene on January 8, 2021
For large datasets like this, you may want to use dask-geopandas. It is still under development (no official release yet) but area should work flawlessly.
You should install geopandas, dask and pygeos and then dask-geopandas from git.
pip install git+git://github.com/jsignell/dask-geopandas.git
Then you can read your file with geopandas and convert it to dask.dataframe.
import geopandas
import dask_geopandas
df = geopandas.read_file('/home/gis/polskagisencoding.shp')
ddf = dask_geopandas.from_geopandas(df, npartitions=4)
areas = ddf.geometry.area.compute()
npartitions in this case should be the number of processors you want to use. dask-geopandas then does the computation in parallel.
See more here https://github.com/jsignell/dask-geopandas
Answered by martinfleis on January 8, 2021