TransWikia.com

PyQGIS is working too slow. Can I use GeoPandas?

Geographic Information Systems Asked on January 8, 2021

I have a layer with a classification of buildings. I want to add an "Area" field to its attribute table using PyQGIS. On another layer with 50 records this works, but my layer is huge (6.8 million records) and the script runs too slowly. For a different problem I used GeoPandas, which was very fast. My code is below and I need advice: can I do this with GeoPandas instead?

from qgis.core import *
from qgis.utils import *
from qgis.analysis import QgsNativeAlgorithms
from PyQt5.QtCore import QVariant
from qgis.core import QgsApplication, QgsProcessingFeedback, QgsRasterLayer
import os
import sys
import geopandas


sys.path.append('/usr/lib/qgis')
sys.path.append('/usr/share/qgis/python/plugins')
os.environ["QT_QPA_PLATFORM"] = "offscreen"


QgsApplication.setPrefixPath(r'/usr/bin/qgis', True)
qgs = QgsApplication([], False)
qgs.initQgis()


import processing
from processing.core.Processing import Processing


Processing.initialize()
QgsApplication.processingRegistry().addProvider(QgsNativeAlgorithms())
feedback = QgsProcessingFeedback()


#adding the Area field
layer = QgsVectorLayer(r'/home/gis/polskagisencoding.shp', "polskagisencoding", "ogr")
provider = layer.dataProvider()
area_field = QgsField("Area", QVariant.Int)
provider.addAttributes([area_field])
layer.updateFields()


#updating the Area field for each feature
idx = provider.fieldNameIndex('Area')
for feature in layer.getFeatures():
    attrs = {idx : int(feature.geometry().area())}
    layer.dataProvider().changeAttributeValues({feature.id() : attrs})

2 Answers

With GeoPandas

import geopandas as gpd
gdf = gpd.read_file('/home/gis/polskagisencoding.shp')
gdf["Area"] = gdf.geometry.area

But I'm not sure it's faster with larger shapefiles.
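One thing to check before trusting the numbers: `area` is returned in the squared units of the layer's CRS, so a layer stored in a geographic CRS would give square degrees. Below is a minimal sketch that refuses to measure in a geographic CRS and stores the area as an integer, as the question's `QVariant.Int` field does. The helper name `add_area_column`, the two demo polygons, the output path and EPSG:2180 are assumptions made for illustration; none of them come from the question.

```python
import geopandas as gpd
from shapely.geometry import Polygon


def add_area_column(gdf, column="Area"):
    """Return a copy of gdf with polygon areas truncated to integers."""
    if gdf.crs is not None and gdf.crs.is_geographic:
        raise ValueError("reproject to a projected CRS before measuring area")
    out = gdf.copy()
    # Vectorised over the whole GeoSeries: no Python-level loop over features.
    out[column] = out.geometry.area.astype("int64")
    return out


# Tiny in-memory stand-in for the shapefile: a 10 x 10 square and a right
# triangle with legs 4 and 3.
demo = gpd.GeoDataFrame(
    {"name": ["square", "triangle"]},
    geometry=[
        Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
        Polygon([(0, 0), (4, 0), (0, 3)]),
    ],
    crs="EPSG:2180",  # assumed projected CRS in metres; use the layer's real one
)
result = add_area_column(demo)
print(result[["name", "Area"]])

# Writing the result to a new file (hypothetical path):
# result.to_file("/home/gis/polskagisencoding_area.shp")
```

For the real layer, the same helper would be called on the `gdf` returned by `gpd.read_file`, after `gdf.to_crs(...)` if the CRS turns out to be geographic.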

Correct answer by gene on January 8, 2021

For large datasets like this, you may want to use dask-geopandas. It is still under development (no official release yet) but area should work flawlessly.

You should install geopandas, dask and pygeos and then dask-geopandas from git.

pip install git+https://github.com/jsignell/dask-geopandas.git

Then you can read your file with geopandas and convert it to a partitioned dask-geopandas GeoDataFrame.

import geopandas
import dask_geopandas

df = geopandas.read_file('/home/gis/polskagisencoding.shp')

ddf = dask_geopandas.from_geopandas(df, npartitions=4)

areas = ddf.geometry.area.compute()

npartitions in this case should be the number of processors you want to use. dask-geopandas then does the computation in parallel.
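If you would rather not hard-code the 4, one option is to derive the partition count from the machine. This is only a sketch: `choose_npartitions` and its `reserve` parameter are hypothetical names, not part of dask-geopandas, and the rule of one partition per CPU is just the heuristic described above.

```python
import os


def choose_npartitions(n_rows, reserve=0):
    """Pick a partition count: one per CPU, leaving `reserve` CPUs free.

    Never returns less than 1, and never more partitions than rows.
    """
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(cpus - reserve, n_rows))


print(choose_npartitions(6_800_000))

# With the variables from the snippet above this would be used as:
#   ddf = dask_geopandas.from_geopandas(df, npartitions=choose_npartitions(len(df)))
#   df["Area"] = ddf.geometry.area.compute()  # assignment aligns on df's index
```

The last commented line is how I would expect the computed areas to be attached back to the original GeoDataFrame, assuming the index is preserved through partitioning.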

See more here https://github.com/jsignell/dask-geopandas

Answered by martinfleis on January 8, 2021
