Geographic Information Systems Asked on March 29, 2021
I have run into severe performance degradation when upgrading an environment to GDAL 3. I tracked the issue down to fiona.transform, which is now a lot slower (about 15 (!) times) than it was with GDAL 2.4.
The issue can be illustrated with this line, which transforms only one point (the actual script transforms a geometry):
python -m timeit -s "from fiona.transform import transform" "transform('EPSG:31287', 'EPSG:4236', [419908], [333400])"
These are my performance measurements with the images from perrygeo/gdal-base, using fiona 1.8.13:
latest Python 3.8.5 | GDAL 3.1.3 | GEOS 3.8.1 | PROJ 7.1.1 | 20 loops, best of 5: 20 msec per loop
20181219-6f5f6a29 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 1000 loops, best of 3: 675 usec per loop
20181219-f379ec62 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 1000 loops, best of 3: 705 usec per loop
20181221-40f73e30 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 1000 loops, best of 3: 698 usec per loop
20181221-bc2d4bbd Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 1000 loops, best of 3: 688 usec per loop
20181221-f7a0a299 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 1000 loops, best of 3: 703 usec per loop
20190312-f69f8699 Python 3.6.8 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 6.0.0 | 1000 loops, best of 3: 634 usec per loop
20190322-800eed8a Python 3.6.8 | GDAL 2.4.1 | GEOS 3.7.1 | PROJ 6.0.0 | 1000 loops, best of 3: 589 usec per loop
20190509-da2e635a Python 3.6.8 | GDAL 3.0.0 | GEOS 3.7.2 | PROJ 6.0.0 | 100 loops, best of 3: 10.4 msec per loop
20191110-6cc84c7e Python 3.6.9 | GDAL 3.0.2 | GEOS 3.8.0 | PROJ 6.2.1 | 100 loops, best of 3: 11.4 msec per loop
20200301-8437abbb Python 3.8.2 | GDAL 3.0.4 | GEOS 3.8.0 | PROJ 7.0.0 | 20 loops, best of 5: 10.5 msec per loop
20200509-50546ca8 Python 3.8.2 | GDAL 3.1.0 | GEOS 3.8.1 | PROJ 7.0.1 | 20 loops, best of 5: 10.2 msec per loop
20200907-c7ec91bc Python 3.8.5 | GDAL 3.1.3 | GEOS 3.8.1 | PROJ 7.1.1 | 20 loops, best of 5: 10.8 msec per loop
One can clearly see that the line takes ~0.7 msec before GDAL 3, and >10 msec from GDAL 3 onward.
Does anyone have a hint as to what the root cause could be and how it could be fixed?
I would recommend using pyproj as it has dealt with this issue already: https://pyproj4.github.io/pyproj/stable/advanced_examples.html#optimize-transformations
The creation of the transformer has more overhead in PROJ 6+. That is why pyproj added the Transformer class. See: https://github.com/pyproj4/pyproj/issues/187
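A minimal sketch of that reuse pattern, for illustration only: the helper names `cached`, `pyproj_factory`, `get_transformer` and `transform_point` are made up here, and `always_xy=True` is an assumption about the desired axis order (x/y input, as in the fiona call from the question). The idea is simply to build the Transformer once per CRS pair and reuse it for every call.

```python
from functools import lru_cache


def pyproj_factory(src_crs, dst_crs):
    """Build a pyproj Transformer; this is the costly step with PROJ 6+."""
    # Imported lazily so the caching helper below can be used without pyproj.
    from pyproj import Transformer

    # always_xy=True is an assumption: it forces x/y (easting/northing) order.
    return Transformer.from_crs(src_crs, dst_crs, always_xy=True)


def cached(factory):
    """Wrap a factory so that each (src_crs, dst_crs) pair is built only once."""
    return lru_cache(maxsize=None)(factory)


# Hypothetical module-level cache of transformers, keyed by CRS pair.
get_transformer = cached(pyproj_factory)


def transform_point(x, y, src_crs="EPSG:31287", dst_crs="EPSG:4236",
                    get=get_transformer):
    """Transform one point, reusing the cached transformer for the CRS pair."""
    return get(src_crs, dst_crs).transform(x, y)
```

With pyproj installed, repeated calls such as `transform_point(419908, 333400)` should pay the construction cost only on the first call; later calls reuse the cached object.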
Correct answer by snowman2 on March 29, 2021
Indeed, as @snowman2 points out, using pyproj fixes the performance issue. The relevant command would look like this (for more complex geometries, use shapely.ops.transform):
python -m timeit -s "from pyproj import Transformer" -s "transform = Transformer.from_crs(31287, 4236).transform" "transform(419908, 333400)"
It sets up a pyproj.Transformer once, which is then reused by all transformations.
The benchmark looks like this:
latest Python 3.8.5 | GDAL 3.1.3 | GEOS 3.8.1 | PROJ 7.1.1 | 10000 loops, best of 5: 20.7 usec per loop
20181219-6f5f6a29 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 10000 loops, best of 3: 29.8 usec per loop
20181219-f379ec62 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 10000 loops, best of 3: 31.2 usec per loop
20181221-40f73e30 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 10000 loops, best of 3: 56.9 usec per loop
20181221-bc2d4bbd Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 10000 loops, best of 3: 33.3 usec per loop
20181221-f7a0a299 Python 3.6.7 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 5.2.0 | 10000 loops, best of 3: 56.4 usec per loop
20190312-f69f8699 Python 3.6.8 | GDAL 2.4.0 | GEOS 3.7.1 | PROJ 6.0.0 | 10000 loops, best of 3: 24.7 usec per loop
20190322-800eed8a Python 3.6.8 | GDAL 2.4.1 | GEOS 3.7.1 | PROJ 6.0.0 | 10000 loops, best of 3: 83.9 usec per loop
20190509-da2e635a Python 3.6.8 | GDAL 3.0.0 | GEOS 3.7.2 | PROJ 6.0.0 | 10000 loops, best of 3: 43.7 usec per loop
20191110-6cc84c7e Python 3.6.9 | GDAL 3.0.2 | GEOS 3.8.0 | PROJ 6.2.1 | 10000 loops, best of 3: 58.6 usec per loop
20200301-8437abbb Python 3.8.2 | GDAL 3.0.4 | GEOS 3.8.0 | PROJ 7.0.0 | 10000 loops, best of 5: 12 usec per loop
20200509-50546ca8 Python 3.8.2 | GDAL 3.1.0 | GEOS 3.8.1 | PROJ 7.0.1 | 20000 loops, best of 5: 10.5 usec per loop
20200907-c7ec91bc Python 3.8.5 | GDAL 3.1.3 | GEOS 3.8.1 | PROJ 7.1.1 | 20000 loops, best of 5: 11.2 usec per loop
PS: With GDAL 3/PROJ 7, this is about a thousand (!) times faster than the fiona approach from the question.
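For GeoJSON-like geometry mappings (the form fiona hands out), a dependency-free sketch of applying one reusable transform callable to every coordinate could look like the following. The function names `transform_coords` and `transform_geometry` are hypothetical, and the per-point call is kept for clarity rather than speed; where shapely is available, shapely.ops.transform covers the same ground.

```python
def transform_coords(coords, func):
    """Recursively apply func(x, y) -> (x, y) to a GeoJSON coordinate structure."""
    # A single position is a sequence whose first element is a number.
    if coords and isinstance(coords[0], (int, float)):
        x, y = func(coords[0], coords[1])
        return (x, y) + tuple(coords[2:])  # any further values (e.g. z) pass through
    # Otherwise this is a list of positions, rings, or polygons: recurse.
    return [transform_coords(part, func) for part in coords]


def transform_geometry(geom, func):
    """Return a new GeoJSON-like geometry with func applied to all positions."""
    if geom["type"] == "GeometryCollection":
        return {
            "type": "GeometryCollection",
            "geometries": [transform_geometry(g, func) for g in geom["geometries"]],
        }
    return {
        "type": geom["type"],
        "coordinates": transform_coords(geom["coordinates"], func),
    }
```

Here `func` would be the bound method of a Transformer created once up front, e.g. `Transformer.from_crs(31287, 4236).transform`, so the expensive setup is not repeated per coordinate.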
Answered by Stefan on March 29, 2021