TransWikia.com

PostGIS speed up query on 9 million row polygon table

Geographic Information Systems Asked on January 19, 2021

Table bl_data 9 million+ polygon rows with GIST index and the index is clustered

create index ii3 on bl_data using gist(geom);
CLUSTER ii3 ON bl_data;

I have a query that finds records with the same borough,block,lot and geometry

 select distinct a.id
 from bl_data a join bl_data b on st_equals(a.geom,b.geom) and a.yr<>b.yr 
 and a.borough='BX' and a.block=3805 and a.lot=7501

running explain analyze verbose yields

"Unique  (cost=0.43..237511571.55 rows=1 width=4) (actual time=7711.454..25624.703 rows=1 loops=1)"
"  Output: a.id"
"  ->  Nested Loop  (cost=0.43..237511560.46 rows=4437 width=4) (actual time=7711.453..25624.701 rows=1 loops=1)"
"        Output: a.id"
"        Join Filter: ((a.yr <> b.yr) AND st_equals(a.geom, b.geom))"
"        Rows Removed by Join Filter: 37741307"
"        ->  Index Scan using ii3_indx on public.bl_data a  (cost=0.43..1262864.82 rows=1 width=159) (actual time=7383.147..11020.128 rows=4 loops=1)"
"              Output: a.yr, a.borough, a.block, a.lot, a.geom, a.id, a.dupe"
"              Filter: (((a.borough)::text = 'BX'::text) AND (a.block = 3805) AND (a.lot = 7501))"
"              Rows Removed by Filter: 9435323"
"        ->  Seq Scan on public.bl_data b  (cost=0.00..331195.84 rows=9431984 width=155) (actual time=0.019..2132.602 rows=9435327 loops=4)"
"              Output: b.yr, b.borough, b.block, b.lot, b.geom, b.id, b.dupe"
"Planning Time: 1.167 ms"
"Execution Time: 25624.780 ms"

the query takes about 25 seconds to complete and I am planning to wrap this query into a function and run it on the entire table which will likely take weeks(?) to finish at this pace.

what steps can I take to speed this up? would indexing borough,block and lot be helpful? is the clustering not helpful?

One Answer

You can start with a fast bounding box intersection check, which will make use of the spatial index

... join bl_data b on a.geom && b.geom and st_equals(a.geom,b.geom) and ...

https://postgis.net/docs/geometry_overlaps.html

Correct answer by JGH on January 19, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP