TransWikia.com

Create a PL/PSQL function that performs as fast as a normal query?

Geographic Information Systems Asked by Jonas Olsen on December 19, 2020

I’m trying to create a PL/psql function that cuts one user-defined geometry to a predefined border geometry.
If the user-defined geometry does not intersect with the border geometry, it should return the user geometry, unmodified.

The pre-defined geometry consists of 47 polygons. and it has a geometry-index.

The issue isn’t with creating the function, its with the speed of the function.

The function:

CREATE OR REPLACE FUNCTION public.clip_geometry(
   IN in_geom geometry,
   OUT out_geom geometry
   ) AS
-- function returns unmodified input-geometry if it does not intersect with the border-geometry. 
-- if input-geometry intersects with the border-geometry, the input-geometry is cut according to the border-geometry
$BODY$
select 
    st_multi(coalesce(
                st_difference(a.geom ,dk.geom), 
                a.geom)) 
        from (select $1 as geom) a
        left JOIN LATERAL
        (select st_union(b.geom) geom from 
        b.border b
        where st_intersects(a.geom, b.geom)
        ) dk on TRUE;

$BODY$
  LANGUAGE SQL STABLE;

I have tested this function (and a version that was written in PL/pgSQL), and it behaves as I want it to. However, the function is slow compared to a normal query.
When I create a table using the function, with a table of road-geometries (approximately 2.3 million rows, with a geometry-index) the runtime is around 1 to 1:30 minute, using this query:

create table b.mod_roads as  
select clip_geometry(geom) from b.roads

If I wrote the functions content as a normal query, the runtime is around 15-25 seconds.
the query:

create table b.mod_roads as 
select 
    st_multi(coalesce(
                st_difference(a.geom ,dk.geom), 
                a.geom)) 
        from b.roads a
        left JOIN LATERAL
        (select st_union(b.geom) geom from 
        b.border b
        where st_intersects(a.geom, b.geom)
        ) dk on TRUE;

The normal query uses the border-geometries’ index.
When I try to use EXPLAIN on the function-call, it doesn’t show how it optimizes the query.

Is there a way to achieve the normal query’s speed within the function?

One Answer

Functions and procedures in the PostgreSQL environment are costly due to their overhead, compared to the directly interpreted SQL. And functions are, indeed, a performance fence (or rather, a non-optimizable container) for the planner, in particular cases, and especially when used in complex statements.

While there are plenty of scenarios where a complex query can get outperformed by a simple procedural function statement, you should always expect a fraction of a millisecond of overhead when invoking a function compared to an equal SQL statement.


However:

Your issue is rather the usage of the function; the way you call it the function gets executed once per row. This is fundamentally different to what the actual SQL does, and is not going to get optimized. As a matter of fact, you won't be able to mimic the SQL statement in a manner that will get you similar performance, as you need to always work on the row level, or use an aggregate over the whole table

But you can at least simplify your function, make it use the index and save on some computations, and the LATERAL join in the function body is superfluous due to the fact that each rows geometry is passed in individually anyways; if I get your intention right, you probably want to use sth. like this:

CREATE OR REPLACE FUNCTION public.clip_geom(
  IN  in_geom GEOMETRY,
  OUT out_geom GEOMERTY
) LANGUAGE SQL AS
  $$
    SELECT CASE ST_CoveredBy($1, geom)
             WHEN TRUE THEN ST_Multi($1)
             ELSE ST_Multi(ST_Intersection($1, geom))
           END
    FROM   (
      SELECT ST_Union(geom) AS geom
      FROM   border
      WHERE  ST_Intersects($1, geom)
    ) q
    ;
  $$
;

This is most performantly copying the behavior of your plain SQL query; this should be good in an UPDATE or INSERT when used with a pre-filter on ST_Intersects, to avoid executing the function on each row. If a new table is what you are after, better use two INSERTs: one with geometries that are ST_CoveredBy, and one with the clipped rest, leaving out any checks at all.

This should probably be part of a trigger to avoid filling up the DB with unclipped geometries in the first place. Or it doesn't make much sense to use a function at all.

Correct answer by geozelot on December 19, 2020

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP