Calculating value of unique points in polygon in PostGIS

Question

I'm working with a spatial PostgreSQL database that contains species records, there is a point layer 'spp_points', each point has a "spp_ID" and "spp_score" field. I also have a polygon layer 'field_poly'.
I need to work out the sum of unique "spp_id" scores only within each polygon, so multiple records of the same species should only be counted once.
I have tried using the following query but this calculates the total of all species, so multiple instances of the same species are included in the sum:
SELECT site.gid AS field,
       SUM(spp.total) AS field_score,
       site.geom
FROM (SELECT DISTINCT spp_id,
             spp_score AS total,
             geom
      FROM spp_points) AS spp
JOIN field_poly AS site ON ST_Within(spp.geom, site.geom)
GROUP BY site.gid

I'm not sure where I need to go from here.

geozelot · Accepted Answer

Get the AVG of the spp_points.spp_score in a sub-query, i.e.
SELECT gid,
       SUM(spp_scores) AS field_score
FROM   (
  SELECT ply.gid,
         AVG(pnt.spp_score) AS spp_scores
  FROM   field_poly AS ply
  JOIN   spp_points AS pnt
    ON   ST_Intersects(ply.geom, pnt.geom)
  GROUP BY
         ply.gid, pnt.spp_id
) q
GROUP BY
       gid
;

This is semantically equal to
SELECT gid,
       SUM(spp_score) AS field_score
FROM   (
  SELECT DISTINCT ON (ply.gid, pnt.spp_id, pnt.spp_score)
         ply.gid,
         pnt.spp_id,
         pnt.spp_score
  FROM   field_poly AS ply
  JOIN   spp_points AS pnt
    ON   ST_Intersects(ply.geom, pnt.geom)
) q
GROUP BY
       gid
;

However, in cases where DISTINCT [ON] is interchangeable with GROUP BY, the latter usually outperforms the former.

robin loche · Answer

From what I understand, you're almost there. You just need to also group by "spp_id":
SELECT data.spp_id,
       SUM(data.spp_score) AS total_score,
       site.gid AS site_id,
       site.geom AS site_geom
FROM spp_points as data
INNER JOIN field_poly AS site ON ST_Within(data.geom, site.geom)
GROUP BY site.gid, site.geom, data.spp_id

DPSSpatial_BoycottingGISSE · Answer

I believe the crux of this problem is in assigning a row_number to all the sightings using the partition on the species ID and the site_gid they are in, and then ordering by the sighting date - which involves using a DATE column - then using only one of each sighting from each polygon.
I've done some mocking up of this using crime and neighborhood data for denver, but re-coded best I could to match your data - my species ID is instead a crime type, and I've used some dummy integer to mock up the spp_score.
Here's the code:
;with cte_data as (
select
 f.site_gid
, s.* 
 , row_number() over (partition by s."spp_ID", f.site_gid order by s.sighting_date desc ) as rownum
from spp_points as s 
    join field_poly as f on ST_INtersects(s.geom, f.geom)
)

select 
d.site_gid
, d."spp_ID"
, sum(d.spp_score)
from cte_data as d
where 1=1 and d.rownum = 1
group by d.site_gid, d."spp_ID"

Which gives this result, which is one species (crime) per field polygon and sums the spp_score (my dummy offense code integer).
I think its almost what you're after, but let me know what you think.

Calculating value of unique points in polygon in PostGIS

3 Answers

Add your own answers!

Ask a Question