Jscore
Jscore

Reputation: 405

How to correctly use the GROUP BY function in Postgresql

The below code works well, except for the step where I determine the threshold AS. Instead of calculating the count for each raster created in the polygon_dump AS step, it counts all the rasters together. I have been trying to use GROUP BY with limited success.

WITH
--  Select Features
    feat AS 
        (SELECT toid AS building_id,
                wkb_geometry AS geom 
                FROM buildings
        ),
    polygon_dump AS
        (SELECT (ST_DumpAsPolygons(ST_Clip(a.st_roughness,1,b.geom,-9999,true))).val AS polygon_vals,building_id AS building_id2
            FROM grosvenor_raster_roughness a, feat b
        ),
    threshold AS
        (SELECT Count(*) AS thres_val
            FROM polygon_dump
                WHERE polygon_vals >= 0 AND polygon_vals < 0.5
        GROUP BY building_id2
        ),
    b_stats AS
        (SELECT  building_id, (stats).*
            FROM (SELECT building_id, ST_SummaryStats(ST_Clip(a.st_roughness,1,b.geom,-9999,true)) AS stats
                FROM grosvenor_raster_roughness a
                    INNER JOIN feat b
                ON ST_Intersects(b.geom,a.st_roughness)
            ) AS foo
        )
--  Summarise statistics
    SELECT count As pixel_count,
           thres_val AS threshold_val,
           cast(thres_val as real)/cast(count as real)*100 AS percent_value,
           min AS min_pixel_val,
           max AS max_pixel_val,
           mean AS avg_pixel_val,
           stddev AS pixel_stddev
        FROM b_stats, threshold
            WHERE count > 0;

I get the following results:

enter image description here

The two columns in red are the correct results, what do I need to do to only get those results?

Upvotes: 0

Views: 56

Answers (1)

Łukasz Kamiński
Łukasz Kamiński

Reputation: 5930

You are doing CROSS JOIN. You need to add column with building_id into your threshold CTE, so you can JOIN it with b_stats. I'm not sure if it should be LEFT or INNER JOIN, so I'm gonna use INNER.

WITH
--  Select Features
    feat AS 
        (SELECT toid AS building_id,
                wkb_geometry AS geom 
                FROM buildings
        ),
    polygon_dump AS
        (SELECT (ST_DumpAsPolygons(ST_Clip(a.st_roughness,1,b.geom,-9999,true))).val AS polygon_vals,building_id AS building_id2
            FROM grosvenor_raster_roughness a, feat b
        ),
    threshold AS
        (SELECT building_id2 AS building_id, Count(*) AS thres_val
            FROM polygon_dump
                WHERE polygon_vals >= 0 AND polygon_vals < 0.5
        GROUP BY building_id2
        ),
    b_stats AS
        (SELECT  building_id, (stats).*
            FROM (SELECT building_id, ST_SummaryStats(ST_Clip(a.st_roughness,1,b.geom,-9999,true)) AS stats
                FROM grosvenor_raster_roughness a
                    INNER JOIN feat b
                ON ST_Intersects(b.geom,a.st_roughness)
            ) AS foo
        )
--  Summarise statistics
    SELECT count As pixel_count,
           thres_val AS threshold_val,
           cast(thres_val as real)/cast(count as real)*100 AS percent_value,
           min AS min_pixel_val,
           max AS max_pixel_val,
           mean AS avg_pixel_val,
           stddev AS pixel_stddev
        FROM b_stats
        JOIN threshold USING(building_id)
            WHERE count > 0;

Upvotes: 2

Related Questions