Watermln
Watermln

Reputation: 49

Is it possible to create ROW_NUMBER-column with repeating values?

I need to create an id-column: on every TRUE statement it should be a normal serial number, but on every FALSE statement the field value should contain the previous number in this column. In other words, every FALSE should repeat the previous value.

create table t1(column_id,column_bool)as values
   (1,true)
  ,(2,true)
  ,(3,false)
  ,(4,true)
  ,(5,true)
  ,(6,true)
  ,(7,false)
  ,(8,true);

I tried to give every FALSE-value its own number, but I don't know how to lower them by 1 after every single FALSE-iteration.

So here is the code:

SELECT *,CASE WHEN column_bool = true 
              THEN ROW_NUMBER()OVER(PARTITION BY column_bool ORDER BY column_id)
              ELSE ROW_NUMBER()OVER(ORDER BY column_id) -1
         END AS row_number_column
FROM t1
ORDER BY column_id

Here is what happens:

column_id column_bool row_number_column
1 t 1
2 t 2
3 F 2
4 t 3
5 t 4
6 t 5
7 F 6
8 t 6

And here is what I want to happen

column_id column_bool row_number_column
1 t 1
2 t 2
3 F 2
4 t 3
5 t 4
6 t 5
7 F 5
8 t 6

Every FALSE number should repeat the previous one. Is there any way to do it?

Upvotes: 4

Views: 124

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658767

For completeness:

SELECT *, count(column_bool OR null) OVER (ORDER BY column_id) AS row_number_column
FROM   tbl;

fiddle

This works because count() only counts not-null values, and the Boolean expression column_bool OR null returns null for everything but true.

This was the fastest way before the aggregate FILTER clause was added with Postgres 9.4. (Zegarek provided a solution showing that.) Still works, of course. See:

Upvotes: 1

Zegarek
Zegarek

Reputation: 26467

If you set up a window that goes through your records in the order of column_id, you can keep count of rows, using filter to only count those where column_bool is true:
demo at db<>fiddle

select *,count(*)filter(where column_bool)over w1
from t1
window w1 as(order by column_id);
column_id column_bool row_number_column
1 t 1
2 t 2
3 F 2
4 t 3
5 t 4
6 t 5
7 F 5
8 t 6

The order specified in the window definition assumes a rows between unbounded preceding and current row frame, which means the count in each row becomes a stepping/tumbling/rolling count of stuff found from the first row up to the current one, not a grand total spanning the whole table:

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY, this sets the frame to be all rows from the partition start up through the current row's last ORDER BY peer.

Counting only filtered values you're interested in is cheaper than casting all of them then summing, or emulating the filter clause with a conditional aggregate that also has to evaluate a case for all rows instead of only those of interest.

Upvotes: 4

p3consulting
p3consulting

Reputation: 4650

In PostgreSQL casting bool to int should also do the work:

SUM(column_bool::int) OVER (ORDER BY column_id) AS row_number_column

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522636

We can use COUNT() here as a window function:

SELECT *, 
    COUNT(CASE WHEN column_bool = true THEN 1 END)
        OVER (ORDER BY column_id) AS row_number_column
FROM t1
ORDER BY column_id;

The above logic computed a running count of true values over the table as ordered by the column_id. Note that the above sequence is 0 based.

Upvotes: 2

Related Questions