Jenifer
Jenifer

Reputation: 347

Need guidance in using LISTAGG with Regular Expression

SELECT
         ID_Col,
         lower(LISTAGG(distinct TEXT_COL,',')WITHIN GROUP(ORDER BY TEXT_COL)) 
         AS TEXT_COL_TXT
         FROM
         (SELECT
          CREATE_DT,
          ID_Col,
          TEXT_COL,
          TRY_CAST(Q_NO as INTEGER) as Q_NO
          FROM db_name.schema_name.tbl_name
          WHERE Flg = '0'
          AND date_of_cr = '2022-02-05'
          AND P_CODE NOT IN ('1','2','3','4')
          AND ID_Col IN('12345','23456')
          ORDER BY Q_NO)
          GROUP BY 1;

When I run the above query, I'm getting results like this:

ID_COL                                        TEXT_COL

12345                                 ::abcd::0,aforapple
23456                                 ::abcd::0,n:sometext:::empty::

I want this value to be removed in the result --> ::abcd::0,

The result should look like below:

ID_COL                                   TEXT_COL

12345                                 aforapple
23456                                 n:sometext:::empty::

Can anyone guide me how to bring such a result?

When I use the below logic, I could see comma in the results now:

LISTAGG(distinct iff(TEXT_COL = '::abcd::0', '', TEXT_COL),',')

Result I could see is:

ID_COL                                   TEXT_COL

12345                                 ,aforapple
23456                                 ,n:sometext:::empty::

I should not display comma in the results

Upvotes: 1

Views: 192

Answers (3)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 25968

It would nice if you started accepting answershttps://stackoverflow.com/help/accepted-answer

It would also be nice if you posted minimal required SQL.

So you posted SQL

SELECT
    ID_Col,
    lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM
(
    SELECT
        CREATE_DT,
        ID_Col,
        TEXT_COL,
        TRY_CAST(Q_NO as INTEGER) as Q_NO
    FROM db_name.schema_name.tbl_name
    WHERE Flg = '0'
        AND date_of_cr = '2022-02-05'
        AND P_CODE NOT IN ('1','2','3','4')
        AND ID_Col IN('12345','23456')
    ORDER BY Q_NO
)
GROUP BY 1;

So because your filters on db_name.schema_name.tbl_name have zero impact on the LISTAGG question, those can be dropped. The ORDER BY should be removed, SQL-Server for example will fail this SQL, because it doesn't make a lot of sense to order a sub-select. Thus it can become:

SELECT
    ID_Col,
    lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM
(
    SELECT
        ID_Col,
        TEXT_COL,
    FROM db_name.schema_name.tbl_name
)
GROUP BY 1;

But actually that can become:

SELECT
    ID_Col,
    lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM db_name.schema_name.tbl_name
GROUP BY 1;

Now if you want to be friend you can provide some working data in a table

ID_COL TEXT_COL
12345 ::abcd::0
12345 aforapple
23456 ::abcd::0
23456 n:sometext:::empty::

Of you could provide the data in the small example query that you provide:

SELECT
    column1,
    lower(LISTAGG(distinct column2,',') WITHIN GROUP (ORDER BY column2)) AS TEXT_COL_TXT
FROM VALUES
    (12345, '::abcd::0'),
    (12345, 'aforapple'),
    (23456, '::abcd::0'),
    (23456, 'n:sometext:::empty::')
GROUP BY 1;

There is a huge benefit of pulling your SQL down it to the smallest reproducible example. Sometimes as you remove the unneeded bits, you can see the bigger picture and notice the mistake. Sometimes as you pull things out you undo a part that you didn't fully understand, and thus you have smaller code that works and small + just a little more that does not work, and that is enough tell you which commands needs to be reread in the help to understand the interactions.

Upvotes: 3

Rajat
Rajat

Reputation: 5803

You could use nullif which will set that column value to null if it matches ::abcd::0. The listagg will ignore the nulls in aggregation

listagg(distinct nullif(text_col,'::abcd::0'),',') within group (order by text_col)

Upvotes: 2

Marcel
Marcel

Reputation: 2612

Try to apply IFF(), which is similar to CASE WHEN: https://docs.snowflake.com/en/sql-reference/functions/iff.html

LISTAGG(distinct iff(TEXT_COL = '::abcd::0', '', TEXT_COL),',')

Logic described: If TEXT_COL = THEN use empty string ELSE use TEXT_COL for concatenation in LISTAGG

Upvotes: 2

Related Questions