kittymad
kittymad

Reputation: 93

Teradata - word frequency in a column

Say I have the following column in a teradata table:

Red ball
Purple ball
Orange ball

I want my output to be

Word    Count
Red     1
Ball    3
Purple  1
Orange  1

Thanks.

Upvotes: 1

Views: 1137

Answers (3)

dnoeth
dnoeth

Reputation: 60462

In TD14 there's a STRTOK_SPLIT_TO_TABLE function:

SELECT token, COUNT(*)
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1 -- this is just a dummy, usually the PK column when you need to join
                                 ,table.stringcolumn
                                 ,' ') -- simply add other separating characters
     RETURNS (outkey INTEGER,
              tokennum INTEGER,
              token VARCHAR(100) CHARACTER SET UNICODE
             )
           ) AS d
GROUP BY 1

Upvotes: 4

Andrew
Andrew

Reputation: 8703

Here's how I would handle something like this:

  WITH RECURSIVE CTE  (POS, NEW_STRING, REAL_STRING) AS
(
SELECT
0, CAST('' AS VARCHAR(100)),TRIM(word)
FROM wordcount
UNION ALL
SELECT
CASE WHEN POSITION(' ' IN REAL_STRING) > 0
THEN POSITION(' ' IN REAL_STRING)
ELSE CHARACTER_LENGTH(REAL_STRING)
END DPOS,
TRIM(BOTH  ' ' FROM SUBSTR(REAL_STRING, 0, DPOS+1)),
TRIM(SUBSTR(REAL_STRING, DPOS+1))
FROM CTE
WHERE DPOS > 0
)

SELECT TRIM(NEW_STRING) as word,
count (*)
FROM CTE
group by word
WHERE pos > 0;

Which will return:

    word    Count(*)
    orange  1
    purple  1
    red 1
    ball    3

There may be an easier way with regex in 14, but I haven't messed with it yet.

EDIT: Removed some unneeded columns from the query.

Upvotes: 1

Change your table with this

name  |name2
_______________
red    |  ball
purple |  ball
orange |  ball
_______________

And then run the following query:

select name, count(name)as name1_count from table_test
group by name
union all
select name2,count(name2)as name2_count from table_test
group by name2;

Upvotes: 0

Related Questions