Hyder Tom
Hyder Tom

Reputation: 373

Redshift query data format

one of the column has data in format as below:

column_name_a:
abcd/date=2018-01-01/part-0001-asdfasdfasdf
abcd/date=2018-01-01/part-0002-asdfasdfasdf
abcd/date=2018-01-02/part-0001-asdfasdfasdf
abcd/date=2018-01-02/part-0002-asdfasdfasdf
abcd/date=2018-01-03/part-0001-asdfasdfasdf
abcd/date=2018-01-03/part-0002-asdfasdfasdf
abcd/date=2018-01-03/part-0003-asdfasdfasdf
abcd/date=2018-01-03/part-0004-asdfasdfasdf

.....

Now I need to get file count either by day or by part number.

How do I write my query?

Upvotes: 0

Views: 51

Answers (2)

Shailesh
Shailesh

Reputation: 2276

Adding to Nate's answer, you can use split_part multiple times to get what you require:

To get date:

select split_part(split_part('abcd/date=2018-01-01/part-0001-asdfasdfasdf','/',2),'=',2)     

To get part number:

select split_part(split_part('abcd/date=2018-01-01/part-0001-asdfasdfasdf','/',3),'-',2)

Upvotes: 1

Nate
Nate

Reputation: 210

Use split_part. This will still have 'date=' in the string..

date = split_part(column_name_a,'/',2)

part_number = split_part(column_name_a,'/',3)

details are here... https://docs.aws.amazon.com/redshift/latest/dg/SPLIT_PART.html

Upvotes: 0

Related Questions