GeneralBear
GeneralBear

Reputation: 1021

Query postgres for distinct substrings

I have an inherited table with 120k rows including filenames. There are roughly 12,000 substrings replicated throughout the filenames. The filenames are each named according to a strange naming convention, but all based around the same substring. For example, one might be called 'apple-5pxl.mov' while another might be 'c-5pxl.mov' and still another might be '76_5pxl.mov'...while others might be 'apple-234x.mov' and '76_234x.mov'

How can I return a count and list of the number of distinct substrings? In the above example, there would be 2 (5xpl.mov and 234x.mov)

Upvotes: 1

Views: 980

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

I think you can use substring() with a pattern argument:

select substring(filename from '[^-_]*$') as pat, count(*)
from t
group by pat;

The pattern seems to be anything at the end of the string delimited by a hyphen or underscore.

Upvotes: 2

Related Questions