Reputation: 1021
I have an inherited table with 120k rows including filenames. There are roughly 12,000 substrings replicated throughout the filenames. The filenames are each named according to a strange naming convention, but all based around the same substring. For example, one might be called 'apple-5pxl.mov' while another might be 'c-5pxl.mov' and still another might be '76_5pxl.mov'...while others might be 'apple-234x.mov' and '76_234x.mov'
How can I return a count and list of the number of distinct substrings? In the above example, there would be 2 (5xpl.mov and 234x.mov)
Upvotes: 1
Views: 980
Reputation: 1271151
I think you can use substring()
with a pattern argument:
select substring(filename from '[^-_]*$') as pat, count(*)
from t
group by pat;
The pattern seems to be anything at the end of the string delimited by a hyphen or underscore.
Upvotes: 2