Brian
Brian

Reputation: 151

SQL: Rename duplicate filenames by adding 1,2,3... before extension

I'm trying to figure out how to rename the contents of a column which stores a filename. Currently there are duplicates in this field which I'm trying to rename by adding a incremental integer after each duplicate, e.g.

ID       | FILENAME
----------------------
1        | file1.ext
2        | file2.ext
3        | file1.ext
4        | file1.ext
5        | file3.ext
6        | file3.ext
7        | file4.ext

So in the above example, I want the following to be unique:

ID       | FILENAME
----------------------
1        | file1.ext
3        | file1.ext
4        | file1.ext
5        | file3.ext
6        | file3.ext

By changing the filenames to:

ID       | FILENAME
----------------------
1        | file1-1.ext
3        | file1-2.ext
4        | file1-3.ext
5        | file3-1.ext
6        | file3-2.ext

I know how to find the duplicates but I'm not sure how to add increments to just before the extension or how to increment the count in the first place.

Any help would be greatly appreciated.

Upvotes: 1

Views: 2461

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658082

Use the window function row_number() to get the number and replace for the string manipulation.
You did not disclose your RDBMS. The following query is tested on PostgreSQL 9.0. MySQL does not support window functions, most other big RDBMS do.

Rename all filenames:

SELECT id
      ,replace(filename, '.',  
               '-'
               || row_number() OVER (PARTITION BY filename ORDER BY id)
               || '.')
FROM   mytbl

Only rename duplicate filenames:

SELECT id
      ,CASE WHEN (count(*) OVER (PARTITION BY filename)) > 1 THEN
          replace(filename, '.',  
                  '-'
                  || row_number() OVER (PARTITION BY filename ORDER BY id)
                  || '.')
       ELSE filename END AS filename
FROM   mytbl;

Edit with additionally requested features

This version works with multiple or no dots in the name. Tested in PostgreSQL 9.0.

SELECT id
      ,CASE WHEN (count(*) OVER (PARTITION BY filename)) > 1 THEN
          regexp_replace(filename
              -- pick the longest string from the start not 
             ,'^([^.]*)'containing a '.'
              -- and replace it with itself + row_number
             ,E'\\1-' || row_number() OVER (PARTITION BY filename ORDER BY id))
       ELSE filename END AS filename
FROM   mytbl

Upvotes: 2

Related Questions