Chris Curvey

Reputation: 10389

split out file name from path in postgres

I have a field that contains Windows file paths, like so:

\\fs1\foo\bar\snafu.txt
c:\this\is\why\i\drink\snafu.txt
\\fs2\bippity\baz.zip
\\fs3\boppity\boo\baz.zip
c:\users\chris\donut.c

What I need to do is find the number of duplicated file names (regardless of what directory they are in). So I want to find "snafu.txt" and "baz.zip", but not "donut.c".

Is there a way in PostgreSQL (8.4) to find the last part of a file path? If I can do that, then I can use count/group to find my problem children.

Upvotes: 10

Views: 12505

Answers (3)

James Doherty

Reputation: 1411

CREATE OR REPLACE FUNCTION basename(text) RETURNS text
    AS $basename$
declare
    FILE_PATH alias for $1;
    ret         text;
begin
    ret := regexp_replace(FILE_PATH,'^.+[/\\]', '');
    return ret;
end;
$basename$ LANGUAGE plpgsql;
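
Once the function exists, the duplicate count from the question could look roughly like this. This is only a sketch: my_table and my_path are placeholder names for the table and column holding the paths, not names given in the question.

```sql
-- my_table / my_path are hypothetical names for the table and column
-- that hold the Windows paths.
SELECT basename(my_path) AS filename, count(*) AS copies
FROM my_table
GROUP BY basename(my_path)
HAVING count(*) > 1;
```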

Upvotes: 2

András Váczi

Reputation: 3002

You can easily strip the path up to the last directory separator with an expression like

regexp_replace(path, '^.+[/\\]', '')

This will match the occasional forward slashes produced by some software as well. Then you just count the remaining file names like

WITH files AS (
    SELECT regexp_replace(my_path, '^.+[/\\]', '') AS filename
    FROM my_table
)
SELECT filename, count(*) AS count
FROM files
GROUP BY filename
HAVING count(*) >= 2;
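
To try the query without touching a real table, the sample paths from the question can be fed in through a VALUES list. This is a sketch that assumes standard_conforming_strings = on (the default since PostgreSQL 9.1), so the backslashes in the literals are taken as-is; on older versions such as 8.4 the escaping may need to differ. The ORDER BY is added only to make the output order predictable.

```sql
-- Inline copy of the question's sample data; assumes
-- standard_conforming_strings = on so backslashes are literal.
WITH my_table(my_path) AS (
    VALUES
        ('\\fs1\foo\bar\snafu.txt'),
        ('c:\this\is\why\i\drink\snafu.txt'),
        ('\\fs2\bippity\baz.zip'),
        ('\\fs3\boppity\boo\baz.zip'),
        ('c:\users\chris\donut.c')
),
files AS (
    -- Strip everything up to and including the last / or \
    SELECT regexp_replace(my_path, '^.+[/\\]', '') AS filename
    FROM my_table
)
SELECT filename, count(*) AS count
FROM files
GROUP BY filename
HAVING count(*) >= 2
ORDER BY filename;
```

Under that assumption, this should return baz.zip and snafu.txt with a count of 2 each, and leave out donut.c.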

Upvotes: 16

Alex Howansky

Reputation: 53656

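-- The character class covers both \ and /; matching '/' alone would leave Windows paths unchanged.
select regexp_replace(path_field, '.+[/\\]', '') from files_table;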

Upvotes: 1
