nrcombs

Reputation: 533

bash: use list of file names to concatenate matching files across directories and save all files in new directory

I have a large number of files that are found in three different directories. For some of these files, a file with an identical name exists in another directory. Other files exist in only one directory. I'd like to use bash to copy all of the files from the three directories to a single new directory, but for files with identically named files in more than one directory I want to concatenate the file contents across directories before saving to the new directory.

Here's an example of what my file structure looks like:

ls dir1/
file1.txt
file2.txt
file4.txt

ls dir2/
file2.txt
file5.txt
file6.txt
file9.txt

ls dir3/
file2.txt
file3.txt
file4.txt
file7.txt
file8.txt
file10.txt

Using this example, I'd like to produce a new directory that contains file1.txt through file10.txt, but with the contents of identically named files (e.g. file2.txt, file4.txt) concatenated in the new directory.

I have a list of all of the file names contained in my three directories, with each unique file name appearing once. So far, I have come up with code that takes the file names from one directory and concatenates those files with identically named files in the other directories, but I'm not sure how to use my list of file names as the reference for concatenating and saving files (instead of the output from ls on the first directory). Any ideas for how to modify it? Thanks very much!

PATH1='/path/to/dir1'
PATH2='/path/to/dir2'
PATH3='/path/to/dir3'

mkdir dir_new

ls "$PATH1" | while IFS= read -r FILE; do

    cat "$PATH1/$FILE" "$PATH2/$FILE" "$PATH3/$FILE" >> ./dir_new/"$FILE"

done

Upvotes: 1

Views: 1150

Answers (1)

thanasisp

Reputation: 5965

You can do it like this:

mkdir -p new_dir

for f in path/to/dir*/*.txt; do
    cat "$f" >> "new_dir/${f##*/}"
done

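This is a common use of substring removal with parameter expansion: ${f##*/} strips the longest prefix matching */ from $f, leaving only the basename of the file, which is then used to construct the output filename. Because the loop appends with >>, files that share a name across directories end up concatenated, and files that exist in only one directory are simply copied.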
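To see what the expansion does in isolation, here is a minimal sketch. The path is just an example value taken from the question's layout, and the variable names base and dir are only illustrative:

```shell
# Example path only; no file needs to exist for parameter expansion to work
f='path/to/dir2/file5.txt'

base="${f##*/}"   # remove the longest prefix matching '*/'  -> file5.txt
dir="${f%/*}"     # remove the shortest suffix matching '/*' -> path/to/dir2

printf '%s\n' "$base" "$dir"
```

The ## form removes the longest matching prefix, which is why everything up to the last slash disappears.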


Or you can use a find command to get the files and execute the command for each one:

find path/to/dir* -type f -name '*.txt' -print0 |\
xargs -0 -n1 sh -c 'cat "$0" >> new_dir/"${0##*/}"'

In the above command, find prints the filenames separated by NUL bytes (-print0) and xargs reads a NUL-separated list (-0), so names containing spaces or newlines are passed through intact. With -n1, the command that follows is executed once per argument. The command is wrapped in sh -c 'command' so that the substring removal can be used inside it; there, the argument provided by xargs is available as $0.
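If you do want to drive the loop from the list of file names mentioned in the question, a sketch along the following lines should work. The function name concat_from_list, the list file filelist.txt, and the scratch directories in the demo are all assumptions made for illustration, not names from the question:

```shell
# concat_from_list LIST OUTDIR DIR...
# For each name in LIST, append every existing DIR/name to OUTDIR/name.
concat_from_list() {
    local list="$1" out="$2" name d
    shift 2
    mkdir -p "$out"
    while IFS= read -r name; do
        [ -n "$name" ] || continue        # skip blank lines in the list
        for d in "$@"; do
            # only cat copies that exist, so missing ones cause no errors
            if [ -f "$d/$name" ]; then
                cat "$d/$name" >> "$out/$name"
            fi
        done
    done < "$list"
}

# Small demo in a scratch directory (all names here are made up)
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2" "$tmp/dir3"
echo a > "$tmp/dir1/file1.txt"
echo b > "$tmp/dir1/file2.txt"
echo c > "$tmp/dir2/file2.txt"
echo d > "$tmp/dir3/file2.txt"
printf '%s\n' file1.txt file2.txt > "$tmp/filelist.txt"

concat_from_list "$tmp/filelist.txt" "$tmp/dir_new" \
    "$tmp/dir1" "$tmp/dir2" "$tmp/dir3"
```

The -f test is what lets a single loop handle names that exist in only some of the directories. Since the output is appended with >>, running this twice against the same output directory would duplicate the contents, so start from an empty directory.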

Upvotes: 1
