Lemon

Reputation: 67

Compare two folders containing source files & hardlinks, remove orphaned files

I am looking for a way to compare two folders containing source files and hard links (let's use /media/store/download and /media/store/complete as an example) and then remove orphaned files, meaning files that don't exist in both folders. These files may have been renamed and may be stored in subdirectories.

I'd like to set this up as a cron job so it runs regularly. I just can't work out the logic of the script myself. Could anyone be so kind as to help?

Many thanks

Upvotes: 0

Views: 744

Answers (3)

Lars Lemberg

Reputation: 46

rsync can do what you want, using the --existing, --ignore-existing, and --delete options. You'll have to run it twice, once in each "direction" to clean orphans from both source and target directories.

rsync -avn --existing --ignore-existing --delete /media/store/download/ /media/store/complete
rsync -avn --existing --ignore-existing --delete /media/store/complete/ /media/store/download

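--existing tells rsync not to create files that are missing on the target, so orphans are never copied across

--ignore-existing tells rsync not to update files that already exist on the target

--delete tells rsync to delete files on the target that have no counterpart on the source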

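The trailing slash on the source dir is mandatory for your task: it makes rsync compare the contents of the directory rather than the directory itself. A trailing slash on the target dir makes no difference.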

The 'n' in -avn requests a dry run: rsync reports what it would do without changing anything. I always do a dry run with the -n option first to make sure the command is going to do what I want, ESPECIALLY when using --delete. Once you're confident your command is correct, run it with just -av to actually do the work.
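One caveat: rsync matches files by relative path, so a hard link that was renamed or moved into a subdirectory looks like an orphan on both sides. If that matters, a rough alternative is to look at hard-link counts instead of names. The sketch below is only an illustration and rests on several assumptions: both folders are on the same filesystem, every wanted file has exactly one link in each folder, and nothing else on the filesystem links to these files. The function name list_orphans is made up for this example.

```shell
#!/bin/sh
# list_orphans DIR_A DIR_B   (hypothetical helper, not part of any tool)
# Prints regular files under either directory whose hard-link count is 1,
# i.e. files that have no second name anywhere on the filesystem.
# Assumption: a file present in both folders is a hard link pair, so its
# link count is at least 2 and it is not printed.
list_orphans() {
    find "$1" "$2" -type f -links 1
}

# Review the list first; nothing is deleted here:
#   list_orphans /media/store/download /media/store/complete
#
# Only after checking the output, the same test could drive the removal:
#   find /media/store/download /media/store/complete -type f -links 1 -exec rm -- {} +
```

Because the check uses the link count only, it does not care what the file is called or how deep it sits in either tree. It will miss a file that has two links inside the same folder and none in the other.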

Upvotes: 3

pRAShANT

Reputation: 523

You can also use the "diff" command to list the files that differ between the two folders.
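As a rough sketch of what that could look like, the function below runs a recursive, brief diff and keeps only the "Only in ..." lines, which name files present on one side but not the other. The function name only_in_one is made up for this example, and the -q option is assumed to be available (it is in GNU diff, but it is not required by POSIX). Note that diff compares by name and content, so a renamed hard link would still be reported as missing.

```shell
#!/bin/sh
# only_in_one DIR_A DIR_B   (hypothetical helper name)
# Lists entries that exist in one tree but not the other.
#   -r  recurse into subdirectories
#   -q  report only whether files differ, not the differences themselves
# LC_ALL=C keeps diff's messages in English so the grep pattern matches.
# diff exits with status 1 when the trees differ and grep exits with 1 when
# nothing matches, so the final "|| true" keeps the function from failing.
only_in_one() {
    LC_ALL=C diff -rq "$1" "$2" | grep '^Only in ' || true
}

# Example:
#   only_in_one /media/store/download /media/store/complete
```

This only reports; deciding what to delete is left to whoever reads the list.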

Upvotes: 0

Brian Agnew

Reputation: 272427

Perhaps rsync is of use?

Rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.

Note it has a --delete option

--delete                delete extraneous files from dest dirs

which could help with your specific use case above.

Upvotes: 2
