dAnjou

Reputation: 3923

rsync --delete --files-from=list / dest/ does not delete unwanted files

As the title says, I am trying to sync a folder against a list of files. I hoped this command would delete every file in dest/ that is not on the list, but it didn't.

After some searching, I now know that rsync can't do this.

But I need it, so do you know any way to do it?

PS: The list is created by a Python script, so a solution that uses some Python code would be fine.

EDIT, let's be concrete:

The list looks like this:

/home/max/Musik/Coldplay/Parachutes/Trouble.mp3
/home/max/Musik/Coldplay/Parachutes/Yellow.mp3
/home/max/Musik/Coldplay/A Rush of Blood to the Head/Warning Sign.mp3
/home/max/Musik/Coldplay/A Rush of B-Sides to Your Head/Help Is Around the Corner.mp3
/home/max/Musik/Coldplay/B-Sides (disc 3)/Bigger Stronger.mp3

and the command like this:

rsync --delete --files-from=/tmp/list / /home/max/Desktop/foobar/

This works, but if I delete a line from the list, the corresponding file is not deleted in foobar/.

EDIT 2:

rsync -r --include-from=/tmp/list --exclude=* --delete-excluded / /home/max/Desktop/foobar/

That doesn't work either ...

Upvotes: 35

Views: 22907

Answers (8)

Dark

Reputation: 226

Based on 131's and m4t's answers, I took this approach:

  1. mv $dest $dest2
  2. mkdir $dest
  3. rsync (args incl --files-from...) --link-dest=$dest2 $dest2 $dest
  4. rm -rf $dest2

All operations are nearly free: no copying or temporary space is required beyond filesystem bookkeeping.

You can then run your usual rsync command from $source to $dest, and you will be left with exactly what is in files-from.
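
For a purely local destination, the same rename-and-relink idea can be sketched in Python. This is only a rough sketch under assumptions, not the exact commands above: it swaps rsync --link-dest for os.link, the function name relink_from_list and the ".old" suffix are made up for illustration, and wanted is assumed to hold paths relative to the destination root.

```python
import os
import shutil


def relink_from_list(dest, wanted):
    """Keep only the files named in `wanted` (paths relative to dest).

    Follows the mv / mkdir / hard-link / rm -rf steps, using os.link in
    place of rsync --link-dest. Returns the relative paths that were kept.
    """
    dest = os.path.abspath(dest)
    old = dest + ".old"                 # hypothetical name for the renamed tree
    os.rename(dest, old)                # 1. mv $dest $dest2
    os.mkdir(dest)                      # 2. mkdir $dest
    kept = []
    for rel in wanted:
        rel = rel.lstrip("/")
        src = os.path.join(old, rel)
        if not os.path.isfile(src):
            continue                    # not in the old tree; a later rsync run copies it
        dst = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        os.link(src, dst)               # 3. hard link, no file data is copied
        kept.append(rel)
    shutil.rmtree(old)                  # 4. rm -rf $dest2
    return kept
```

Hard links only work within one filesystem, which holds here because the renamed tree sits next to the new one.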

Upvotes: 0

131

Reputation: 3361

Inspired by m4t's answer, but using rsync for the cleanup:

rsync -r --link-dest=$dest --files-from=filelist.txt user@server:$source/ $temp
rsync -ra --delete --link-dest=$temp $temp/ $dest

Upvotes: 4

Will Sheppard

Reputation: 3509

This is not exactly a solution, but people coming here might find it useful: since rsync 3.1.0 there is a --delete-missing-args option. When you sync two directories using --files-from, it deletes a file from the destination directory if the file is named in the list but no longer exists in the source. You therefore need to keep the deleted files in /tmp/list along with the files you do want copied:

rsync --delete-missing-args --files-from=/tmp/list /source/dir /destination/dir

See the man page for more details.

Upvotes: 13

cpbills

Reputation: 101

I realize this question was asked a long time ago, but I wasn't satisfied with the answer.

Here is how I solved the problem, assuming a playlist created by mpd:

#!/bin/bash

playlist_path="/home/cpbills/.config/mpd/playlists"
playlist="${playlist_path}/${1}.m3u"
music_src="/home/cpbills/files/music"
music_dst="/mnt/sdcard/music/"

if [[ -e "$playlist" ]]; then
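  # Remove old files that are no longer in the playlist
  find "$music_dst" -type f | while IFS= read -r file; do
    # Strip the destination prefix to get the path as listed in the playlist
    name="${file#"$music_dst"}"
    # -x matches whole lines only, so a name cannot match part of a longer entry
    if ! grep -qxF -- "$name" "$playlist"; then
      rm -- "$file"
    fi
  done

  # Remove empty directories, deepest first
  find "$music_dst" -depth -type d -empty -exec rmdir {} \; 2>/dev/null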

  rsync -vu \
      --inplace \
      --files-from="$playlist" \
      "$music_src" "$music_dst"
else
  printf "%s does not exist\n" "$playlist" 1>&2
  exit 1
fi
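
Since the list in the question comes from a Python script, the "remove old files" step could also be done there before calling rsync. The following is a minimal sketch, not part of the script above: the function name prune_unlisted is hypothetical, and wanted is assumed to contain paths relative to the destination directory. It compares whole paths rather than substrings.

```python
import os


def prune_unlisted(dest, wanted):
    """Delete every file under dest whose relative path is not in `wanted`,
    then remove directories left empty. Returns the removed relative paths."""
    keep = {os.path.normpath(p) for p in wanted}
    removed = []
    # Walk bottom-up so children are handled before their parent directory.
    for dirpath, dirnames, filenames in os.walk(dest, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, dest)
            if rel not in keep:
                os.remove(full)
                removed.append(rel)
        # Remove the directory if it is now empty (never dest itself).
        if dirpath != dest and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(removed)
```

After pruning, the usual rsync --files-from run only has to copy new or changed files.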

Upvotes: 0

m4t

Reputation: 193

As you explained, the command

rsync -r --delete --files-from=$FILELIST user@server:/ $DEST/

does not delete content in the destination when an entry has been removed from $FILELIST. A simple solution is to use the following instead:

mkdir -p $DEST
rm -rf $TEMP
rsync -r --link-dest=$DEST --files-from=$FILELIST user@server:/ $TEMP/
rm -r $DEST
mv $TEMP $DEST

This instructs rsync to use an empty destination. Files that are already present in the link-dest-directory are locally hard-linked and not copied. Finally the old destination is replaced by the new one. The first mkdir creates an empty $DEST if $DEST doesn't exist, to prevent rsync error. (The $-variables are assumed to carry the full path to the respective file or directory.)

There is some minor overhead for the hard-linking, but you don't need to mess with complex include/exclude-strategies.

Upvotes: 11

dobrokot

Reputation: 126

Explicitly building an --exclude-from=... list seems to be the only way to synchronize against a list of files.

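import os
import subprocess

# Assumed to be defined by the caller: src, dst, other_params (list of extra
# rsync options), relative_files_list (paths relative to src) and norm_fn
# (a path normalizer such as os.path.normpath).
other_params.append("--exclude-from=-")  # read exclude patterns from stdin

p = subprocess.Popen(
    "rsync -e ssh -zthvcr --compress-level=9 --delete".split() + other_params + [src, dst],
    stdin=subprocess.PIPE,
    text=True,  # write str rather than bytes to stdin
)

if relative_files_list is not None:
    # hack: listing of excluded files seems the only way to delete unwanted files at destination
    files = set(map(norm_fn, relative_files_list))  # set for fast lookups on huge lists
    for path, ds, fs in os.walk(src):
        for f in fs:
            rel_path_f = norm_fn(os.path.relpath(os.path.join(path, f), src))
            if rel_path_f not in files:
                # print('excluding', rel_path_f.replace('\\', '/'))
                p.stdin.write(rel_path_f + "\n")
# Always close stdin so rsync stops waiting for more exclude patterns.
p.stdin.close()
assert p.wait() == 0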

Upvotes: 1

SimonJ

Reputation: 21316

Perhaps you could do this using a list of include patterns instead, and use --delete-excluded (which does as the name suggests)? Something like:

rsync -r --include-from=<patternlistfile> --exclude=* --delete-excluded / dest/

If filenames are likely to contain wildcard characters (*, ? and [) then you may need to modify the Python to escape them:

re.sub(r"([\[*?])", r"\\\1", "abc[def*ghi?klm")
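
Wrapped into a helper, the escaping could be applied to every path before the pattern file is written. This is a small sketch; the function names escape_rsync_pattern and to_patterns are made up for illustration, and it only handles the three wildcard characters mentioned above.

```python
import re


def escape_rsync_pattern(path):
    """Backslash-escape the wildcard characters *, ? and [ in a path."""
    return re.sub(r"([\[*?])", r"\\\1", path)


def to_patterns(paths):
    """Escape every path in the list so each one matches itself literally."""
    return [escape_rsync_pattern(p) for p in paths]


if __name__ == "__main__":
    for pattern in to_patterns(["/music/What? [live].mp3", "/music/plain.mp3"]):
        print(pattern)
```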

Edit: Pattern-based matching works slightly differently to --files-from in that rsync won't recurse into directories that match the exclude pattern, for reasons of efficiency. So if your files are in /some/dir and /some/other/dir then your pattern file needs to look like:

/some/
/some/dir/
/some/dir/file1
/some/dir/file2
/some/other/
/some/other/dir/
/some/other/dir/file3
...
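
A pattern list of this shape can be generated from a plain file list. The sketch below is one possible way to do it, assuming absolute, slash-separated paths; the function name include_patterns is hypothetical, and wildcard escaping is left out to keep it short.

```python
def include_patterns(paths):
    """Return include patterns for each file plus all of its parent
    directories, parents first and without duplicates."""
    patterns = []
    seen = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Emit /a/, /a/b/, ... for every ancestor directory of the file.
        for depth in range(1, len(parts)):
            directory = "/" + "/".join(parts[:depth]) + "/"
            if directory not in seen:
                seen.add(directory)
                patterns.append(directory)
        file_pattern = "/" + "/".join(parts)
        if file_pattern not in seen:
            seen.add(file_pattern)
            patterns.append(file_pattern)
    return patterns


if __name__ == "__main__":
    wanted = ["/some/dir/file1", "/some/dir/file2", "/some/other/dir/file3"]
    print("\n".join(include_patterns(wanted)))
```

Writing the returned lines to a file, one per line, gives something that can be passed to --include-from.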

Alternatively, if all files are in the same directory then you could rewrite the command slightly:

rsync -r --include-from=<patternlistfile> --exclude=* --delete-excluded /some/dir/ dest/

and then your patterns become:

/file1
/file2

Edit: Thinking about it, you could include all directories with one pattern:

/**/

but then you'd end up with the entire directory tree in dest/ which probably isn't what you want. But combining it with -m (which prunes empty directories) should solve that - so the command ends up something like:

rsync -m -r --delete-excluded --include-from=<patternfile> --exclude=* / dest/

and the pattern file:

/**/
/some/dir/file1
/some/other/dir/file3

Upvotes: 24

gahooa

Reputation: 137592

rsync is ideal for keeping directories in sync, among other useful things. If you do have an exact copy on the SOURCE, and want to delete files on the DEST, you can delete them from SOURCE and the rsync --delete option will delete them from DEST also.

However, if you just have an arbitrary list of files you want to delete, I suggest you use SSH to accomplish that:

ssh [email protected] rm /path/to/file1 /path/to/file2

This will execute the rm command on the remote host.

Using Python, you could run:

import subprocess
FileList = ['/path/to/file1', '/path/to/file2']
subprocess.call(['ssh', '[email protected]', 'rm'] + FileList)

~enjoy

Upvotes: -1
