Reputation: 26046
I would like to find the file names of the files that change in the test case at the bottom of this post.
It outputs
before
d41d8cd98f00b204e9800998ecf8427e FFF/c.txt
d41d8cd98f00b204e9800998ecf8427e FFF/a.txt
d41d8cd98f00b204e9800998ecf8427e FFF/b.txt
after
d41d8cd98f00b204e9800998ecf8427e FFF/c.txt
d41d8cd98f00b204e9800998ecf8427e FFF/d.txt
d8e8fca2dc0f896fd7cb4cb0031ba249 FFF/b.txt
Question
How do I get the file names of the files that have changed?
In this case a.txt has been deleted, d.txt has been added, and b.txt has a changed md5sum.
#!/bin/bash
mkdir -p FFF
touch FFF/a.txt
rm -f FFF/b.txt
touch FFF/b.txt
touch FFF/c.txt
rm -f FFF/d.txt
echo "before"
find FFF -name "*.txt" -exec md5sum '{}' \;
echo ""
# makes some changes that I want to catch
rm -f FFF/a.txt
echo "test" > FFF/b.txt
touch FFF/d.txt
echo "after"
find FFF -name "*.txt" -exec md5sum '{}' \;
Upvotes: 3
Views: 588
Reputation: 765
Another alternative is to use a file system watcher such as inotify, dnotify, fam, or gamin. Examples:
inotifywait -m /home/david
dnotify -all -r /home/david
Add options to perform certain commands or pipe their output to a read/process loop.
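The read/process loop mentioned above could look roughly like the following Python sketch. It assumes inotify-tools is installed and that inotifywait prints its default line format (watched directory, comma-separated event names, file name). The helper names parse_event and watch are made up for illustration, and the parser assumes the watched path itself contains no spaces.

```python
import subprocess


def parse_event(line):
    """Split one default-format inotifywait line into (directory, events, name)."""
    directory, events, *rest = line.rstrip("\n").split(" ", 2)
    name = rest[0] if rest else ""  # empty when the event concerns the watched path itself
    return directory, events.split(","), name


def watch(path):
    """Yield parsed events from `inotifywait -m path` until the process exits."""
    proc = subprocess.Popen(
        ["inotifywait", "-m", path],
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in proc.stdout:
        yield parse_event(line)
```

A caller would then iterate over watch("/home/david") and react to events such as CREATE, DELETE, or MODIFY as they arrive.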
Upvotes: 0
Reputation: 107090
Okay, what's your setup?
diff -r will show you what was added, deleted, and modified in the directories involved. You may have to use diffdir or dirdiff on Solaris.
find $dir -mtime will show you files whose timestamp is newer (or older) than the -mtime value. For example:
$ find $dir -mtime +3
will find files more than three days old, while:
$ find $dir -mtime -3
will find files less than three days old. Some systems also have -mmin for checking in minutes.
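For comparison, here is a minimal Python sketch of the same idea. It roughly mirrors find $dir -mtime -N by comparing each file's modification time against a cutoff. The function name modified_within is hypothetical, and the day arithmetic is simplified to N times 24 hours.

```python
import os
import time


def modified_within(root, days, now=None):
    """Return files under root that were modified less than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > cutoff:
                recent.append(path)
    return sorted(recent)
```

Calling modified_within("FFF", 3) would list the files touched in the last three days. Like find -mtime, it cannot report files that were deleted.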
If you're looking for changes that have taken place in some random snapshot of time, then I suggest you look into using a version control system. A good version control system will give you the flexibility you want without having to reinvent the wheel. A single command (like svn log -rPREV:HEAD -v) can give you everything you need.
The two most popular version control systems are Subversion and Git. I find Subversion to be easier to use and set up, but Git is better if you have to share your code with others and don't have a central server. Bazaar has a nice interface and is also fairly simple. I'm just starting to play with it.
Upvotes: 2
Reputation: 86974
If you store the output of both find commands in temp files, you can run diff on them to figure out which files have changed. A sample output would be:
[me@home]$ diff -u ori.temp new.temp | tail -n+4 | grep "^[-+]" | sort -k2
-d41d8cd98f00b204e9800998ecf8427e FFF/a.txt
-d41d8cd98f00b204e9800998ecf8427e FFF/b.txt
+d41d8cd98f00b204e9800998ecf8427e FFF/d.txt
+d8e8fca2dc0f896fd7cb4cb0031ba249 FFF/b.txt
You should be able to parse that output to determine the changed files. The 2nd column gives you the file names. Lines that start with - are deletions (unless a corresponding + exists, which means it's an edit), while lines that start with + are additions.
The trailing sort -k2 sorts the output by the 2nd column, making it easier to locate edits (a file name that appears twice).
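The same classification rule can also be applied to the two listings directly, without going through diff. The following is a minimal Python sketch, assuming each listing is md5sum text-mode output (checksum, whitespace, file name). The function names parse_listing and classify are illustrative only.

```python
def parse_listing(text):
    """Map each file name to its checksum, given md5sum-style lines."""
    listing = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        checksum, name = line.split(None, 1)
        listing[name] = checksum
    return listing


def classify(before, after):
    """Label every file whose presence or checksum differs between two listings."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        if name not in after:
            changes[name] = "removed"
        elif name not in before:
            changes[name] = "added"
        elif before[name] != after[name]:
            changes[name] = "modified"
    return changes
```

Feeding it the contents of the two temp files, for example classify(parse_listing(old_text), parse_listing(new_text)), yields a dictionary keyed by file name. Files whose checksum is unchanged are simply left out.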
Parsing the output of diff can be done quite easily with a few lines of awk or even pure bash. Unfortunately, my bash/awk-fu is not up to par, so here's my take on your script, which uses a smattering of Python.
#!/bin/bash
# set up initial state
mkdir -p FFF && touch FFF/a.txt && rm -f FFF/b.txt
touch FFF/b.txt FFF/c.txt && rm -f FFF/d.txt
# capture current state
TMP_ORI="$RANDOM.ori.tmp"
find FFF -name "*.txt" -exec md5sum '{}' \; > $TMP_ORI
# makes some changes that I want to catch
rm -f FFF/a.txt && echo "test" > FFF/b.txt && touch FFF/d.txt
# capture new state
TMP_NEW="$RANDOM.new.tmp"
find FFF -name "*.txt" -exec md5sum '{}' \; > $TMP_NEW
# run diff and parse output
diff -u $TMP_ORI $TMP_NEW | tail -n+4 | grep "^[-+]" | python -c '
import fileinput
modes = {"+": "added", "-": "removed"}
visited = {}
for line in fileinput.input():  # for each line from stdin
    checksum, file = line.split()  # split the columns
    if file in visited:
        visited[file] = "modified"  # file appeared before
    else:
        visited[file] = modes[checksum[0]]  # map "+/-" to "added/removed"
for file, mode in visited.items():  # print results
    print("%s\t%s" % (file, mode))
'
rm $TMP_ORI $TMP_NEW # delete temp files
Running this script will give the following output:
[me@home] ./sandras_script.sh
FFF/d.txt added
FFF/a.txt removed
FFF/b.txt modified
Upvotes: 2
Reputation: 386362
There are several options to find that will find files that have changed since a given point in time. For example, you could touch a temporary file at the start of the script, then run find -newer tmpfile to find all files that have been modified since you touched that temporary file.
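As a rough Python counterpart to find -newer tmpfile, the sketch below compares each file's modification time with that of a marker file. The name newer_than is made up for this example. Note that a timestamp comparison can only reveal added or modified files: a deleted file such as a.txt leaves nothing behind to inspect.

```python
import os


def newer_than(root, marker):
    """Return files under root whose mtime is later than the marker file's."""
    marker_mtime = os.stat(marker).st_mtime_ns
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime_ns > marker_mtime:
                newer.append(path)
    return sorted(newer)
```

The marker file is never reported, even if it lives under root, because it is not newer than itself.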
Upvotes: 4
Reputation: 468191
Identifying files that have changed between particular states by their hashes (and presence in the directory structure) is essentially what the version control system git does anyway, so why not just use that? Here's a slight modification of your script, which adds the following steps: initialize the current directory as a git repository, record the state of the directory as a commit before and after the changes, and run git diff to show the changes between those two commits.
The modified script looks like:
#!/bin/bash
# Initialize the current directory as a git repository:
git init
mkdir -p FFF
touch FFF/a.txt
rm -f FFF/b.txt
touch FFF/b.txt
touch FFF/c.txt
rm -f FFF/d.txt
echo "before"
find FFF -name "*.txt" -exec md5sum '{}' \;
echo ""
# Record the state of the directory as a new commit:
git add -A .
git commit -m "Initial state"
# makes some changes that I want to catch
rm -f FFF/a.txt
echo "test" > FFF/b.txt
touch FFF/d.txt
echo "after"
find FFF -name "*.txt" -exec md5sum '{}' \;
# Record the modified state of the directory as a second commit:
git add -A .
git commit -m "New state"
# Output the difference between those two commits:
git diff --name-only HEAD^ HEAD
The output from that script is then:
Initialized empty Git repository in /home/mark/tmp/foobar/.git/
before
d41d8cd98f00b204e9800998ecf8427e FFF/b.txt
d41d8cd98f00b204e9800998ecf8427e FFF/c.txt
d41d8cd98f00b204e9800998ecf8427e FFF/a.txt
[master (root-commit) 8a6d1d9] Initial state
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 FFF/a.txt
create mode 100644 FFF/b.txt
create mode 100644 FFF/c.txt
after
d41d8cd98f00b204e9800998ecf8427e FFF/d.txt
d8e8fca2dc0f896fd7cb4cb0031ba249 FFF/b.txt
d41d8cd98f00b204e9800998ecf8427e FFF/c.txt
[master 810b0f5] New state
2 files changed, 1 insertions(+), 0 deletions(-)
rename FFF/{a.txt => d.txt} (100%)
FFF/a.txt
FFF/b.txt
FFF/d.txt
The last 3 lines are the output from the git diff command.
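If you also want to know whether each file was added, removed, or modified, git diff has a --name-status option that prefixes every path with a status letter. The following Python sketch is one way to consume that output. It assumes git is on the PATH and that the repository has at least two commits. The helper names parse_name_status and changed_files are hypothetical, and --no-renames is passed so that a delete plus an add is not collapsed into a rename.

```python
import subprocess

# Status letters printed by `git diff --name-status`
LABELS = {"A": "added", "D": "removed", "M": "modified"}


def parse_name_status(text):
    """Turn `git diff --name-status` output into (label, path) pairs."""
    changes = []
    for line in text.splitlines():
        if not line:
            continue
        status, path = line.split("\t", 1)
        changes.append((LABELS.get(status[0], status), path))
    return changes


def changed_files(repo="."):
    """Compare the last two commits of the repository at `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "diff", "--name-status", "--no-renames", "HEAD^", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_status(result.stdout)
```

Run after the second commit in the script above, changed_files() should report FFF/a.txt as removed, FFF/b.txt as modified, and FFF/d.txt as added.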
Upvotes: 2