Reputation: 25986
I have a file structure that looks like this
./501.res/1.bin
./503.res/1.bin
./503.res/2.bin
./504.res/1.bin
and I would like to find the file path to the .bin
file in each directory which have the highest number as filename. So the output I am looking for would be
./501.res/1.bin
./503.res/2.bin
./504.res/1.bin
The highest number a file can have is 9.
Question
How do I do that in BASH?
I have come as far as find .|grep bin|sort
Upvotes: 7
Views: 2817
Reputation: 1837
Tested:
find . -type d -name '*.res' | while read dir; do
find "$dir" -maxdepth 1 | sort -n | tail -n 1
done
Upvotes: 2
Reputation: 46846
What about using awk
? You can get the FIRST occurrence really simply:
[ghoti@pc ~]$ cat data1
./501.res/1.bin
./503.res/1.bin
./503.res/2.bin
./504.res/1.bin
[ghoti@pc ~]$ awk 'BEGIN{FS="."} a[$2] {next} {a[$2]=1} 1' data1
./501.res/1.bin
./503.res/1.bin
./504.res/1.bin
[ghoti@pc ~]$
To get the last occurrence you could pipe through a couple of sorts:
[ghoti@pc ~]$ sort -r data1 | awk 'BEGIN{FS="."} a[$2] {next} {a[$2]=1} 1' | sort
./501.res/1.bin
./503.res/2.bin
./504.res/1.bin
[ghoti@pc ~]$
Given that you're using "find" and "grep", you could probably do this:
find . -name \*.bin -type f -print | sort -r | awk 'BEGIN{FS="."} a[$2] {next} {a[$2]=1} 1' | sort
How does this work?
The find
command has many useful options, including the ability to select your files by glob, select the type of file, etc. Its output you already know, and that becomes the input to sort -r
.
First, we sort our input data in reverse (sort -r
). This insures that within any directory, the highest numbered file will show up first. That result gets fed into awk. FS is the field separator, which makes $2
into things like "/501", "/502", etc. Awk scripts have sections in the form of condition {action}
which get evaluated for each line of input. If a condition is missing, the action runs on every line. If "1" is the condition and there is no action, it prints the line. So this script is broken out as follows:
a[$2] {next}
- If the array a
with the subscript $2 (i.e. "/501") exists, just jump to the next line. Otherwise...{a[$2]=1}
- set the array a subscript $2 to 1, so that in future the first condition will evaluate as true, then...1
- print the line.The output of this awk script will be the data you want, but in reverse order. The final sort
puts things back in the order you'd expect.
Now ... that's a lot of pipes, and sort can be a bit resource hungry when you ask it to deal with millions of lines of input at the same time. This solution will be perfectly sufficient for small numbers of files, but if you're dealing with large quantities of input, let us know, and I can come up with an all-in-one awk solution (that will take longer than 60 seconds to write).
UPDATE
Per Dennis' sage advice, the awk script I included above could be improved by changing it from
BEGIN{FS="."} a[$2] {next} {a[$2]=1} 1
to
BEGIN{FS="."} $2 in a {next} {a[$2]} 1
While this is functionally identical, the advantage is that you simply define array members rather than assigning values to them, which may save memory or cpu depending on your implementation of awk. At any rate, it's cleaner.
Upvotes: 2
Reputation: 66488
And here is one liner
find . -mindepth 1 -type d | sort | sed -e "s/.*/ls & | sort | tail -n 1 | xargs -I{} echo &\/{}/" | bash
Upvotes: 0
Reputation: 360035
Globs are guaranteed to be expanded in lexical order.
for dir in ./*/
do
files=($dir/*) # create an array
echo "${files[@]: -1}" # access its last member
done
Upvotes: 3
Reputation: 7468
If invoking a shell from within find is an option try this
find * -type d -exec sh -c "echo -n './'; ls -1 {}/*.bin | sort -n -r | head -n 1" \;
Upvotes: 0
Reputation: 66488
I came up with someting like this:
for dir in $(find . -mindepth 1 -type d | sort); do
file=$(ls "$dir" | sort | tail -n 1);
[ -n "$file" ] && (echo "$dir/$file");
done
Maybe it can be simpler
Upvotes: 1