castis

Reputation: 8223

Find files not in numerical list

I have a giant list of files that are all currently numbered in sequential order with different file extensions.

3400.PDF
3401.xls
3402.doc

There are roughly 1400 of these files in a directory. What I would like to know is how to find numbers that do not exist in the sequence.

I've tried to write a bash script for this but my bash-fu is weak.

I can get a list of the files without their extensions by using

FILES=$(ls -1 | sed -e 's/\..*$//')

but a few places I've seen say to not use ls in this manner. (15 days after asking, I couldn't relocate where I read this, if it existed at all...)

I can also get the first file via ls | head -n 1, but I'm pretty sure I'm making this a whole lot more complicated than I need to.
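On the concern about parsing ls output: a minimal sketch of getting the same extension-free names from a plain glob and parameter expansion instead. The function name list_stems is made up for this illustration, and it assumes it is run from the directory that holds the files.

```bash
# list_stems is a hypothetical helper name, not part of the original question.
# Print every name in the current directory with everything from the
# first dot onward removed (same effect as sed -e 's/\..*$//').
list_stems() {
    local f
    for f in *; do
        # Skip the literal '*' that is left behind when the directory is empty.
        [ -e "$f" ] || continue
        # ${f%%.*} strips the longest suffix that starts with a dot.
        printf '%s\n' "${f%%.*}"
    done
}
```

Calling list_stems in the directory prints one stem per line, in the shell's glob sort order.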

Upvotes: 0

Views: 88

Answers (3)

jthill

Reputation: 60275

ls [0-9]* \
| awk -F. '  !seen[$1]++ { ++N }
             END         { for (n=1; N ; ++n) if (!seen[n]) print n; else --N }
'

This stops once it has passed the highest-numbered file. To keep checking up to at least 3000, change the loop condition to N>0 || n<3000.
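Note that the loop above starts at n=1, so for files that begin at 3400 it would also report 1 through 3399. A hedged variation, assuming you only care about holes between the lowest and highest existing numbers (find_gaps is a made-up name, and the glob replaces ls):

```bash
# find_gaps is a hypothetical helper name used only for this sketch.
# Report numbers missing between the smallest and largest numeric
# file stems in the current directory.
find_gaps() {
    # Emit one file name per line, then let awk look for the holes.
    printf '%s\n' [0-9]* |
    awk -F. '
        # Only consider names whose part before the first dot is all digits.
        $1 ~ /^[0-9]+$/ {
            n = $1 + 0
            seen[n] = 1
            if (min == "" || n < min) min = n
            if (max == "" || n > max) max = n
        }
        END {
            if (min == "") exit          # no numeric names at all
            for (n = min; n <= max; n++)
                if (!(n in seen)) print n
        }'
}
```

Run it from the directory containing the files. It cannot detect numbers missing before the first or after the last existing file, since it infers the range from what is there.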

Upvotes: 0

Thomas

Reputation: 181745

Based on a deleted answer that was largely correct:

for i in $(seq 1 1400); do ls $i.* > /dev/null 2>&1 || echo $i; done
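The same idea can be sketched without starting ls once per number, by letting the shell expand the glob and testing whether the first result exists. The name missing_in_range and its two arguments (first and last number to check) are assumptions made for this example.

```bash
# missing_in_range is a hypothetical helper name used only for this sketch.
# Usage: missing_in_range FIRST LAST
# Prints every number in FIRST..LAST that has no file named NUMBER.<something>.
missing_in_range() {
    local first=$1 last=$2 i
    i=$first
    while [ "$i" -le "$last" ]; do
        # Replace the positional parameters with the glob's matches.
        # If nothing matches, $1 is the literal pattern (or empty under nullglob).
        set -- "$i".*
        [ -e "$1" ] || echo "$i"
        i=$((i + 1))
    done
}
```

For the numbering in the question, something like missing_in_range 3400 4800 would be the call, with the upper bound adjusted to the real last file.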

Upvotes: 1

Tom Fenech

Reputation: 74605

Sounds like you want to do something like this:

shopt -s nullglob
for i in {1..1400}; do
    files=("$i".*)
    (( ${#files[@]} > 0 )) || echo "no files beginning with $i"
done

This uses a glob to make an array of all files 1.*, 2.* etc. It then compares the length of the array to 0. If there are no files matching the pattern, the message is printed.

Enabling nullglob is important because otherwise, when no files match, the array will contain one element: the literal pattern itself, such as '1.*'.
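To see the difference nullglob makes, here is a small sketch that counts the matches for a given number with the option off and then on. The helper name count_matches is made up for this illustration.

```bash
# count_matches is a hypothetical helper name used only for this sketch.
# Print how many array elements the glob "<number>.*" produces
# in the current directory.
count_matches() {
    local files
    files=("$1".*)
    echo "${#files[@]}"
}

shopt -u nullglob
count_matches 1    # if no 1.* file exists, prints 1: the array holds the literal '1.*'

shopt -s nullglob
count_matches 1    # if no 1.* file exists, prints 0: the unmatched glob expands to nothing
```

When at least one matching file exists, both calls print the same count, so the option only changes the no-match case.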

Upvotes: 3
