Reputation: 1913
I am trying to determine whether each line of output from running ls -al is a file or a directory, and whether or not it is hidden, and to count how many of each type there are.
EDIT: It is imperative that I do not use find.
#!/bin/bash
#declare four different regex statements that match files, hidden files, directories and hidden directories (excluding . and ..)
#based on the output of each line of running ls -al
re_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
re_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
#declare four different counters for each type
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0
#read through the output of ls -al line by line, assigning x the value of each line
ls -al $1 | while read x; do
    #test if each line matches each of the regex statements, if it does then increment the relevant counter
    if [[ $x =~ $re_file ]] ; then
        file_count+=1
    elif [[ $x =~ $re_hidden_file ]] ; then
        hidden_file_count+=1
    elif [[ $x =~ $re_directory ]] ; then
        directory_count+=1
    elif [[ $x =~ $re_hidden_directory ]] ; then
        hidden_directory_count+=1
    else
        echo "!!!"
    fi
done
total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"
Currently the script outputs !!! for every line of ls -al, because none of the regex statements match, and all of the counter variables remain at 0. Here's an example of the input (though Bash removes the extra spaces used for padding before the regex checks are done).
drwx--x--x 37 username groupname 4096 Jan 8 14:37 .
drwxr-xr-x 235 root root 4096 Nov 15 12:16 ..
drwx------ 3 username groupname 4096 Oct 27 14:35 .adobe
-rw------- 1 username groupname 14458 Dec 5 20:24 .bash_history
-rw------- 1 username groupname 2680 Sep 30 16:12 .bash_profile
-rw------- 1 username groupname 1210 Oct 7 09:40 .bashrc
drwx------ 12 username groupname 4096 Dec 6 15:24 .cache
drwxr-xr-x 17 username groupname 4096 Jan 8 14:37 .config
drwx------ 4 username groupname 4096 Dec 5 17:51 dir1
drwx------ 2 username groupname 4096 Nov 23 12:26 dir2
...
I have tested the Regex on an online Regex checker and they evaluate as I would like them to. I assume this is a Bash-specific problem. Any help is appreciated.
Upvotes: 1
Views: 199
Reputation: 104062
You should not parse ls to get files. Use globbing instead, or find with NUL termination.
The problem is that ls produces ambiguous output for file names that are otherwise legal. Consider:
$ touch a$'\t'b
$ touch a$'\n'b
$ ls -l a*
-rw-r--r-- 1 andrew wheel 0 Jan 8 08:25 a?b
-rw-r--r-- 1 andrew wheel 0 Jan 8 08:26 a?b
The unprintable characters \t and \n are both replaced with ?, which makes those two files indistinguishable in the ls output.
The same will happen with trailing spaces:
$ touch "a b c "
$ touch "a b c "
$ ls -al a\ b*
-rw-r--r-- 1 andrew wheel 0 Jan 8 08:44 a b c
-rw-r--r-- 1 andrew wheel 0 Jan 8 08:44 a b c
Now consider using find:
$ find . -maxdepth 1 -name "a*" -print0 | xargs -0 printf "'%s'\n"
'./a b'
'./a
b'
'./a b c '
'./a b c '
Or just globbing:
$ for fn in a*; do printf "'%s'\n" "$fn"; done
'a b'
'a
b'
'a b c '
'a b c '
If you want totals for files and directories, including hidden ones, loop over a second glob pattern, .*, as well:
file_count=0
hidden_file_count=0
regular_directory_count=0
hidden_directory_count=0
echo "=====regular files and directories:"
for fn in *; do
    printf "'%s'\n" "$fn"
    if [ -d "$fn" ]; then
        regular_directory_count=$((regular_directory_count+1))
    else
        file_count=$((file_count+1))
    fi
done
echo "====hidden files and direcotries:"
for fn in .*; do
    printf "'%s'\n" "$fn"
    if [ -d "$fn" ]; then
        hidden_directory_count=$((hidden_directory_count+1))
    else
        hidden_file_count=$((hidden_file_count+1))
    fi
done
printf "Regular files: %s regular directories: %s\n" $file_count $regular_directory_count
printf "Hidden files: %s hidden directories: %s\n" $hidden_file_count $hidden_directory_count
tf=$((hidden_file_count+file_count))
td=$((hidden_directory_count+regular_directory_count))
printf "Total files: %s total directories: %s\n" $tf $td
Given:
$ ls -la
total 0
drwxr-xr-x 9 andrew wheel 306 Jan 8 11:07 .
drwxrwxrwt 92 root wheel 3128 Jan 8 10:58 ..
drwxr-xr-x 2 andrew wheel 68 Jan 8 11:07 .hidden dir
-rw-r--r-- 1 andrew wheel 0 Jan 8 11:26 .hidden file
-rw-r--r-- 1 andrew wheel 0 Jan 8 11:26 a?b
-rw-r--r-- 1 andrew wheel 0 Jan 8 11:26 a?b
-rw-r--r-- 1 andrew wheel 0 Jan 8 11:26 a b c
-rw-r--r-- 1 andrew wheel 0 Jan 8 11:26 a b c
drwxr-xr-x 2 andrew wheel 68 Jan 8 11:07 regular dir
Run that and you get:
=====regular files and directories:
'a b'
'a
b'
'a b c '
'a b c '
'regular dir'
====hidden files and direcotries:
'.'
'..'
'.hidden dir'
'.hidden file'
Regular files: 4 regular directories: 1
Hidden files: 1 hidden directories: 3
Total files: 5 total directories: 4
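One edge case the two loops do not cover: when a glob matches nothing, bash leaves the pattern as a literal string, so in an empty directory the loop body still runs once and a nonexistent entry named * gets counted as a file. A minimal sketch of the difference, assuming bash's nullglob option is acceptable in your script (the temporary directory and the variable names without and with are made up for illustration):

```bash
#!/bin/bash
# Work in an empty throwaway directory (illustration only).
empty_dir=$(mktemp -d)
cd "$empty_dir" || exit 1

# Default behaviour: the unmatched pattern stays as the literal string '*'.
without=0
for fn in *; do
    without=$((without + 1))
done

# With nullglob, an unmatched pattern expands to nothing, so the loop is skipped.
shopt -s nullglob
with=0
for fn in *; do
    with=$((with + 1))
done
shopt -u nullglob

echo "without nullglob: $without, with nullglob: $with"

# Clean up the throwaway directory.
cd / && rmdir "$empty_dir"
```

An alternative that avoids changing shell options is to test each name with [ -e "$fn" ] inside the loop and skip it when the test fails.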
If you want to exclude the . and .. hidden directories, you can set GLOBIGNORE=".:.." prior to using the .* glob pattern.
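As a rough illustration of the GLOBIGNORE suggestion, the sketch below creates a throwaway directory and expands .* with GLOBIGNORE set. The directory and the file names (.hidden_file, .hidden_dir, visible_file) are hypothetical and exist only for this demonstration.

```bash
#!/bin/bash
# Work in a throwaway directory with made-up entries (illustration only).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .hidden_file visible_file
mkdir .hidden_dir

# While GLOBIGNORE is set to a non-empty value, bash never returns . or ..
# from a glob expansion.
GLOBIGNORE=".:.."
hidden=(.*)
printf "'%s'\n" "${hidden[@]}"
echo "${#hidden[@]} hidden entries"    # prints "2 hidden entries"

# Unsetting GLOBIGNORE restores the default globbing behaviour.
unset GLOBIGNORE

# Clean up the throwaway directory.
cd / && rm -rf "$demo_dir"
```

Be aware that setting GLOBIGNORE to a non-empty value also enables dotglob, so a plain * glob run while it is set will match hidden names too. That is why the assignment belongs after the * loop and before the .* loop, or should be unset once it is no longer needed.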
Upvotes: 2
Reputation: 140276
Took me a while but got it to work.
My approach: avoid parsing the output of ls -l, especially since you don't need it here. Enable the dotglob option (using shopt) so that * in the for loop also sees hidden objects, then test each object for its type.
Also: a+=1 doesn't do what you think it does. It just appends 1 to the end of the string!
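To make the += point concrete, here is a minimal sketch contrasting string append with two ways of doing an arithmetic increment. The variable names count, n and i are made up for illustration.

```bash
#!/bin/bash
# Without the integer attribute, += concatenates strings.
count=0
count+=1
echo "string append: $count"    # prints "string append: 01"

# Arithmetic increment: do the addition inside (( )) or $(( )).
n=0
((n += 1))
n=$((n + 1))
echo "arithmetic: $n"           # prints "arithmetic: 2"

# Alternatively, give the variable the integer attribute so that += adds.
declare -i i=0
i+=1
echo "declare -i: $i"           # prints "declare -i: 1"
```

The $((...)) form is the most portable of these, which is why the script that follows uses it.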
#!/bin/bash
#declare a regex that matches hidden names, i.e. names that start with a dot
re_hidden_file='^\..*'
#declare four different counters for each type
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0
# enable hidden files/directories
shopt -s dotglob
#loop over every entry in the current directory (dotglob makes * include hidden ones, but never . or ..)
for x in * ; do
    #test whether the entry is a directory and whether its name is hidden, then increment the relevant counter
    if [ -d "$x" ] ; then
        if [[ "$x" =~ $re_hidden_file ]] ; then
            hidden_directory_count=$((hidden_directory_count+1))
        else
            directory_count=$((directory_count+1))
        fi
    else
        if [[ "$x" =~ $re_hidden_file ]] ; then
            hidden_file_count=$((hidden_file_count+1))
        else
            file_count=$((file_count+1))
        fi
    fi
done
total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"
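A side effect of looping over a glob instead of piping ls into while is that the counters survive the loop. By default bash runs each part of a pipeline in a subshell, so variables changed inside a piped while loop are lost when the loop ends, even if the increments themselves are correct. A minimal sketch of the difference (the variable names piped and kept, and the three-line input, are made up for illustration):

```bash
#!/bin/bash
# Make the default explicit: without lastpipe, every pipeline stage is a subshell.
shopt -u lastpipe

# Pipeline: the while loop runs in a subshell, so its increments are discarded.
piped=0
printf '%s\n' a b c | while read -r line; do
    piped=$((piped + 1))
done
echo "after pipeline: $piped"                # prints "after pipeline: 0"

# Process substitution: the loop runs in the current shell and the count is kept.
kept=0
while read -r line; do
    kept=$((kept + 1))
done < <(printf '%s\n' a b c)
echo "after process substitution: $kept"     # prints "after process substitution: 3"
```

A for loop over a glob, as in the script above, has no pipeline at all, so it avoids the problem without needing process substitution.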
Upvotes: 2