Nicholas Norris

Reputation: 25

Find file with largest number of lines in single directory

I'm trying to write a function that prints only the file with the largest number of lines in a directory (not in any subdirectories). I've been asked to use the wc command, but I don't understand how to read each file individually and then sort the results to find the largest. Here is what I have so far:

#!/bin/bash

function largest_file {    # not named "sort", which would shadow the sort command
[ $# -ne 1 ] && echo "Invalid number of arguments" >&2 && exit 1
[ ! -d "$1" ] && echo "Invalid input: not a directory" >&2 && exit 1
# Insert function body here
}

# prompt if wanting current directory
# if yes
    # largest_file "$PWD"
    # if no
        # largest_file "$directory"

Upvotes: 1

Views: 1284

Answers (3)

pjh

Reputation: 8164

This solution is almost pure Bash (wc is the only external command used):

shopt -s dotglob    # Include filenames with initial '.' in globs
shopt -s nullglob   # Make globs produce nothing when nothing matches

dir=$1

maxlines=-1
maxfile=

for file in "$dir"/* ; do
    [[ -f $file ]] || continue      # Skip non-files
    [[ -L $file ]] && continue      # Skip symlinks

    numlines=$(wc -l < "$file")

    if (( numlines > maxlines )) ; then
        maxfile=$file
        maxlines=$numlines
    fi
done

[[ -n "$maxfile" ]] && printf '%s\n' "$maxfile"

Remove the shopt -s dotglob if you don't want to process files whose names begin with a dot. Remove the [[ -L $file ]] && continue if you want to process symlinks to files.

This solution should handle all filenames (ones containing spaces, glob characters, or newlines, ones beginning with '-', ...). However, it runs wc once per file, so on directories with large numbers of files it may be unacceptably slow compared with solutions that feed many files to wc at once.
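For comparison, here is a rough sketch of the "feed many files to wc at once" approach mentioned above. The function name largest_by_lines is made up for illustration, -maxdepth is a GNU/BSD find extension rather than POSIX, and unlike the loop above this version does not cope with filenames that contain newlines:

```shell
#!/bin/bash
# Hypothetical helper: print the regular file directly inside directory $1
# (default: current directory) that has the most lines.
largest_by_lines() {
    local dir=${1:-.}
    # "-exec ... {} +" passes many files to each wc invocation.
    find "$dir" -maxdepth 1 -type f -exec wc -l {} + |
        awk '
            NF == 2 && $2 == "total" { next }   # skip wc summary lines
            !found || $1 + 0 > max {
                found = 1; max = $1 + 0
                name = $0; sub(/^ *[0-9]+ /, "", name)   # strip the count
            }
            END { if (found) print name }'
}
```

Because find always prefixes each path with the starting directory, a line whose name is exactly "total" can only be the summary that wc prints, so it is safe to drop. Symlinks are skipped by -type f, and dotfiles are included.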

Upvotes: 1

oliv

Reputation: 13259

Use a function like this:

my_custom_sort() {
   for i in "${1+$1/}"*; do
     [[ -f "$i" ]] && wc -l "$i"
   # -f2- keeps the whole name, even if it contains spaces
   done | sort -n | tail -n1 | cut -d" " -f2-
}

Call it with or without a directory (without one, it uses the current directory). The result includes the directory prefix that was passed in:

my_custom_sort /tmp
/tmp/helloworld.txt

Upvotes: 0

Ken Y-N

Reputation: 15008

How about this:

wc -l * | sort -nr | head -2 | tail -1

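wc -l counts the lines in each file (you get an error for directories, though). sort -nr then sorts in reverse order, treating the first column as a number. Because wc prints a total line when it is given more than one file, and that line normally sorts first, head -2 | tail -1 takes the first two lines and keeps the second.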

wc -l * 2>/dev/null | sort -nr | head -2 | tail -1

The 2>/dev/null discards the error messages, if you want neater output.
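One caveat with picking the second line: if every other file is empty, the largest file and the total line have the same count, so their order is decided by how sort breaks ties rather than by the count. A possible variation, sketched here with a made-up function name (largest_in_dir), drops the total line explicitly and prints only the filename:

```shell
#!/bin/bash
# Hypothetical helper: print the name of the entry in directory $1 with the
# most lines. Runs in a subshell so the cd does not affect the caller.
largest_in_dir() (
    cd -- "$1" || exit 1
    # The ./ prefix means a line ending in a bare "total" can only be the
    # summary from wc, never a real file.
    wc -l ./* 2>/dev/null |
        grep -v '^ *[0-9]* total$' |
        sort -nr | head -n 1 |
        sed 's|^ *[0-9]* \./||'     # strip the count and the ./ prefix
)
```

Like the one-liner, this skips dotfiles and does not handle filenames containing newlines.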

Upvotes: 1
