JM88

Reputation: 477

Finding text files with less than 2000 rows and deleting them

I have a lot of text files, each with just one column.

Some of the text files have 2000 lines of numbers, and others have fewer than 2000 lines of numbers.

I want to delete all the text files with fewer than 2000 lines of numbers in them.

EXTRA INFO

The files that have fewer than 2000 lines of numbers are not empty: they are padded with line breaks up to row 2000. Also, my files have complicated names such as Nameofpop_chr1_window1.txt

I tried using awk to count the lines of a text file first, but because every file is padded with line breaks, I get the same result, 2000, for every file.

awk 'END { print NR }' Nameofpop_chr1_window1.txt

Thanks in advance.

Upvotes: 3

Views: 379

Answers (3)

John B

Reputation: 3646

You can use Bash:

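for f in *.txt; do    # assumes the files end in .txt; adjust the glob as needed
    n=0
    # Count only non-empty lines; the || test also catches a last line with no newline.
    while read -r line || [[ -n $line ]]; do
        [[ -n $line ]] && ((n++))
    done < "$f"
    (( n < 2000 )) && rm -- "$f"
done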

Upvotes: 0

Saddam Abu Ghaida

Reputation: 6749

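You can use

expr $(sort -u filename | wc -l) - 1

which counts the distinct lines and subtracts one for the empty line (so it undercounts if any number appears more than once), or

grep -c -v '^$' filename

which counts the non-empty lines directly. Either gives you a count per file, and based on that you can decide what to do.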
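As a rough sketch of the "decide what to do" step, the count can be wrapped in a loop that only lists the short files. The function names count_nonempty and list_short_files are made up for this example, and the pattern treats whitespace-only lines as empty too, which is an assumption about the data:

```shell
# Count the lines of one file that contain something other than whitespace.
# grep -c prints the number of selected lines; -v inverts the blank-line match.
count_nonempty() {
    grep -c -v '^[[:space:]]*$' -- "$1"
}

# Print every given file that has fewer than $1 non-blank lines.
# Hypothetical helper: the name and argument order are assumptions.
list_short_files() {
    local min=$1 f
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip anything that is not a regular file
        if [ "$(count_nonempty "$f")" -lt "$min" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running list_short_files 2000 *.txt prints the candidates without touching them, so the list can be checked before anything is deleted.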

Upvotes: 0

anubhava

Reputation: 785601

You can use this awk to count non-empty lines:

awk 'NF{i++} END { print i+0 }' Nameofpop_chr1_window1.txt

Or use this awk to count only the lines that consist entirely of digits:

awk '/^[[:digit:]]+$/ {i++} END { print i+0 }' Nameofpop_chr1_window1.txt

To delete all files with fewer than 2000 numeric lines, use this loop (adjust the glob to match your file names):

for f in *.txt; do
    [[ -n $(awk '/^[[:digit:]]+$/{i++} END {if (i<2000) print FILENAME}' "$f") ]] && rm -- "$f"
done
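With a very large number of files, starting one awk process per file can be slow. A possible variant, sketched here under assumptions, counts the digit-only lines of all files in a single awk run and only prints the names that fall short. The function name short_numeric_files is invented for this example, [0-9] stands in for [[:digit:]], and a zero-byte file is never seen by awk, so it is not reported:

```shell
# Print the names of the given files that have fewer than $1 digit-only lines.
# Hypothetical helper; the output order is unspecified (awk's for-in order).
short_numeric_files() {
    local min=$1
    shift
    awk -v min="$min" '
        FNR == 1    { seen[FILENAME] = 1 }    # register each file that has at least one line
        /^[0-9]+$/  { count[FILENAME]++ }     # same idea as the [[:digit:]] pattern
        END {
            for (f in seen)
                if (count[f] + 0 < min)       # + 0 turns a missing count into 0
                    print f
        }
    ' "$@"
}
```

Calling short_numeric_files 2000 *.txt prints the candidates without deleting anything, so the list can be reviewed before it is passed to rm.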

Upvotes: 4
