Reputation: 883
I have a file that looks like this (note: A*, B*, C* are placeholders). The file is delimited by ;
AAAA;BBBB;CCCCCCCC;DD;EEEEEEEE;FF;
AAA1;BBBBB;CCCC;DD;EEEEEEEE;FFFFF;
AAA3;BB;CCCC;DDDDDDDDD;EEEEEEE;FF;
I'm trying to write a small script that counts the number of occurrences of the delimiter ;
and, if the count is less than or greater than 5, outputs that line to a text file.
delim=";"
while read line
do
n_of_occ=$(grep -o "$delim" <<< "$line" | wc -l)
if [[ $n_of_occ < 5 ]] || [[ $n_of_occ > 5 ]]
then
echo $line >> outfile
fi
done
For some reason, this doesn't seem to work and my output is garbled. Could someone assist or provide a different way to tackle this? Perhaps with Perl instead of bash?
Upvotes: 2
Views: 2075
Reputation: 126722
Unfortunately, every line in your sample data has six semicolons, which means they should all be printed. However, here is a one-line Perl solution:
$ perl -ne'print if tr/;// != 5' aaa.csv
AAAA;BBBB;CCCCCCCC;DD;EEEEEEEE;FF;
AAA1;BBBBB;CCCC;DD;EEEEEEEE;FFFFF;
AAA3;BB;CCCC;DDDDDDDDD;EEEEEEE;FF;
Upvotes: 1
Reputation: 22428
With sed you can do this:
sed '/^\([^;]*;\)\{5\}$/d' file > outfile
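It deletes the lines with exactly 5 semicolons (;) and sends the rest to outfile. Note that the pattern only matches lines that end with a semicolon, as the lines in your sample do.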
To fix your own script, replace done with done <file, [[ with ((, and ]] with )); i.e. use ((...)) instead of [[...]].
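Putting those replacements together, a minimal sketch of the corrected loop could look like the following. It assumes bash (for <<< and ((...))). The function name filter_lines, the IFS= read -r form, and the quoted printf are additions for illustration, not part of the original script. The reason for ((...)) is that < and > inside [[...]] compare strings, which happens to agree with numeric order for single-digit counts but not in general (for example, the string 10 sorts before 5).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line of file "$1" whose number of ";"
# characters is not exactly 5.
filter_lines() {
    local delim=";" line n_of_occ
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        # grep -o prints each match on its own line; wc -l counts them.
        n_of_occ=$(grep -o "$delim" <<< "$line" | wc -l)
        # (( ... )) performs a numeric comparison.
        if (( n_of_occ != 5 )); then
            # Quoting "$line" avoids word splitting and globbing.
            printf '%s\n' "$line"
        fi
    done < "$1"   # the loop reads from the file instead of the terminal
}
```

It could then be called as filter_lines file > outfile.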
Upvotes: 1
Reputation: 241898
Easy in Perl:
perl -ne 'print if tr/;// != 5' input_file > output_file
-n reads the input line by line.
The tr operator returns the number of matches.
Upvotes: 1
Reputation: 14955
This is ridiculously easy with awk:
awk -F\; 'NF!=6' file > outfile
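The test is against 6 rather than 5 because awk counts fields, not delimiters: a line with five semicolons splits into six fields (the last one empty when the line ends in ;). As a rough way to check this on your own data, the sketch below prints the field count and the implied semicolon count in front of each line. The function name show_counts is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each input line, print NF (number of fields when
# splitting on ";"), NF - 1 (the number of semicolons), and the line itself.
show_counts() {
    awk -F';' '{ print NF, NF - 1, $0 }' "$@"
}
```

For instance, show_counts file prefixes a line such as a;b;c;d;e; with 6 5, so only lines whose first number is not 6 would be written by the command above.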
Upvotes: 3