Reputation: 32276
I need to check the following data and report the number of rows that do not match a given criterion.
set 582:1960:4c31ed7dea 2012-03-10~23:55:00\r\n
set 565:388:13c10fd316 2012-03-10~23:55:00\r\n
set 519:348:361189d4b9 extra_text 2012-03-10~23:55:00\r\n
set 498:5634:6047172ecc 2012-03-10~23:55:00\r\n
set 565:0:bf7a80ee4f 2012-03-10~23:55:00
1) All lines should start with the word "set" and end with "\r\n".
2) All lines should have exactly 3 fields delimited by spaces.
In the example data, it should report the invalid row count (2) and preferably print the offending lines themselves. The third line has an extra word and the fifth line does not end correctly.
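For illustration, the desired output on the sample data might look something like this (the exact format is flexible):
set 519:348:361189d4b9 extra_text 2012-03-10~23:55:00
set 565:0:bf7a80ee4f 2012-03-10~23:55:00
Invalid rows: 2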
Upvotes: 0
Views: 55
Reputation: 56129
awk is good for this. A fairly full-featured script:
#!/usr/bin/awk -f
# Flag lines that do not have exactly 3 fields or do not end in a carriage return.
BEGIN { ends = fields = total = 0 }
NF != 3 || !/\r$/ {
    total++
    if (NF != 3) fields++
    if (!/\r$/) ends++
    print
}
END {
    printf "Wrong number of fields: %d\n", fields
    printf "Did not end in a CR: %d\n", ends
    printf "Total: %d\n", total
}
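A sketch of how it might be run, assuming the script above is saved as check.awk and the sample data is in a CRLF-terminated file called data.txt (both names are illustrative):
chmod +x check.awk
./check.awk data.txt
On the sample data that should report something like the following: the third line fails the field check, the fifth fails the CR check, and both are printed.
set 519:348:361189d4b9 extra_text 2012-03-10~23:55:00
set 565:0:bf7a80ee4f 2012-03-10~23:55:00
Wrong number of fields: 1
Did not end in a CR: 1
Total: 2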
Short one-liner, only prints offending lines:
awk 'NF != 3 || !/\r$/' file
Counts offending lines and prints the total:
awk 'NF != 3 || !/\r$/ {total++} END {print "Total: " total}' file
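Run against the sample data (assuming it is saved as file), that would print:
Total: 2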
Upvotes: 1
Reputation: 183582
To print the invalid rows (grep strips the trailing newline before matching, so the pattern only needs to check for the carriage return; the shell's $'…' quoting in bash/ksh/zsh embeds a literal CR):
grep -v $'^set [^ ][^ ]* [^ ][^ ]*\r$' FILENAME
To print the number of invalid rows:
grep -cv $'^set [^ ][^ ]* [^ ][^ ]*\r$' FILENAME
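On the sample data, and assuming the file really uses CRLF line endings (as the awk answer also assumes), the counting command would print:
2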
Upvotes: 1