Reputation: 232
I would like to make a loop that takes 10 lines of my input file at a time and writes the result to an output file, appending to the output file rather than overwriting it.
This is some sample data:
FilePath Filename Probability ClassifierID HectorFileType LibmagicFileType
/mnt/Hector/Data/benign/binary/benign-pete/ 01d0cd964020a1f498c601f9801742c1 19 S040PDFv02 data.pdf PDF document
/mnt/Hector/Data/benign/binary/benign-pete/ 0299a1771587043b232f760cbedbb5b7 0 S040PDFv02 data.pdf PDF document
I then use this to count each unique file and show how many of each file there is with:
cut -f 4 input.txt|sort| uniq -c | awk '{print $2, $1}' | sed 1d
So ultimately I just need help making a loop that can run that line of bash and output 10 lines of data at a time to an output file.
Upvotes: 0
Views: 186
Reputation: 40773
If I understand correctly, for every block of 10 lines, you are trying to skip the first line of the block and count how many times each value of field $4 occurs.
Here is an AWK script which will do it:
FNR % 10 != 1 {
    ++count[$4]
}
FNR % 10 == 0 {
    for (i in count) {
        print i, count[i]
        delete count[i]
    }
}
The FNR % 10 != 1 block processes every line except lines 1, 11, 21, ..., i.e. the lines you want to skip. This block keeps a count of field $4.
The FNR % 10 == 0 block prints out a summary for that block and resets (via delete) the count.
FNR % 10 == 0 with END.
Upvotes: 1
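As written, the AWK script in the answer prints a summary only when the line number is a multiple of 10, so a final block with fewer than 10 lines would never be reported. The sketch below is one possible way to cover that case with an END rule and to append to the output file with >>, as the question asks. It is not the answerer's own code: the file names sample_input.txt and sample_output.txt, the helper function report, and the sample rows (including the classifier ID S041DOCv01) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: file names and sample rows below are hypothetical.
input=sample_input.txt
output=sample_output.txt

# 12 tab-separated lines: the first 10 form a full block, the last 2 a partial one.
# As in the answer, the first line of each block (lines 1, 11, ...) is skipped.
{
  printf 'FilePath\tFilename\tProbability\tClassifierID\n'
  i=1
  while [ "$i" -le 9 ]; do
    printf '/tmp/\tfile%s\t0\tS040PDFv02\n' "$i"
    i=$((i + 1))
  done
  printf '/tmp/\tfile10\t0\tS040PDFv02\n'   # line 11: skipped (first line of block 2)
  printf '/tmp/\tfile11\t0\tS041DOCv01\n'   # line 12: counted
} > "$input"

: > "$output"   # start from an empty output file for this demo only

awk '
function report(   k) {          # print and reset the counts for one block
    for (k in count) {
        print k, count[k]
        delete count[k]
    }
}
FNR % 10 != 1 { ++count[$4] }    # skip lines 1, 11, 21, ...
FNR % 10 == 0 { report() }       # end of a full 10-line block
END           { report() }       # leftover lines of a final partial block
' "$input" >> "$output"          # >> appends instead of overwriting
```

If the input length is an exact multiple of 10, the count array is already empty when END runs, so nothing extra is printed. Within a block that contains several distinct values, the order of the summary lines is whatever order awk iterates the array in, so pipe through sort if a stable order matters.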