ChRapO

Reputation: 337

Get first N occurrences of unique lines, not only one

I have a file whose rows contain two fields separated by whitespace:

fieldA fieldX
fieldB fieldX
fieldC fieldX
fieldD fieldX
fieldE fieldX
fieldA fieldY
fieldB fieldY
fieldC fieldY

I need to get the first N rows for each value in the second column. What I do is sort -k2 | uniq -f1 --all-repeated=prepend | grep "^$" -A3, which should work, but uniq -f1 gives me something different from uniq -f1 --all-repeated=prepend. Do I understand correctly that prepend should only add an empty line before each unique chunk?

Or is there a better approach?

Thanks

Upvotes: 0

Views: 362

Answers (2)

dogbane

Reputation: 274640

No, you're not quite right about prepend.

prepend tells uniq to print a blank line before each group of duplicates. Remember that by adding the --all-repeated option you're telling uniq to print only lines that have duplicates, i.e. those that occur more than once. It will not print lines that occur exactly once, whereas plain uniq -f1 prints one line per group, including groups of size one.

For example, if you add another line to your file, say, fieldA fieldZ, it will not be output if you have the --all-repeated option because it only occurs in the file once.
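
To see the difference concretely, here is a minimal sketch that runs both commands on the question's sample rows plus the extra fieldA fieldZ line. It assumes GNU coreutils (--all-repeated is a GNU extension); the variable names input, plain, and grouped are only illustrative.

```shell
# Sample rows from the question plus the extra "fieldA fieldZ" line.
input='fieldA fieldX
fieldB fieldX
fieldC fieldX
fieldD fieldX
fieldE fieldX
fieldA fieldY
fieldB fieldY
fieldC fieldY
fieldA fieldZ'

# One line per distinct second field, including values that occur only once.
plain=$(printf '%s\n' "$input" | sort -k2 | uniq -f1)

# Only lines whose second field repeats, with a blank line before each group.
# Assumption: GNU uniq is installed (--all-repeated is not in POSIX).
grouped=$(printf '%s\n' "$input" | sort -k2 | uniq -f1 --all-repeated=prepend)

printf 'uniq -f1:\n%s\n' "$plain"
printf 'uniq -f1 --all-repeated=prepend:\n%s\n' "$grouped"
```

The first command keeps one fieldZ row; the second drops fieldZ entirely because that value never repeats.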

Upvotes: 1

twalberg

Reputation: 62389

Here's one idea using awk (replace <N> with the number of rows you want per value of the second field):

awk -v maxlines=<N> ' ++count[$2] <= maxlines { print } '

That will not require sorting the file (but you could still sort it first if there are other reasons you want to...).
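
As a rough illustration, here is the same awk approach applied to the sample rows from the question, assuming N=2. The variable names input and result are hypothetical, and the data is fed through a pipe only to keep the sketch self-contained; a file name argument works just as well.

```shell
# Sample rows from the question.
input='fieldA fieldX
fieldB fieldX
fieldC fieldX
fieldD fieldX
fieldE fieldX
fieldA fieldY
fieldB fieldY
fieldC fieldY'

# count[$2] tracks how many rows have been seen for each second-field value;
# a row is printed only while that count is still <= maxlines.
# maxlines=2 is an assumed value for N.
result=$(printf '%s\n' "$input" | awk -v maxlines=2 '++count[$2] <= maxlines { print }')

printf '%s\n' "$result"
```

Rows come out in their original input order, so no sort step is needed.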

Upvotes: 1
