C.A.R.

Reputation: 187

SED to spit out nth and (n+1)th lines

EDITS: For reference, "stuff" is a general variable, as is "KEEP". KEEP could be "Hi, my name is Dave" on line 2 and "I love pie" on line 7. The numbers I've put here are for illustration only and DO NOT show up in the data.

I had a file that needed to be parsed, keeping every 4th line, starting at the 3rd line. In other words, it looked like this:

1 stuff
2 stuff
3 KEEP
4 
5 stuff
6 stuff
7 KEEP
8 stuff etc...

Great, sed solved that easily with:

sed -n -e 3~4p myfile

giving me

3 KEEP
7 KEEP
11 KEEP

Now I have a different file format and a different take on the pattern:

1 stuff
2 KEEP
3 KEEP
4
5 stuff
6 KEEP
7 KEEP etc...

and I still want the output of

2 KEEP
3 KEEP
6 KEEP
7 KEEP
10 KEEP
11 KEEP

Here's the problem - this is a multi-pattern "pattern" for sed. It's "every 4th line, spit out 2 lines, but start at line 2".

Do I need to have some sort of DO/FOR loop in my sed, or do I need a different command like awk or grep? Thus far, I have tried formats like:

sed -n -e '3~4p;4~4p' myfile

and

awk 'NR % 3 == 0 || NR % 4 ==0' myfile

and

sed -n -e '3~1p;4~4p' myfile

and

awk 'NR % 1 == 0 || NR % 4 ==0' myfile

source: https://superuser.com/questions/396536/how-to-keep-only-every-nth-line-of-a-file

Upvotes: 0

Views: 353

Answers (5)

potong

Reputation: 58371

This might work for you (GNU sed):

sed '2~4,+1p;d' file

Use a range. The first address gives the starting line and the step (here, line 2 and every 4th line after it). The second address says how many lines after the start of the range to include (here, plus one). Print these lines and delete all others.
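
As a quick check of the range form, here is a minimal sketch that assumes GNU sed (the `first~step` and `addr,+N` addresses are GNU extensions) and uses `seq` to stand in for the real file. The function name `keep_pairs` is only an illustrative wrapper, not part of the answer.

```shell
# keep_pairs: hypothetical wrapper around the sed command from this answer.
# Keeps lines 2-3, 6-7, 10-11, ... (start at line 2, repeat every 4 lines,
# take one extra line after each start); every other line is deleted.
keep_pairs() {
    sed '2~4,+1p;d'
}

# 12 numbered lines in place of the real file.
# Prints 2 3 6 7 10 11, one per line.
seq 12 | keep_pairs
```

Changing `+1` to `+2` would keep three lines per block instead of two.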

Upvotes: 1

kvantour

Reputation: 26471

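In the generic case, you want to keep lines p to p+q, then p+n to p+q+n, then p+2n to p+q+2n, and so on. With p, n and q passed in as awk variables (here the values for this question), you can write:

awk -v p=2 -v n=4 -v q=1 'NR >= p && (NR - p) % n <= q' file

The NR >= p test skips the lines before p, where the remainder would be negative and would otherwise pass the <= q comparison.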
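
The generic form can be sketched as a small shell function. This is an illustration under assumptions: the name `keep_blocks` and its argument order are made up here, and the `NR >= p` guard is an addition, included because awk's `%` yields a negative remainder for lines before `p` in common implementations, which would otherwise satisfy `<= q`.

```shell
# keep_blocks P N Q (hypothetical helper): print lines P..P+Q,
# then P+N..P+N+Q, then P+2N..P+2N+Q, and so on.
keep_blocks() {
    # NR >= p excludes lines before the first block, where (NR - p) % n
    # is negative and would wrongly pass the <= q comparison.
    awk -v p="$1" -v n="$2" -v q="$3" 'NR >= p && (NR - p) % n <= q'
}

# The pattern from the question: start at line 2, period 4, one extra line.
# Prints 2 3 6 7 10 11, one per line.
seq 12 | keep_blocks 2 4 1
```

The earlier "every 4th line starting at line 3" case is the same call with `keep_blocks 3 4 0`.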

Upvotes: 0

karakfa

Reputation: 67467

This is the idiomatic way to write it in awk:

$ awk 'NR%4==2 || NR%4==3' file

However, because the wanted remainders (2 and 3) are the only ones greater than 1, this special case can be shortened to:

$ awk 'NR%4>1' file
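
To see why the shortcut is equivalent, a small sketch can print each line number next to its remainder modulo 4. The function name `show_remainders` and the keep/drop labels are purely illustrative.

```shell
# show_remainders (hypothetical helper): for each input line, print the
# line number, NR % 4, and whether the test NR % 4 > 1 would keep it.
show_remainders() {
    awk '{ printf "%d %d %s\n", NR, NR % 4, (NR % 4 > 1 ? "keep" : "drop") }'
}

seq 8 | show_remainders
```

Only remainders 2 and 3 exceed 1, so `NR%4>1` selects the same lines as `NR%4==2 || NR%4==3`.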

Upvotes: 1

dawg

Reputation: 103744

If your intent is to print lines 2 and 3, and then the same two positions in every following group of four lines, you can do:

$ seq 20 | awk 'BEGIN{e[2];e[3]} (NR%4) in e'
2
3
6
7
10
11
14
15
18
19

Upvotes: 3

PesaThe

Reputation: 7499

You were pretty close with your sed:

$ printf '%s\n' {1..12} | sed -n '2~4p;3~4p'
2
3
6
7
10
11

Upvotes: 1
