Julie L

Reputation: 771

Use awk to remove lines with strings stored in list

I'm trying to figure out how to store a list as a variable (array?) and use it with awk.

I have a file like such:

Jimmy
May31
John
June19
Paul
Aug15
Mark
Sept1
David
Nov15

I want to use awk to search my file and remove three names and the line following each of those names. So the final file should only contain 2 names (and birthdays).

I can do this with:

awk '/Jimmy|Mark|David/{n=2}; n {n--; next}; 1' < file

But is there a way to store the "Jimmy|Mark|David" list in the above command as a variable/array and do the same thing? (The real project I'm working on has a much longer list to match in a much bigger file.)

Thanks!

Upvotes: 0

Views: 2230

Answers (3)

randomir

Reputation: 18687

You can do it with the -v/--assign option:

awk -v pat='Jimmy|Mark|David' '$0~pat {n=2}; n {n--; next}; 1' birthdays

and then apply the regex match explicitly with the ~ operator against the complete line ($0).

Alternatively, if you have a long list of names to filter out stored in a file, grep with -f would probably be a much faster option. For example:

$ cat names
Jimmy
Mark
David

$ paste - - <birthdays | grep -vFf names | tr '\t' '\n'
John
June19
Paul
Aug15
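
If you would rather stay in awk and use an actual array, as the question asks, a minimal sketch is to read the names file first and store each line as an array key. This assumes the same names and birthdays files as above and matches whole lines exactly rather than by regex, so it is a variation on the commands above, not a replacement for them. The temporary directory and the array name skip are only there for illustration.

```shell
# Sample data (same names and dates as in the question), written to a temp dir.
tmp=$(mktemp -d)
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"
printf '%s\n' Jimmy Mark David > "$tmp/names"

# First file (names): NR==FNR is true, so store every line as a key of skip[].
# Second file (birthdays): if the whole line is a stored key, drop it and the next line.
result=$(awk 'NR==FNR { skip[$0]; next }
              $0 in skip { n=2 }
              n { n--; next }
              1' "$tmp/names" "$tmp/birthdays")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because the lookup is an exact key test, a name such as Mark would not also remove a line like Markus, which the regex version would. Note that the NR==FNR idiom assumes the names file is not empty.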

Upvotes: 2

MrE

Reputation: 20778

Seems like it would be easier to do this:

Paste 2 lines together: cat file | paste - -

then use awk to do what you need to do

$ cat list.txt | paste - -
Jimmy   May31
John    June19
Paul    Aug15
Mark    Sept1
David   Nov15
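
One way the "then use awk" step could look is sketched below. It assumes the data is in list.txt as above and that the three names to drop are the ones from the question; the variable name pat and the temporary directory are just for illustration.

```shell
# Sample data from the question, written to a temp dir.
tmp=$(mktemp -d)
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/list.txt"

# 1. paste - -   : join each name with the following date (tab-separated)
# 2. awk         : keep only records whose first field does not match the pattern
# 3. tr          : split the surviving records back into two lines each
kept=$(paste - - < "$tmp/list.txt" |
  awk -F '\t' -v pat='^(Jimmy|Mark|David)$' '$1 !~ pat' |
  tr '\t' '\n')

printf '%s\n' "$kept"
rm -rf "$tmp"
```

Working on one record per person means there is no counter to maintain: the name and its date are kept or dropped together.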

Upvotes: 0

MrE

Reputation: 20778

You can get the list in a variable like this:

LIST=$(cat list.txt | tr "\n" "|")

and then use @randomir's answer (quote the variable so the shell passes it to awk unchanged):

awk -v pat="$LIST" '$0~pat {n=2}; n {n--; next}; 1' birthdays

if I put your list:

Jimmy
John
Paul
Mark
David

into the file list.txt

LIST=$(cat list.txt | tr "\n" "|")

will set LIST to

Jimmy|John|Paul|Mark|David

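provided there is no line break at the end of the last line. If there is one, tr turns it into a trailing |, and the resulting empty alternative can make the pattern match every line.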
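
To avoid depending on how the file ends, one option is paste -s, which joins the lines with a delimiter and does not append one after the last line. The sketch below assumes list.txt holds only the three names to remove (not all five) and also anchors the pattern with ^( ... )$ so that only whole-line matches count; both choices are assumptions on top of the answer above.

```shell
# Sample data from the question, written to a temp dir.
tmp=$(mktemp -d)
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"
printf '%s\n' Jimmy Mark David > "$tmp/list.txt"   # ends with a newline on purpose

# paste -s joins all lines of the file; -d'|' puts "|" between them, none at the end.
LIST=$(paste -sd'|' "$tmp/list.txt")
printf '%s\n' "$LIST"

# Quote the variable and anchor the alternation to the whole line.
filtered=$(awk -v pat="^($LIST)\$" '$0 ~ pat {n=2} n {n--; next} 1' "$tmp/birthdays")
printf '%s\n' "$filtered"
rm -rf "$tmp"
```

Names containing regex metacharacters (for example a dot) would still be interpreted as regex syntax here, so an exact-match approach may be safer for arbitrary lists.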

Upvotes: 0
