Dorian

Reputation: 159

Show lines which do not contain a specific string on Linux

I have a text file on my Linux server with the following content:

  ID              DATA
MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;
MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;

Now I need to extract, from each line, the entries that are not "PUM-DT_MAX_1234" and save them in another file together with the ID.

Like this:

MF00034657,M-DT_MAX_1
MF00056578,UM-DT_MAX_123

I use:

grep -v 'PUM-DT_MAX_1234' file > file.out
awk '!/PUM-DT_MAX_1234/' file > file.out

But it doesn’t work.

How can I fix it?

Upvotes: 10

Views: 40226

Answers (5)

voiger

Reputation: 910

Use:

awk '$0 !~ /your_pattern/'

As found in the (probably) greatest AWK documentation.
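
A minimal sketch of how this behaves, using made-up sample lines (the sample helper and its contents are assumptions for illustration, not the asker's file). A bare negated regex is tested against $0 as well, so both commands should print the same lines:

```shell
# Hypothetical input: four short lines, two of which contain "PUM".
sample() {
  printf '%s\n' 'alpha PUM' 'beta' 'gamma PUM' 'delta'
}

# Explicit form: print records whose whole text does not match the regex.
sample | awk '$0 !~ /PUM/'

# Shorthand form: a negated regex on its own is matched against $0 too.
sample | awk '!/PUM/'
```

Both forms keep or drop whole lines. In the question's file every data line contains PUM-DT_MAX_1234, so a whole-line filter like this leaves only the header.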

Upvotes: 39

NeronLeVelu

Reputation: 10039

sed '1b
h;s/.*DRogan^D//;s/PUM-DT_MAX_1234;\{0,1\}//g;s/;$//;/./!d
H;g;s/,.*\n/,/' YourFile
  • based on your sample

Concept:

  • keep a copy of the line
  • remove the head and every "PUM" entry from the line, then check whether anything remains
  • restore the head (the ID, taken from the buffered copy) and join it to the reduced line
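
The steps above can be followed more easily with one sed command per line. This is only a sketch: the sample is a shortened copy of the question's data, "^D" is assumed to be a literal caret followed by D, and sample and filter_sed are hypothetical helper names.

```shell
# Shortened copy of the sample from the question; "^D" is assumed to be a
# literal caret followed by D, as it appears in the post.
sample() {
  printf '%s\n' \
    '  ID              DATA' \
    'MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;' \
    'MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;'
}

# filter_sed is a hypothetical wrapper; the sed commands are the same ones,
# split one per line so each can carry a comment.
filter_sed() {
  sed '
# Line 1 is the header: branch to the end of the script and print it unchanged.
1b
# Keep a copy of the whole line in the hold space.
h
# Remove everything up to and including "DRogan^D".
s/.*DRogan^D//
# Remove every PUM entry together with its optional trailing semicolon.
s/PUM-DT_MAX_1234;\{0,1\}//g
# Remove the final semicolon; if nothing is left, drop the line.
s/;$//
/./!d
# Append the reduced text to the saved copy, load both, and keep only the ID.
H
g
s/,.*\n/,/
'
}

sample | filter_sed
```

Note that the header line is still printed, because the first command only skips the processing for line 1; it does not delete it.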

Upvotes: 1

Tensibai

Reputation: 15784

If you wish to remove any field containing "PUM-DT_MAX_1234" then you have to iterate over each field in your line:

awk -F "[;,]" -v OFS="," 'NR==1 { next; }; { for (i=1;i<=NF;i++) { if(!match($i,/.*PUM-DT_MAX_1234.*/) && length($i) > 0) { if (i==1) r=$i;  else r = r OFS $i }}; print r }' filter.txt

In a more readable view with comments:

  • -F "[;,]" Set the field separator to be ; or ,
  • -v OFS="," Set the output separator to be ,
  • 'NR==1 { next; }; ' Start of the AWK script. This rule skips the header of your file (if the record number is 1, stop and go to the next line)
  • { for (i=1;i<=NF;i++) { Iterate over the fields (NF is the number of fields)
  • if(!match($i,/.*PUM-DT_MAX_1234.*/) && length($i) > 0) { If the field is not empty and does not match the text
  • if (i==1) r=$i; else r = r OFS $i concatenate the field to previous one (or just set it to the first field to avoid a leading , in the output)
  • print r }' Once the loop ends, print the result of the previous concatenation, and end the AWK script with ' for the shell
  • filter.txt Last argument is the file name.

OFS is the Output Field Separator, so you can change it by changing the variable on the command line.

Output from your example:

MF00034657,M-DT_MAX_1
MF00056578,UM-DT_MAX_123
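
For readability, the same idea can be laid out as a multi-line script. This is a sketch, not a drop-in replacement: sample and strip_fields are hypothetical helper names, the string is passed in as an awk variable called pat, and the regex match() is swapped for a plain substring test with index(), which is enough here because the text contains no regex metacharacters.

```shell
# The two data lines from the question, plus the header.
sample() {
  printf '%s\n' \
    '  ID              DATA' \
    'MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;' \
    'MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;'
}

# Reads lines on stdin, splits them on ";" or ",", and prints the fields
# that are non-empty and do not contain the string held in pat.
strip_fields() {
  awk -F '[;,]' -v OFS=',' -v pat='PUM-DT_MAX_1234' '
    NR == 1 { next }                  # skip the header line
    {
      r = ""                          # start each record with an empty result
      for (i = 1; i <= NF; i++) {
        # keep the field only if it is non-empty and pat is not a substring
        if ($i != "" && index($i, pat) == 0)
          r = (r == "" ? $i : r OFS $i)
      }
      print r
    }
  '
}

sample | strip_fields
```

The second field (12435464^DRogan^DPUM-DT_MAX_1234) is dropped only because it happens to contain the string as well, which is why the output starts with the ID followed directly by the remaining value.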

Upvotes: 3

Syed Faraz Umar

Reputation: 438

In silgon's answer, the command worked after I removed the gap in '! /.mp4/'

  • I wanted to remove "none" images from 'docker images' output, using AWK:

docker images | awk '!/\<none>/'

  • I wanted to print only the name and tag from the 'docker images' output, i.e., columns 1 and 2, while still excluding the "none" images, using AWK:

docker images | awk '!/\<none>/' | awk '{print $1,$2}'
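
The filter and the print can also be combined in a single awk call. The sketch below uses a hypothetical fake_images helper with made-up rows in place of real 'docker images' output, so it runs without Docker. It also writes the pattern as /<none>/, since < needs no escaping in an awk regex, and in GNU awk \< is a word-boundary operator rather than a literal <.

```shell
# Hypothetical stand-in for `docker images`; the rows are invented.
fake_images() {
  printf '%s\n' \
    'REPOSITORY   TAG       IMAGE ID       CREATED       SIZE' \
    'ubuntu       22.04     aaaaaaaaaaaa   2 weeks ago   77MB' \
    '<none>       <none>    bbbbbbbbbbbb   3 weeks ago   120MB' \
    'nginx        latest    cccccccccccc   4 weeks ago   187MB'
}

# Skip lines containing "<none>" and print columns 1 and 2 of the rest.
fake_images | awk '!/<none>/ { print $1, $2 }'
```

With real data the pipeline would be docker images | awk '!/<none>/ { print $1, $2 }'.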

Upvotes: 0

silgon

Reputation: 7191

I'll use the ls command as an analogy for your problem (because it is easy to reproduce). Say you want to list all files that are not mp4 files; you can do the following:

ls | awk '! /\.mp4/'

To go further, you might be looking for files that do not have the mp4 extension but do contain a specific string, e.g. abc:

ls | awk '! /\.mp4/ &&  /abc/'

This should be analogous and applicable to your purposes (or at least not hard to adapt).
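
To try both conditions without depending on the contents of a real directory, here is a small sketch that feeds awk a fixed list of made-up file names (the names helper and the names themselves are assumptions for illustration):

```shell
# Hypothetical file names standing in for the output of ls.
names() {
  printf '%s\n' 'abc.mp4' 'abc.txt' 'notes.txt' 'abc_clip.mkv'
}

# Lines that do not contain ".mp4".
names | awk '! /\.mp4/'

# Lines that do not contain ".mp4" and do contain "abc".
names | awk '! /\.mp4/ && /abc/'
```

The first command should print abc.txt, notes.txt and abc_clip.mkv; the second only abc.txt and abc_clip.mkv.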

Upvotes: 4
