user3628777

Reputation: 529

Linux awk with condition

I have a very large file (2.5M records) with 2 columns separated by |. I would like to select all records that do not contain the value "-1" in the second column and write them to a new file.

I tried to use:

grep -v "-1" norm_cats_21_07_assignments.psv > norm_cats_21_07_assignments.psv

but no luck.

Upvotes: 0

Views: 134

Answers (3)

konsolebox

Reputation: 75478

You can have:

awk -F'|' '$2 != "-1"' file.psv > new_file.psv

Or

awk -F'|' '$2 !~ /-1/' file.psv > new_file.psv
  • != compares against the whole column, while !~ only needs -1 to appear somewhere in it.
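To see the difference between the two, here is a small sketch on made-up sample data (the file name sample.psv and its rows are assumptions for illustration only, not taken from the question):

```shell
# Hypothetical sample data in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'a|-1' 'b|-10' 'c|5' 'd|x-1' > "$tmpdir/sample.psv"

# Exact comparison: drops only rows whose second column is exactly -1.
exact=$(awk -F'|' '$2 != "-1"' "$tmpdir/sample.psv")

# Regex match: drops every row whose second column contains -1 anywhere.
partial=$(awk -F'|' '$2 !~ /-1/' "$tmpdir/sample.psv")

printf 'exact:\n%s\n' "$exact"
printf 'partial:\n%s\n' "$partial"
rm -rf "$tmpdir"
```

With this data the exact form keeps the -10 and x-1 rows, while the regex form keeps only the row with 5.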

Edit: I just noticed that your input file and output file are the same. You can't do that: the shell truncates the output file before the command even starts reading it, so the input is already empty.

After writing the filtered output to a new file with awk (e.g. new_file.psv), you can save it back with cat new_file.psv > file.psv or mv new_file.psv file.psv.
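A minimal sketch of that write-then-replace step, run on a throwaway copy; the directory, file names and sample rows are assumptions for illustration:

```shell
# Hypothetical sample file in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 'foo|-1' 'foo|bar' 'baz|-1' > "$workdir/file.psv"

# Write the filtered rows to a separate file first; redirecting straight
# back to file.psv would truncate it before awk reads anything.
# The original is replaced only if awk exits successfully.
awk -F'|' '$2 != "-1"' "$workdir/file.psv" > "$workdir/new_file.psv" &&
    mv "$workdir/new_file.psv" "$workdir/file.psv"

result=$(cat "$workdir/file.psv")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Chaining with && means a failed awk run leaves the original file untouched.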

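But if you have exactly 2 columns separated by |, with no spaces in between and no quotes around the values, you can just use in-place editing with sed (the $ anchors the match to the end of the line, so only a second column that is exactly -1 is removed):

sed -i '/|-1$/d' file.psv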

Or perhaps something equivalent to awk -F'|' '$2 !~ /-1/':

sed -i '/|.*-1/d' file.psv

Upvotes: 0

mvp

Reputation: 116108

For a quick and dirty solution, you can simply add | to your grep pattern:

grep -v "|-1" input.psv > output.psv

This assumes that rows to be ignored look like

something|-1

Note that if you ever need to use grep -v "-1", you have to add -- after the options and before the pattern; otherwise grep will treat -1 as an option. Something like this:

grep -v -- "-1"
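As a quick check of this, the sketch below runs the pattern against two made-up rows. The -e form is a standard grep option that this answer does not mention; it is shown only as an alternative way to mark the next argument as a pattern:

```shell
# Hypothetical sample rows.
rows=$(printf '%s\n' 'foo|-1' 'foo|bar')

# -- ends option parsing, so -1 is taken as the pattern.
kept=$(printf '%s\n' "$rows" | grep -v -- "-1")

# -e explicitly introduces a pattern, even one that starts with a dash.
kept_e=$(printf '%s\n' "$rows" | grep -v -e "-1")

printf '%s\n' "$kept"
```

Both forms leave only the foo|bar row.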

Upvotes: 1

Avinash Raj

Reputation: 174696

You could do this with awk:

awk -F"|" '$2~/^-1$/{next}1' file > newfile

Example:

$ cat r
foo|-1
foo|bar
$ awk -F"|" '$2~/^-1$/{next}1' r
foo|bar

Upvotes: 0
