Reputation: 529
I have a very large file (2.5M records) with 2 columns separated by |. I would like to keep all records that do not contain the value "-1" in the second column and write them to a new file.
I tried to use:
grep -v "-1" norm_cats_21_07_assignments.psv > norm_cats_21_07_assignments.psv
but no luck.
Upvotes: 0
Views: 134
Reputation: 75478
You can have:
awk -F'|' '$2 != "-1"' file.psv > new_file.psv
Or
awk -F'|' '$2 !~ /-1/' file.psv > new_file.psv
!= compares against the whole column, while !~ only needs part of it to match.
Edit: Just noticed that your input file and output file are the same. You can't do that: the shell truncates the output file before awk even starts reading it, so awk sees an empty file.
With awk, after creating the new filtered file (e.g. new_file.psv), you can save it back with cat new_file.psv > file.psv or mv new_file.psv file.psv.
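As a minimal sketch of the filter-then-replace approach described above, assuming a made-up three-line file.psv in a scratch directory (the sample rows and the .tmp suffix are hypothetical, chosen only for illustration):

```shell
# Work in a throwaway directory so no real data is touched.
cd "$(mktemp -d)"

# Hypothetical sample data: two columns separated by |.
printf 'a|1\nb|-1\nc|2\n' > file.psv

# Write the filtered rows to a temporary file first, then replace
# the original only if awk exited successfully.
awk -F'|' '$2 != "-1"' file.psv > file.psv.tmp && mv file.psv.tmp file.psv

cat file.psv
# a|1
# c|2
```

Chaining with && means a failed awk run leaves the original file.psv in place.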
But if you have exactly 2 columns separated by |, with no spaces in between, no quotes around the values, etc., you can just use in-place editing with sed:
sed -i '/|-1/d' file.psv
Or perhaps something equivalent to awk -F'|' '$2 !~ /-1/':
sed -i '/|.*-1/d' file.psv
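To make the difference between exact and partial matching concrete, here is a small sketch on a made-up sample.psv (the file name and rows are hypothetical). It runs sed without -i so the sample stays untouched, and it adds a $-anchored sed pattern, which is not part of the answer above, as one possible way to mimic the exact != comparison on strictly two-column input:

```shell
# Work in a throwaway directory so no real data is touched.
cd "$(mktemp -d)"

# Hypothetical sample rows covering exact, prefix, and substring cases.
printf 'a|-1\nb|-10\nc|x-1\nd|7\n' > sample.psv

# Exact comparison: only the row whose second column is exactly -1 is dropped.
awk -F'|' '$2 != "-1"' sample.psv > awk_exact.txt    # b|-10, c|x-1, d|7

# Regex match: every row whose second column contains -1 is dropped.
awk -F'|' '$2 !~ /-1/' sample.psv > awk_partial.txt  # d|7

# sed pattern from the answer: drops rows where -1 directly follows the |.
sed '/|-1/d' sample.psv > sed_prefix.txt             # c|x-1, d|7

# Anchored variant (an assumption, not from the answer): -1 must end the line.
sed '/|-1$/d' sample.psv > sed_exact.txt             # b|-10, c|x-1, d|7
```

On input like this, the unanchored sed pattern also removes b|-10, so the anchored form may be the safer choice when values such as -10 can occur.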
Upvotes: 0
Reputation: 116108
For a quick and dirty solution, you can simply add | to your grep pattern:
grep -v "|-1" input.psv > output.psv
This assumes that rows to be ignored look like
something|-1
Note that if you ever need to use grep -v "-1", you have to add -- after the options; otherwise grep will treat -1 as an option. Something like this:
grep -v -- "-1"
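A small sketch of the -- form on a made-up input.psv (file names and rows are hypothetical). It also shows -e, a standard grep option not mentioned above, as another way to pass a pattern that begins with a dash:

```shell
# Work in a throwaway directory so no real data is touched.
cd "$(mktemp -d)"

# Hypothetical sample data.
printf 'foo|-1\nfoo|bar\n' > input.psv

# "--" marks the end of options, so -1 is read as the pattern.
grep -v -- "-1" input.psv > output.psv

# -e explicitly introduces the pattern, which also avoids the ambiguity.
grep -v -e "-1" input.psv > output_e.psv

cat output.psv
# foo|bar
```

Writing to a different output file matters here too: redirecting back onto input.psv would truncate it before grep reads anything.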
Upvotes: 1
Reputation: 174696
You could do this with awk:
awk -F"|" '$2~/^-1$/{next}1' file > newfile
Example:
$ cat r
foo|-1
foo|bar
$ awk -F"|" '$2~/^-1$/{next}1' r
foo|bar
Upvotes: 0