Reputation: 91
I have a file in the following format:
0.019059000 15150000000
0.037088000 15150000000
0.035007000 15150000001
0.047622000 15150000001
0.053359000 15150000002
0.060405000 15150000002
0.068598000 15150000003
0.081587000 15150000003
I would like to subtract the column 1 values of rows whose column 2 value is the same. For example, for the input file above, I would like to get something like this:
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
All the values in column 2 of the input file come in pairs: for example, 15150000000 appears exactly twice, 15150000001 appears exactly twice, and so on.
Any help is more than welcome!
Upvotes: 1
Views: 201
Reputation: 67507
awk to the rescue! (Without error checking.)
$ awk 'p==$2 {print $1-pv,p} {p=$2; pv=$1}' file
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
For unsorted input, again with exactly two records per key:
$ awk '$2 in a {print $1-a[$2],$2; delete a[$2]; next} {a[$2]=$1}' file
0.018029 15150000000
0.012615 15150000001
0.007046 15150000002
0.012989 15150000003
If the second value is not always larger than the first one and you want the absolute difference:
$ awk 'function abs(x) {return x<0?-x:x}
$2 in a {print abs($1-a[$2]),$2; delete a[$2]; next}
{a[$2]=$1}' file
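Since the one-liners above skip error checking, here is a minimal sketch of the same array-based approach with a basic sanity check added. The function name pair_diff, the fixed %.6f output format, and writing warnings to /dev/stderr are my assumptions, not part of the original answer. It warns about any key left without a partner and exits non-zero in that case.

```shell
# pair_diff (hypothetical name): print |a - b| for the two rows sharing a key
# in column 2, and warn on stderr about keys left without a partner.
pair_diff() {
  awk '
    function abs(x) { return x < 0 ? -x : x }
    $2 in first {                       # second occurrence of this key
      printf "%.6f %s\n", abs($1 - first[$2]), $2
      delete first[$2]                  # pair is complete, forget it
      next
    }
    { first[$2] = $1 }                  # first occurrence: remember column 1
    END {
      for (k in first) {                # anything still stored never got a partner
        printf "unpaired key: %s\n", k > "/dev/stderr"
        bad = 1
      }
      exit bad + 0
    }
  ' "$@"
}
```

Call it as pair_diff file. A key that occurs three times produces one difference plus one warning, because the third record has no partner.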
Upvotes: 4
Reputation: 37404
Another awk solution, which subtracts the smaller value from the bigger one:
$ awk '{
    if($2 in a) {                          # if this $2 has already been seen
        print ((s=$1-a[$2])>0?s:-s),$2     # subtract smaller from bigger
        delete a[$2]                       # delete to save memory
    } else
        a[$2]=$1                           # else store $1 under key $2
}' <(shuf file)                            # shuf file to demo random order
                                           # replace with just the file name
Sample output (the order varies due to shuf randomness):
0.007046 15150000002
0.018029 15150000000
0.012615 15150000001
0.012989 15150000003
Upvotes: 1
Reputation: 246807
How about
awk '{a[$2] = $1 - a[$2]} END {for (b in a) print a[b], b}' file
Ah, I see you have values in pairs. Go with karakfa's answer then.
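For what it is worth, the one-liner above does produce the expected differences when every key occurs exactly twice: the first record stores $1 (because a[$2] starts out empty, which counts as 0), and the second record overwrites it with the second value minus the first. The order of for (b in a) is unspecified in awk, so this sketch pipes the result through sort. The wrapper name running_diff and the sort step are assumptions added for illustration.

```shell
# running_diff (hypothetical name): same awk program as above,
# with the output sorted numerically on the key column for a stable order.
running_diff() {
  awk '{ a[$2] = $1 - a[$2] }                 # 1st row: $1 - 0; 2nd row: $1 - first
       END { for (b in a) print a[b], b }' "$@" |
    sort -k2,2n
}
```

Note that the result is negative when the second value of a pair is smaller than the first, and that a key occurring more than twice gives an alternating sum rather than a simple difference.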
Upvotes: 0