Reputation: 613
I have a file, CASE.dat:
# X Y Z TARGET MY DIST MY DATA
--------------------------------------------------------------------------------
1 16.136051 19.214215 26.195842 0.935901 0.528294 10305.052469
2 19.296614 20.459830 20.711839 4.033354 1.152114 258.468669
3 21.757247 20.010601 21.609096 4.008830 1.117961 208.482335
4 23.340579 20.230572 20.299311 0.962172 0.567720 1648.046276
5 22.232850 19.276643 24.105109 4.028086 1.105535 116.818198
6 20.177439 18.995924 25.744873 4.020979 1.119227 259.240957
7 20.507640 18.422719 27.698151 0.973875 0.578381 4433.058006
8 17.718280 19.441795 24.896309 4.052598 1.117063 399.224573
9 17.274647 20.170761 22.411821 4.049756 1.067280 369.719958
10 15.344147 20.532170 21.791338 0.942252 0.522218 2903.487129
11 16.747362 21.490591 16.828061 4.119692 1.052854 640.628897
12 18.942734 21.191117 18.059497 4.016967 1.013168 370.875172
13 16.713317 22.043861 14.846116 0.952206 0.572128 15824.211118
14 14.917097 21.194983 17.726730 0.996560 0.573948 8439.378683
15 20.697846 21.496657 17.007974 0.931434 0.494488 4811.530560
16 24.891192 18.784856 25.017254 4.004345 1.086042 87.628933
17 24.849590 17.270757 26.442292 0.986123 0.548764 2084.437203
18 26.020588 18.043376 23.429171 0.962405 0.489209 5797.201598
19 29.699839 22.572565 28.810307 4.025628 1.079363 339.526719
20 31.243469 22.179022 30.120360 0.974974 0.569833 5998.952157
21 29.172195 25.093904 28.162412 3.991001 1.124966 301.999963
My aim is to do some processing on column number 5.
I extract it using the script below:
cat CASE.dat | awk '{print $5}' | awk NF | awk 'NR>1'
This gives me:
0.935901
4.033354
4.008830
0.962172
4.028086
4.020979
0.973875
4.052598
4.049756
0.942252
4.119692
4.016967
0.952206
0.996560
0.931434
4.004345
0.986123
0.962405
4.025628
0.974974
3.991001
Now I need advice on how to improve the script above.
Further, I have two types of numbers here: one is ~4 and the other is ~1. I want to add 2.0 to all the numbers that are ~4 and 1.0 to all the numbers that are ~1.
Please suggest a simple answer.
Up to this point, the result should be:
1.935901
6.033354
6.008830
1.962172
6.028086
6.020979
1.973875
6.052598
6.049756
1.942252
6.119692
6.016967
1.952206
1.996560
1.931434
6.004345
1.986123
1.962405
6.025628
1.974974
5.991001
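For illustration, the shifting step described above can be sketched in Python as follows. This is only a minimal sketch: the helper name `shift_by_class` and the `shifts` mapping are hypothetical, and it assumes that "~1" and "~4" mean "rounds to 1" and "rounds to 4".

```python
def shift_by_class(values, shifts=None):
    """Add a class-specific offset to each value, chosen by its nearest integer."""
    if shifts is None:
        # Assumption: values near 1 get +1.0, values near 4 get +2.0.
        shifts = {1: 1.0, 4: 2.0}
    shifted = []
    for value in values:
        nearest = round(value)
        # Values whose nearest integer is not listed are left unchanged.
        shifted.append(value + shifts.get(nearest, 0.0))
    return shifted


sample = [0.935901, 4.033354, 4.008830, 3.991001]
for value in shift_by_class(sample):
    print(f"{value:.6f}")
```

Note that 3.991001 still counts as ~4 here because it rounds to 4, which matches the 5.991001 in the list above.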
Finally, I want to subtract each number that is ~6 from 6 (this number may vary in another file) and each number that is ~2 (originally ~1) from 2 (this number may vary in another file).
The final data should be:
0.064099
-0.033354
-0.00883
0.037828
-0.028086
-0.020979
0.026125
-0.052598
-0.049756
0.057748
-0.119692
-0.016967
0.047794
0.00344
0.068566
-0.004345
0.013877
0.037595
-0.025628
0.025026
0.008999
Upvotes: 0
Views: 72
Reputation: 6090
Here you go:
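# Read the whole file; the first two lines are the header and the separator.
with open("CASE.dat", "r") as msg:
    data = msg.readlines()

for i, line in enumerate(data[2:]):
    if not line.strip():
        continue  # skip blank lines
    row = list(map(float, line.strip().split()))
    # Step 1: add 1 to values near 1 and 2 to values near 4.
    if round(row[4]) == 1:
        val = 1
    elif round(row[4]) == 4:
        val = 2
    else:
        val = 0  # leave any other value unchanged
    row[4] = row[4] + val
    # Step 2: subtract the shifted value from its reference (6 or 2).
    if round(row[4]) == 6:
        row[4] = 6 - row[4]
    elif round(row[4]) == 2:
        row[4] = 2 - row[4]
    data[i+2] = " ".join(map(str, row))

for row in data:
    print(row.rstrip("\n"))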
You get:
# X Y Z TARGET MY DIST MY DATA
--------------------------------------------------------------------------------
1.0 16.136051 19.214215 26.195842 0.06409900000000013 0.528294 10305.052469
2.0 19.296614 20.45983 20.711839 -0.033354000000000106 1.152114 258.468669
3.0 21.757247 20.010601 21.609096 -0.008829999999999671 1.117961 208.482335
4.0 23.340579 20.230572 20.299311 0.03782799999999997 0.56772 1648.046276
5.0 22.23285 19.276643 24.105109 -0.028086000000000055 1.105535 116.818198
6.0 20.177439 18.995924 25.744873 -0.020978999999999637 1.119227 259.240957
7.0 20.50764 18.422719 27.698151 0.026124999999999954 0.578381 4433.058006
8.0 17.71828 19.441795 24.896309 -0.0525979999999997 1.117063 399.224573
9.0 17.274647 20.170761 22.411821 -0.049756000000000355 1.06728 369.719958
10.0 15.344147 20.53217 21.791338 0.05774800000000013 0.522218 2903.487129
11.0 16.747362 21.490591 16.828061 -0.11969199999999969 1.052854 640.628897
12.0 18.942734 21.191117 18.059497 -0.016967000000000176 1.013168 370.875172
13.0 16.713317 22.043861 14.846116 0.047794000000000114 0.572128 15824.211118
14.0 14.917097 21.194983 17.72673 0.0034399999999998876 0.573948 8439.378683
15.0 20.697846 21.496657 17.007974 0.06856600000000013 0.494488 4811.53056
16.0 24.891192 18.784856 25.017254 -0.004344999999999821 1.086042 87.628933
17.0 24.84959 17.270757 26.442292 0.013876999999999917 0.548764 2084.437203
18.0 26.020588 18.043376 23.429171 0.037595000000000045 0.489209 5797.201598
19.0 29.699839 22.572565 28.810307 -0.025628000000000206 1.079363 339.526719
20.0 31.243469 22.179022 30.12036 0.025025999999999993 0.569833 5998.952157
21.0 29.172195 25.093904 28.162412 0.008999000000000201 1.124966 301.999963
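If only the fifth column is needed, and the reference values (6 and 2 here) may differ from file to file, a variation along these lines might help. It is a sketch under stated assumptions, not a drop-in replacement: the function name `target_deviation` and its parameters are hypothetical, and it assumes the reference is always the nearest integer of the shifted value.

```python
def target_deviation(lines, shifts=None, column=4, header_lines=2):
    """Return (reference - shifted value) for one column of a whitespace-separated table."""
    if shifts is None:
        # Assumption: values near 1 get +1.0, values near 4 get +2.0.
        shifts = {1: 1.0, 4: 2.0}
    results = []
    for line in lines[header_lines:]:
        fields = line.split()
        if len(fields) <= column:
            continue  # skip blank or short lines
        value = float(fields[column])
        shifted = value + shifts.get(round(value), 0.0)
        # Assumption: the reference is the nearest integer of the shifted value.
        reference = round(shifted)
        results.append(reference - shifted)
    return results


sample = [
    "# X Y Z TARGET MY DIST MY DATA",
    "-" * 80,
    "1 16.136051 19.214215 26.195842 0.935901 0.528294 10305.052469",
    "2 19.296614 20.459830 20.711839 4.033354 1.152114 258.468669",
    "21 29.172195 25.093904 28.162412 3.991001 1.124966 301.999963",
]
for deviation in target_deviation(sample):
    print(f"{deviation:.6f}")
```

Formatting with `.6f` hides the floating-point noise visible in the output above (for example 0.06409900000000013). For a real file, pass the result of `readlines()` instead of `sample`.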
Upvotes: 0
Reputation: 786021
You may use this awk:
awk -v d='0.5' 'NR <= 2 {next} {n = int($5+d)} n == 4 {$5 += 2} n == 1 {$5 += 1} {n = int($5+d)} n==6 || n==2 {$5 = n - $5} {print $5}' CASE.dat
0.064099
-0.033354
-0.00883
0.037828
-0.028086
-0.020979
0.026125
-0.052598
-0.049756
0.057748
-0.119692
-0.016967
0.047794
0.00344
0.068566
-0.004345
0.013877
0.037595
-0.025628
0.025026
0.008999
A more readable format:
awk -v d='0.5' 'NR <= 2 { next }
{n = int($5+d)}
n == 4 {$5 += 2}
n == 1 {$5 += 1}
{n = int($5+d)}
n == 6 || n == 2 {
$5 = n - $5
}
{print $5}' CASE.dat
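The classification step `int($5+d)` adds an offset and then truncates, so the choice of `d` decides which integer a value is assigned to. The Python sketch below mimics that arithmetic purely for illustration (`classify` is a hypothetical helper, not part of awk). It suggests that an offset of 0.5 rounds to the nearest integer, whereas a small offset such as 0.009 sends a value like 0.935901 to 0 rather than 1.

```python
def classify(value, offset):
    """Mimic awk's int(value + offset): add the offset, then truncate toward zero."""
    return int(value + offset)


# Compare a small offset with a round-to-nearest offset on values from column 5.
for value in (0.935901, 0.996560, 3.991001, 4.119692):
    print(value, classify(value, 0.009), classify(value, 0.5))
```

With the small offset, only values within 0.009 below an integer are pushed up to it, so most of the ~1 values would not match an `n == 1` test.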
Upvotes: 1