Reputation: 138
I have two NumPy arrays, a and b. Column 0 of each array contains time series data, and I want to compare the arrays on that column. I want to create a new NumPy array with the sorted times from both arrays in column 0, followed by the value associated with each time in a and in b. If no value is found for a given time, null should be inserted. Example:
a = np.array([[0.002, 0.998],
[0.004, 0.997],
[0.006, 0.996],
[0.008, 0.995],
[0.010, 0.993]])
b = np.array([[0.002, 0.666],
[0.004, 0.665],
[0.0041, 0.664],
[0.0041,0.663],
[0.0042, 0.664],
[0.0043, 0.664],
[0.0044, 0.663],
[0.0045, 0.663],
[0.005, 0.663],
[0.006, 0.663],
[0.0061, 0.662],
[0.008, 0.661]])
Expected output:
c= [[0.002, 0.998, 0.666],
[0.004, 0.997, 0.665],
[0.0041, null, 0.664],
[0.0041, null ,0.663],
[0.0042, null, 0.664],
[0.0043, null, 0.664],
[0.0044, null, 0.663],
[0.0045, null, 0.663],
[0.005, null, 0.663],
[0.006, 0.996, 0.663],
[0.0061, null, 0.662],
[0.008, 0.995, 0.661],
[0.010, 0.993, null]]
Are there any tricks for doing this?
Upvotes: 0
Views: 241
Reputation: 1130
There might be an easier way, but this approach works (np.nan stands in for null):
import numpy as np

# rows whose time (column 0) appears in both a and b
c1 = [(x, a[a[:,0]==x][0,1], b[i,1]) for i,x in enumerate(b[:,0]) if x in a[:,0]]
# rows of a whose time is not in b
c2 = [(x, a[i,1], np.nan) for i,x in enumerate(a[:,0]) if x not in b[:,0]]
# rows of b whose time is not in a
c3 = [(x, np.nan, b[i,1]) for i,x in enumerate(b[:,0]) if x not in a[:,0]]
# combine and sort by time; a stable sort keeps b's order for repeated times
c4 = np.vstack((c1, c2, c3))
c5 = c4[c4[:,0].argsort(kind="stable")]
print(c5)
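The same idea can probably be written with a dictionary lookup, so that each time in b is not checked against the whole of a. This is only a sketch: the function name merge_on_time is made up for illustration, and it assumes that the times in a are unique and that matching times are exactly equal as floats (no tolerance is applied).

```python
import numpy as np

def merge_on_time(a, b):
    """Outer-join two (time, value) arrays on column 0, using NaN for gaps.

    Hypothetical helper: assumes times in `a` are unique and that
    matching times compare exactly equal.
    """
    # Map each time in a to its value for fast lookup.
    a_lookup = dict(zip(a[:, 0], a[:, 1]))
    # Every row of b, paired with a's value at that time (NaN if absent).
    rows = [(t, a_lookup.get(t, np.nan), v) for t, v in b]
    # Rows of a whose time never occurs in b.
    b_times = set(b[:, 0])
    rows += [(t, v, np.nan) for t, v in a if t not in b_times]
    c = np.array(rows, dtype=float)
    # Stable sort on time keeps the original order of repeated times.
    return c[np.argsort(c[:, 0], kind="stable")]

a = np.array([[0.002, 0.998], [0.004, 0.997], [0.006, 0.996],
              [0.008, 0.995], [0.010, 0.993]])
b = np.array([[0.002, 0.666], [0.004, 0.665], [0.0041, 0.664],
              [0.0041, 0.663], [0.0042, 0.664], [0.0043, 0.664],
              [0.0044, 0.663], [0.0045, 0.663], [0.005, 0.663],
              [0.006, 0.663], [0.0061, 0.662], [0.008, 0.661]])

c = merge_on_time(a, b)
print(c)
```

If the times come from measurements rather than identical literals, exact equality may miss matches, and rounding the time column first (for example with np.round) might be needed.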
Upvotes: 1