Reputation: 305
I have an array of timestamp arrays, where each timestamp array has a different length. For example,
[arr1, arr2, arr3, ...]
arr1 = [0.24, 0.56, 0.77]
arr2 = [0.1, 0.24]
arr3 = [0.6, 0.7, 0.72, 0.88]
This is what the output should look like:
NaN, 0.24, 0.56, NaN, NaN, NaN, 0.77, NaN
0.1, 0.24, NaN, NaN, NaN, NaN, NaN, NaN
NaN, NaN, NaN, 0.6, 0.7, 0.72, NaN, 0.88
How do I merge all of these arrays into a single 2D matrix? Note that each individual array (arr1, arr2, ...)
is very large (tens of thousands of elements).
I feel the pandas merge function could be used, but I don't know how to proceed with it.
Upvotes: 2
Views: 321
Reputation: 765
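import pandas as pd

arr1 = [0.24, 0.56, 0.77]
arr2 = [0.1, 0.24]
arr3 = [0.6, 0.7, 0.72, 0.88]
list2 = [arr1, arr2, arr3]
# sorted unique timestamps across all arrays (the NaN padding is dropped)
ss1 = pd.Series(pd.DataFrame(list2).to_numpy().reshape(-1)).dropna().drop_duplicates().sort_values()
# one row per array: keep a timestamp where the array contains it, NaN elsewhere
pd.Series(list2).apply(lambda x: ss1.where(lambda ss: ss.isin(x)))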
      4     0     1     8     9    10     2    11
0   NaN  0.24  0.56   NaN   NaN   NaN  0.77   NaN
1   0.1  0.24   NaN   NaN   NaN   NaN   NaN   NaN
2   NaN   NaN   NaN   0.6   0.7  0.72   NaN  0.88
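Since the arrays are said to be very large, the same row-per-array layout can also be sketched in plain NumPy. This is only an illustration, not part of the code above: the names columns and matrix are made up here, and it assumes timestamps that should share a column are exactly equal as floats.

```python
import numpy as np

arr1 = [0.24, 0.56, 0.77]
arr2 = [0.1, 0.24]
arr3 = [0.6, 0.7, 0.72, 0.88]
arrs = [arr1, arr2, arr3]

# Sorted union of all timestamps; each unique value gets one column.
columns = np.unique(np.concatenate(arrs))

# Start from an all-NaN matrix with one row per input array.
matrix = np.full((len(arrs), len(columns)), np.nan)

for row, arr in enumerate(arrs):
    # Every value of arr is present in columns, so searchsorted
    # returns the exact column position of each timestamp.
    cols = np.searchsorted(columns, arr)
    matrix[row, cols] = arr
```

Here columns holds the timestamp that each column of matrix stands for, so the two can be wrapped in a DataFrame afterwards if labels are needed.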
Upvotes: -1
Reputation: 150805
Here's an approach with pandas:
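import pandas as pd

arrs = [arr1, arr2, arr3]
# convert each array to a Series indexed by its own timestamps
series = [pd.Series(arr, index=arr) for arr in arrs]
# concat along columns; rows are aligned on the union of the indexes
pd.concat(series, axis=1)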
Output:
         0     1     2
0.10   NaN  0.10   NaN
0.24  0.24  0.24   NaN
0.56  0.56   NaN   NaN
0.60   NaN   NaN  0.60
0.70   NaN   NaN  0.70
0.72   NaN   NaN  0.72
0.77  0.77   NaN   NaN
0.88   NaN   NaN  0.88
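This result has one column per array, while the question asks for one row per array. A minimal sketch of getting that layout is to transpose the result. The names wide and matrix are only illustrative, the explicit sort_index() is added in case the concatenated index does not come back sorted, and it assumes each array has no repeated timestamps (so each Series index is unique).

```python
import pandas as pd

arr1 = [0.24, 0.56, 0.77]
arr2 = [0.1, 0.24]
arr3 = [0.6, 0.7, 0.72, 0.88]
arrs = [arr1, arr2, arr3]

# One Series per array, indexed by its own timestamps.
series = [pd.Series(arr, index=arr) for arr in arrs]

# Align on the union of timestamps, sort them, then transpose so that
# each input array becomes a row and each timestamp a column.
wide = pd.concat(series, axis=1).sort_index().T

# Plain 2D NumPy array, with NaN where an array lacks a timestamp.
matrix = wide.to_numpy()
```

The column labels of wide are the sorted unique timestamps, and matrix is the 2D array described in the question.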
Upvotes: 2