How to create a tensor from sparse data?

Question

I have a dataset where each time point is represented by a set of sparse x and y values. For data storage purposes, if y = 0, that data point is not recorded.

Imagine data point t0:

#Real data
#t0
x0 = [200, 201, 202, 203, 204, 205, 206, 207, ...]
y0 = [5, 10, 0, 7, 0, 0, 15, 20, ...]

#Data stored
#t0
x0 = [200, 201, 203, 206, 207, ...]
y0 = [5, 10, 7, 15, 20, ...]

Now, imagine I have data point t1:

#Data stored
#t1
x1 = [201, 204, 206, 207, ...]
y1 = [10, 15, 3, 20, ...]

Is there a simple and efficient way to rebuild the full dataset for a custom number of data points? Let's say I want a data structure that represents all data contained in t0 + t1:

#t0+t1
M = [[200, 201, 203, 204, 206, 207, ...], # this contains all xs recorded for both t0 and t1
     [5, 10, 7, 0, 15, 20, ... ], # y values from t0. Missing values are filled with 0
     [0, 10, 0, 15, 3, 20, ...] # y values from t1. Missing values are filled with 0
]

Any help would be really appreciated!

mathfux · Accepted Answer

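It looks like np.searchsorted is what you are looking for. Build the sorted union of all x values, then use np.searchsorted to find the column each stored x belongs in: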

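import numpy as np

# x0, y0, x1, y1 are the stored (sparse) sequences from the question
m0 = np.unique(np.concatenate([x0, x1])) # sorted union of all x; works for lists or arrays
M = np.zeros((3, len(m0)), dtype=int)
M[0] = m0
M[1, np.searchsorted(m0, x0)] = y0 # columns of x0 within m0; the rest stay 0
M[2, np.searchsorted(m0, x1)] = y1
>>> M
array([[200, 201, 203, 204, 206, 207],
       [  5,  10,   7,   0,  15,  20],
       [  0,  10,   0,  15,   3,  20]])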
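
Since the question asks about a custom number of data points, the same idea can be wrapped in a loop. The following is only a sketch: the function name densify and its xs/ys arguments are made up for illustration, and it assumes integer y values and no repeated x within a single time point.

```python
import numpy as np

def densify(xs, ys):
    """Stack sparse (x, y) series into one dense integer matrix.

    Row 0 holds the sorted union of all x values; row i + 1 holds the
    y values of series i, with 0 where that x was not recorded.
    Hypothetical helper, not part of NumPy.
    """
    grid = np.unique(np.concatenate(xs))  # sorted union of every x
    M = np.zeros((len(xs) + 1, len(grid)), dtype=int)  # assumes integer data
    M[0] = grid
    for row, (x, y) in enumerate(zip(xs, ys), start=1):
        M[row, np.searchsorted(grid, x)] = y  # column of each stored x
    return M

# Stored data from the question, with the elided "..." tails left out
x0, y0 = [200, 201, 203, 206, 207], [5, 10, 7, 15, 20]
x1, y1 = [201, 204, 206, 207], [10, 15, 3, 20]

M = densify([x0, x1], [y0, y1])
print(M)
```

np.searchsorted only returns the right columns here because every x is guaranteed to be present in the union, so each lookup is an exact match. For non-integer y values, the dtype would need to change accordingly.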
