Simbi
Simbi

Reputation: 1042

Interleaving Pandas Dataframes by Timestamp

I've got 2 Pandas DataFrame, each of them containing 2 columns. One of the columns is a timestamp column [t], the other one contains sensor readings [s].

I now want to create a single DataFrame, containing 4 columns, that is interleaved on the timestamp column.

Example:

First Dataframe:

+----+----+
| t1 | s1 |
+----+----+
|  0 |  1 |
|  2 |  3 |
|  3 |  3 |
|  5 |  2 |
+----+----+

Second DataFrame:

+----+----+
| t2 | s2 |
+----+----+
|  1 |  5 |
|  2 |  3 |
|  4 |  3 |
+----+----+

Target:

+----+----+----+----+
| t1 | t2 | s1 | s2 |
+----+----+----+----+
|  0 |  0 |  1 |  0 |
|  0 |  1 |  1 |  5 |
|  2 |  1 |  3 |  5 |
|  2 |  2 |  3 |  3 |
|  3 |  2 |  3 |  3 |
|  3 |  4 |  3 |  3 |
|  5 |  4 |  2 |  3 |
+----+----+----+----+

I hat a look at pandas.merge, but that left me with a lot of NaNs and an unsorted table.

a.merge(b, how='outer')
Out[55]: 
   t1  s1  t2  s2
0   0   1 NaN NaN
1   2   3   2   3
2   3   3 NaN NaN
3   5   2 NaN NaN
4   1 NaN   1   5
5   4 NaN   4   3

Upvotes: 0

Views: 1074

Answers (1)

fixxxer
fixxxer

Reputation: 16154

Merging will put NaNs in common columns that you merge on, if those values are not present in both indexes. It will not create new data that is not present in the dataframes that are being merged. For example, index 0 in your target dataframe shows t2 with a value of 0. This is not present in the second dataframe, so you cannot expect it to appear in the merged dataframe either. Same applies for other rows as well.

What you can do instead is reindex the dataframes to a common index. In your case, since the maximum index is 5 in the target dataframe, lets use this list to reindex both input dataframes:

In [382]: ind
Out[382]: [0, 1, 2, 3, 4, 5]

Now, we will reindex according both inputs to this index:

In [372]: x = a.set_index('t1').reindex(ind).fillna(0).reset_index()

In [373]: x
Out[373]: 
   t1   s1
0   0    1
1   1    0
2   2    3
3   3    3
4   4    0
5   5    2

In [374]: y = b.set_index('t2').reindex(ind).fillna(0).reset_index()

In [375]: y
Out[375]: 
   t2   s2
0   0    0
1   1    5
2   2    3
3   3    0
4   4    5
5   5    0

And, now we merge it to get something close to the target dataframe:

In [376]: x.merge(y, left_on=['t1'], right_on=['t2'], how='outer')
Out[376]: 
   t1   s1  t2   s2
0   0    1   0    0
1   1    0   1    5
2   2    3   2    3
3   3    3   3    0
4   4    0   4    5
5   5    2   5    0

Upvotes: 1

Related Questions