Reputation: 436
I have two pandas dataframes with the following format:
df_ts = pd.DataFrame([
[10, 20, 1, 'id1'],
[11, 22, 5, 'id1'],
[20, 54, 5, 'id2'],
[22, 53, 7, 'id2'],
[15, 24, 8, 'id1'],
[16, 25, 10, 'id1']
], columns = ['x', 'y', 'ts', 'id'])
df_statechange = pd.DataFrame([
['id1', 2, 'ok'],
['id2', 4, 'not ok'],
['id1', 9, 'not ok']
], columns = ['id', 'ts', 'state'])
I am trying to combine them into the following format:
df_out = pd.DataFrame([
[10, 20, 1, 'id1', None ],
[11, 22, 5, 'id1', 'ok' ],
[20, 54, 5, 'id2', 'not ok'],
[22, 53, 7, 'id2', 'not ok'],
[15, 24, 8, 'id1', 'ok' ],
[16, 25, 10, 'id1', 'not ok']
], columns = ['x', 'y', 'ts', 'id', 'state'])
I understand how to accomplish this iteratively: group by id, iterate through each row, and change the state whenever a state change appears. Is there a built-in, more scalable way of doing this in pandas?
Upvotes: 3
Views: 3780
Reputation: 2944
Unfortunately, pandas merge supports only equality joins. See the following thread for more details: merge pandas dataframes where one value is between two others. If you want to merge by interval, you'll need to work around this, for example by adding a filter after the merge:
joined = a.merge(b,on='id')
joined = joined[joined.ts.between(joined.ts1,joined.ts2)]
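The snippet above uses a, b, ts1 and ts2, none of which are defined in this thread. As a rough sketch of the same merge-then-filter idea applied to the question's data, the state changes can first be turned into per-id intervals. The names intervals, ts_start and ts_end are made up for this illustration, and each interval is assumed to run from one state change up to (but not including) the next one for the same id:

```python
import numpy as np
import pandas as pd

df_ts = pd.DataFrame([
    [10, 20, 1, 'id1'],
    [11, 22, 5, 'id1'],
    [20, 54, 5, 'id2'],
    [22, 53, 7, 'id2'],
    [15, 24, 8, 'id1'],
    [16, 25, 10, 'id1'],
], columns=['x', 'y', 'ts', 'id'])

df_statechange = pd.DataFrame([
    ['id1', 2, 'ok'],
    ['id2', 4, 'not ok'],
    ['id1', 9, 'not ok'],
], columns=['id', 'ts', 'state'])

# Assumption: a state holds from its own ts until the next change for the same id.
intervals = (df_statechange
             .sort_values(['id', 'ts'])
             .rename(columns={'ts': 'ts_start'}))
intervals['ts_end'] = intervals.groupby('id')['ts_start'].shift(-1).fillna(np.inf)

# Equality join on id, then keep only the pairs where ts falls inside the interval.
joined = df_ts.reset_index().merge(intervals, on='id')
in_range = (joined['ts'] >= joined['ts_start']) & (joined['ts'] < joined['ts_end'])
matched = joined.loc[in_range].set_index('index')['state']

# Align back on the original index; rows before the first change stay NaN.
df_filtered = df_ts.assign(state=matched)
print(df_filtered)
```

Note that the intermediate joined frame holds one row per (measurement, state change) pair for each id, so it can grow large when an id has many state changes.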
Upvotes: 2
Reputation: 619
You can merge pandas data frames on two columns:
pd.merge(df_ts, df_statechange, how='left', on=['id', 'ts'])
In the df_statechange that you shared, no ts value matches a ts value in df_ts. Apparently you copied only part of the data frame here. So I got this output:
x y ts id state
0 10 20 1 id1 NaN
1 11 22 5 id1 NaN
2 20 54 5 id2 NaN
3 22 53 7 id2 NaN
4 15 24 8 id1 NaN
5 16 25 10 id1 NaN
But if the data frames do share ts values, the rows with a matching id and ts get their state filled in. For example:
df_statechange = pd.DataFrame([
['id1', 5, 'ok'],
['id1', 8, 'ok'],
['id2', 5, 'not ok'],
['id2', 7, 'not ok'],
['id1', 9, 'not ok']
], columns = ['id', 'ts', 'state'])
The output:
x y ts id state
0 10 20 1 id1 NaN
1 11 22 5 id1 ok
2 20 54 5 id2 not ok
3 22 53 7 id2 not ok
4 15 24 8 id1 ok
5 16 25 10 id1 NaN
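The exact-match merge above leaves NaN wherever a row's ts is not itself a state-change timestamp (row 5 here). If the goal is "the most recent state at or before each ts", one possible alternative is pd.merge_asof, which does a nearest-key join within groups. The following is only a sketch on the question's original data; it assumes a state change applies from its own ts onward, and the temporary 'index' column is just a helper to restore the original row order:

```python
import pandas as pd

df_ts = pd.DataFrame([
    [10, 20, 1, 'id1'],
    [11, 22, 5, 'id1'],
    [20, 54, 5, 'id2'],
    [22, 53, 7, 'id2'],
    [15, 24, 8, 'id1'],
    [16, 25, 10, 'id1'],
], columns=['x', 'y', 'ts', 'id'])

df_statechange = pd.DataFrame([
    ['id1', 2, 'ok'],
    ['id2', 4, 'not ok'],
    ['id1', 9, 'not ok'],
], columns=['id', 'ts', 'state'])

# merge_asof requires both frames to be sorted by the "on" key.
left = df_ts.reset_index().sort_values('ts')
right = df_statechange.sort_values('ts')

# For each row, take the last state change with the same id and ts <= row ts.
merged = pd.merge_asof(left, right, on='ts', by='id', direction='backward')

# Restore the original row order and drop the helper column.
df_out = (merged.sort_values('index')
                .drop(columns='index')
                .reset_index(drop=True))
print(df_out)
```

Rows that precede the first state change for their id (the first row here) have no earlier match and keep NaN in state.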
Upvotes: 0