Reputation: 21420
I'm working with an eyetracker that samples at 500 hz, thus I have a measurement every 2ms. I've loaded this data into a dataframe and reindexed it with the time samples, such that my row indexes match the time, in ms, where the recording was made:
Here's a slice of this dataframe (note that the event
column contains empty strings right now):
x y cr event
2000 109.9 1133.7 4264.0
2001 109.0 1133.3 4267.0
2002 107.1 1130.6 4269.0
2003 104.3 1128.0 4270.0
2004 97.8 1125.2 4268.0
2005 89.8 1124.3 4266.0
2006 79.2 1121.3 4274.0
2007 69.5 1118.7 4284.0
2008 59.6 1116.7 4293.0
2009 53.0 1116.9 4291.0
2010 48.6 1117.1 4290.0
As you can see, there are no odd-numbered time indexes, again, because the machine samples every 2 ms.
My second DataFrame tracks TTLs that are sent to the eyetracker. These TTLs can occur at any time with millisecond resolution, so there can be odd-numbered timestamps, here. This second DataFrame has two columns, t
and msg
. Here's a slice:
t msg
0 11911 WarningOnset
1 12510 CueOnset
2 14693 WarningOnset
3 17009 WarningOnset
4 17642 CueOnset
My goal is to merge the second dataframe into the first, such that each item in the msg
column of dataframe2 gets placed into the event
column of dataframe1 at the index indicated by column t
.
This means that some of the indexes I'm targeting in dataframe1 do not yet exist, which is why my current approach is failing with a KeyError
.
Here's what I tried:
samples.loc[events['t'], 'event'] = events['msg'] # throws KeyError
What is the correct way of doing this? For clarity, I'd like to either place NaN
s in the columns that did not exist in dataframe2 (i.e. events
in the example above).
Upvotes: 2
Views: 77
Reputation: 394099
The reason that your initial attempt fails is that the values don't exist in your target dataframe, what you want to do is assign events where there is a match and set events to NaN or blank or whatever where there are no real events.
What you want to do is merge the other df to your first one on either an index (which should represent the same thing) or columns which should represent the same thing.
You will also need to rename your column in the second df.
So rename the column:
df1.rename({'msg':'event'})
Now set column 't' to be your index
df1.set_index(['t'],inplace=True)
Now merge:
df.merge(df1, left_index=True, right_index=True, how='left')
You can also use combine first which preserves the left hand side dataframes data:
df.combine_first(df1)
Upvotes: 1