Reputation: 187
I'm trying to write a function that takes as inputs a DataFrame with a column 'timestamp' and a list of tuples. Every tuple will contain a beginning and end time.
What I want to do is to "split" the dataframe in two new ones, where the first contains the rows for which the timestamp value is not contained between the extremes of any tuple, and the other is just the complementary. The number of filter tuples is not known a priori though.
df = DataFrame({'timestamp':[0,1,2,5,6,7,11,22,33,100], 'x':[1,2,3,4,5,6,7,8,9,1])
filt = [(1,4), (10,40)]
left, removed = func(df, filt)
This should give me two dataframes
[0,5,6,7,100]
[1,2,11,22,33]
I believe the right approach is to write a custom function that can be used as a filter, and then call is somehow to filter/mask the dataframe, but I could not find a proper example of how to implement this.
Upvotes: 0
Views: 95
Reputation: 323226
Check
out = df[~pd.concat([df.timestamp.between(*x) for x in filt]).any(level=0)]
Out[175]:
timestamp x
0 0 1
3 5 4
4 6 5
5 7 6
9 100 1
Upvotes: 3
Reputation: 15470
Can't you use filtering with .isin()
:
left,removed = df[df['timestamp'].isin([0,5,6,7,100])],df[df['timestamp'].isin([1,2,11,22,33])]
Upvotes: 0