Reputation: 736
I am using Python and have a data frame with a datetime index, a grouping variable (gvar) and a value variable (x). I would like to find all the common datetimes between the groups.
I already have a solution using functools, but I am seeking a way to do it using pandas functionalities only (if possible).
import functools
import pandas as pd
gvar = ['A', 'A', 'A', 'B', 'B', 'B']
x = [100, 200, 100, 200 , 100, 200]
ind = ['2018-01-01','2018-01-02', '2018-01-03', '2018-01-03', '2018-01-04', '2018-01-05' ]
df = pd.DataFrame(data={'gvar':gvar, 'x': x}, index=pd.to_datetime(ind))
common_time = functools.reduce(lambda x, y: pd.np.intersect1d(x, y), [df[df.gvar == x].index
for x in set(df.gvar)])
common_time
Out[39]: array(['2018-01-03T00:00:00.000000000'], dtype='datetime64[ns]')
All suggestions are welcome.
Upvotes: 0
Views: 308
Reputation: 30920
Use groupby.transform
:
df_filtered=df[df.groupby(level=0)['gvar'].transform('size').ge(2)]
print(df_filtered)
gvar x
2018-01-03 A 100
2018-01-03 B 200
df_filtered.index.unique()
#DatetimeIndex(['2018-01-03'], dtype='datetime64[ns]', freq=None)
Upvotes: 1
Reputation: 908
This should do it:
>>> df.reset_index().loc[df['gvar'].reset_index().drop_duplicates().duplicated('index'),'index'].tolist()
Returning:
[Timestamp('2018-01-03 00:00:00')]
And if you need the corresponding groups or values:
>>>df[df.index.isin(df.reset_index().loc[df['gvar'].reset_index().drop_duplicates().duplicated('index'),'index'].tolist())]
Giving you:
gvar x
2018-01-03 A 100
2018-01-03 B 200
Upvotes: 1