Finding intersection of pandas data frame index in groupby

Question

I am using Python and have a data frame with a datetime index, a grouping variable (gvar) and a value variable (x). I would like to find all the common datetimes between the groups.

I already have a solution using functools, but I am looking for a way to do it with pandas functionality only, if possible.

import functools
import numpy as np
import pandas as pd

gvar = ['A', 'A', 'A', 'B', 'B', 'B']
x = [100, 200, 100, 200, 100, 200]
ind = ['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-03', '2018-01-04', '2018-01-05']
df = pd.DataFrame(data={'gvar': gvar, 'x': x}, index=pd.to_datetime(ind))

# Intersect the indexes of all groups, two at a time
common_time = functools.reduce(np.intersect1d, [df[df.gvar == g].index for g in set(df.gvar)])

common_time
Out[39]: array(['2018-01-03T00:00:00.000000000'], dtype='datetime64[ns]')

All suggestions are welcome.

ansev · Accepted Answer

Use groupby.transform to count the rows sharing each timestamp and keep the timestamps that occur at least twice:

df_filtered = df[df.groupby(level=0)['gvar'].transform('size').ge(2)]
print(df_filtered)

           gvar    x
2018-01-03    A  100
2018-01-03    B  200

df_filtered.index.unique()
#DatetimeIndex(['2018-01-03'], dtype='datetime64[ns]', freq=None)
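
The size check above matches the intersection when there are exactly two groups and no timestamp repeats within a group. For more groups, or repeated timestamps inside a group, one possible variant (a sketch, not part of the original answer) is to count the distinct gvar values per timestamp and compare that with the total number of groups. The names groups_per_time and n_groups are made up for illustration.

```python
import pandas as pd

gvar = ['A', 'A', 'A', 'B', 'B', 'B']
x = [100, 200, 100, 200, 100, 200]
ind = ['2018-01-01', '2018-01-02', '2018-01-03',
       '2018-01-03', '2018-01-04', '2018-01-05']
df = pd.DataFrame({'gvar': gvar, 'x': x}, index=pd.to_datetime(ind))

# Number of distinct groups present at each timestamp (one row per unique timestamp)
groups_per_time = df.groupby(level=0)['gvar'].nunique()

# Total number of groups in the frame
n_groups = df['gvar'].nunique()

# Timestamps at which every group appears
common_time = groups_per_time[groups_per_time.eq(n_groups)].index
print(common_time)
```

Because it counts distinct groups rather than rows, a timestamp that appears twice in group A but never in group B is not reported as common.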
