Operating on multiindex dataframe

Question

My dataframe head() looks like this:

                                         followers  following
experience   userid date                                     
Intermediate 0      2010-01-02 05:28:38      84330       1331
                    2010-01-02 18:46:36      84330       1331
                    2010-01-02 18:47:22      84330       1331
                    2010-01-02 18:50:12      84330       1331
                    2010-01-02 23:08:55      84330       1331

There are more rows than shown. For each userid, I'd like to subtract that user's first followers value from every row (e.g., in the example above, subtract 84330 from followers on every date for userid=0). Is there some apply() command that will do this?

user29791 · Accepted Answer

Find the first row for each userid:

cut = df.groupby('userid').first()

Then merge it back to the original table:

cut.columns = cut.columns.map(lambda x: str(x) + '_a')
# userid and date are index levels, so move them into columns before merging
df2 = df.reset_index().merge(cut, left_on='userid', right_index=True, how='left')

Then remove the rows whose followers value matches the first row's value for that userid:

new = df2[df2['followers'] != df2['followers_a']][['userid', 'date', 'followers']]
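
For the subtraction the question actually asks about, a shorter route is probably a groupby transform, which avoids the merge. The following is a minimal sketch, not the answerer's method: the toy rows for userid 1, the third timestamp for userid 0, and the column name followers_delta are all made up for illustration.

```python
import pandas as pd

# Toy frame shaped like the one in the question; userid 1 and the
# 2010-01-03 row are invented so the result is easy to check by eye.
index = pd.MultiIndex.from_tuples(
    [
        ("Intermediate", 0, pd.Timestamp("2010-01-02 05:28:38")),
        ("Intermediate", 0, pd.Timestamp("2010-01-02 18:46:36")),
        ("Intermediate", 0, pd.Timestamp("2010-01-03 09:00:00")),
        ("Beginner", 1, pd.Timestamp("2010-01-02 07:00:00")),
        ("Beginner", 1, pd.Timestamp("2010-01-04 07:00:00")),
    ],
    names=["experience", "userid", "date"],
)
df = pd.DataFrame(
    {
        "followers": [84330, 84330, 84335, 100, 110],
        "following": [1331, 1331, 1331, 50, 52],
    },
    index=index,
)

# Group on the userid index level; transform("first") repeats each
# group's first followers value on every row of that group.
first_followers = df.groupby(level="userid")["followers"].transform("first")

# followers_delta is a hypothetical column name for the difference.
df["followers_delta"] = df["followers"] - first_followers
```

With the merged frame from the answer, the same result should come from df2['followers'] - df2['followers_a']. Either way, "first" means first in row order (and first() skips missing values by default), so this assumes each user's rows are already sorted by date.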
