AfonsoSalgadoSousa

Reputation: 405

Change values of dataframe based on values of groupby

I have a dataframe generated from the following code:

vote_mode = dataset.groupby(['ini_num','dep_parl_group'])['vote'].agg(lambda x: disambiguated_mode(x)).to_frame()

This gives me a dataframe with three fields (ini_num, dep_parl_group, vote), where vote is the most frequent label in each group, like the following:

ini_num  dep_parl_group  vote
12       A               vot_in_favour
         B               vot_against
99       A               vot_against
         C               vot_in_favour
         D               vot_against

I would like to change the vote values of dataset (the dataframe from which the groupby was built) to match the values in the groupby result. The dataset is as follows:

ini_num  dep_parl_group  vote           what I want
12       A               vot_in_favour  vot_in_favour
12       A               vot_in_favour  vot_in_favour
12       A               vot_against    vot_in_favour
12       B               vot_against    vot_against
12       B               vot_against    vot_against
99       A               vot_against    vot_against
99       A               vot_against    vot_against
99       A               vot_in_favour  vot_against
99       C               vot_in_favour  vot_in_favour
99       D               vot_against    vot_against
99       D               vot_against    vot_against

Specifically, I would like the vote value of every row in dataset to be replaced by the vote from the vote_mode row that has the same ini_num and dep_parl_group.

Thanks in advance for any help you can provide.

Upvotes: 1

Views: 59

Answers (2)

Scott Boston

Reputation: 153460

Try this; here I substituted x.mode()[0] for disambiguated_mode, since that function wasn't posted:

dataset['vote_1'] = (dataset.groupby(['ini_num','dep_parl_group'])['vote']
                            .transform(lambda x: x.mode()[0]))

Output:

    ini_num dep_parl_group           vote    what I want         vote_1
0        12              A  vot_in_favour  vot_in_favour  vot_in_favour
1        12              A  vot_in_favour  vot_in_favour  vot_in_favour
2        12              A    vot_against  vot_in_favour  vot_in_favour
3        12              B    vot_against    vot_against    vot_against
4        12              B    vot_against    vot_against    vot_against
5        99              A    vot_against    vot_against    vot_against
6        99              A    vot_against    vot_against    vot_against
7        99              A  vot_in_favour    vot_against    vot_against
8        99              C  vot_in_favour  vot_in_favour  vot_in_favour
9        99              D    vot_against    vot_against    vot_against
10       99              D    vot_against    vot_against    vot_against
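
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's table, and first_mode is a hypothetical stand-in for disambiguated_mode (which was not posted), so its tie-breaking may differ from the asker's. It overwrites vote in place rather than adding a vote_1 column.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
dataset = pd.DataFrame({
    "ini_num": [12, 12, 12, 12, 12, 99, 99, 99, 99, 99, 99],
    "dep_parl_group": ["A", "A", "A", "B", "B", "A", "A", "A", "C", "D", "D"],
    "vote": [
        "vot_in_favour", "vot_in_favour", "vot_against",
        "vot_against", "vot_against",
        "vot_against", "vot_against", "vot_in_favour",
        "vot_in_favour",
        "vot_against", "vot_against",
    ],
})


def first_mode(votes: pd.Series) -> str:
    """Hypothetical stand-in for disambiguated_mode.

    Series.mode() returns every tied value in sorted order,
    so taking the first element is one possible tie-break.
    """
    return votes.mode().iloc[0]


# transform broadcasts each group's scalar result back to the group's rows,
# so the result is aligned with the original index of dataset.
dataset["vote"] = (
    dataset.groupby(["ini_num", "dep_parl_group"])["vote"].transform(first_mode)
)

print(dataset)
```

Because transform returns a Series with the same index as dataset, no merge or join is needed to line the values back up.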

Upvotes: 1

bui

Reputation: 1651

You can set the index of the original dataframe to ['ini_num', 'dep_parl_group'], then do a left join with vote_mode. Since both frames have a vote column, the suffixes keep the old and new values apart:

dataset.set_index(['ini_num', 'dep_parl_group']).join(vote_mode, on=['ini_num', 'dep_parl_group'], lsuffix='_old', rsuffix='_new')
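
A roughly equivalent way to write the same idea is a left merge on the two key columns, which keeps dataset's columns in place instead of moving the keys into the index. This is only a sketch: the sample frame is rebuilt from the question, and x.mode().iloc[0] is an assumed substitute for disambiguated_mode, which was not posted.

```python
import pandas as pd

keys = ["ini_num", "dep_parl_group"]

# Sample data rebuilt from the table in the question.
dataset = pd.DataFrame({
    "ini_num": [12, 12, 12, 12, 12, 99, 99, 99, 99, 99, 99],
    "dep_parl_group": ["A", "A", "A", "B", "B", "A", "A", "A", "C", "D", "D"],
    "vote": [
        "vot_in_favour", "vot_in_favour", "vot_against",
        "vot_against", "vot_against",
        "vot_against", "vot_against", "vot_in_favour",
        "vot_in_favour",
        "vot_against", "vot_against",
    ],
})

# Per-group most frequent vote (assumed stand-in for disambiguated_mode).
vote_mode = (
    dataset.groupby(keys)["vote"]
    .agg(lambda x: x.mode().iloc[0])
    .to_frame()
)

# Turn the group keys back into columns, then left-merge on them.
# The overlapping "vote" column is split into vote_old / vote_new.
merged = dataset.merge(
    vote_mode.reset_index(),
    on=keys,
    how="left",
    suffixes=("_old", "_new"),
)

print(merged)
```

A left merge preserves the row order of dataset, so vote_new can be copied back or vote_old dropped afterwards, depending on whether the original labels are still needed.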

Upvotes: 1
