eagerstudent

Reputation: 277

Conditional copy of values from one column to another column

I have a pandas dataframe that looks something like this:

 name        job             jobchange_rank   date
Thisguy      Developer       1                2012
Thisguy      Analyst         2                2014
Thisguy      Data Scientist  3                2015
Anotherguy   Developer       1                2018

The jobchange_rank column ranks each individual's job changes (grouped by name): rank 1 is their first position, rank 2 their second position, and so on.

Now for the fun part. I want to create a new column where I can see a person's previous job, something like this:

 name        job             jobchange_rank   date   previous_job
Thisguy      Developer       1                2012   None
Thisguy      Analyst         2                2014   Developer
Thisguy      Data Scientist  3                2015   Analyst
Anotherguy   Developer       1                2018   None

I've written the following code to set "None" where there is no previous job:

df.loc[df['jobchange_rank'].sub(df['jobchange_rank'].min()) == 0, 'previous_job'] = 'None'  

Sadly, I can't seem to figure out how to get the values from the other column where the needed condition applies.

Any help is more than welcome! Thanks in advance.

Upvotes: 3

Views: 567

Answers (1)

user3483203

Reputation: 51155

This answer assumes that your DataFrame is sorted by name and jobchange_rank. If that is not the case, sort first.


# df = df.sort_values(['name', 'jobchange_rank'])

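# True where the row above belongs to the same person
m = df['name'].eq(df['name'].shift())
# Job from the row above, kept only where the mask is True
df['job'].shift().where(m)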

0          NaN
1    Developer
2      Analyst
3          NaN
Name: job, dtype: object
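
To make the first approach concrete, here is a minimal, self-contained sketch that rebuilds the example data from the question and stores the result in a previous_job column. The variable name same_person is only illustrative, and the sketch assumes each person's rows are contiguous and already in jobchange_rank order.

```python
import pandas as pd

# Example data from the question (rows per person are contiguous and in rank order)
df = pd.DataFrame({
    "name": ["Thisguy", "Thisguy", "Thisguy", "Anotherguy"],
    "job": ["Developer", "Analyst", "Data Scientist", "Developer"],
    "jobchange_rank": [1, 2, 3, 1],
    "date": [2012, 2014, 2015, 2018],
})

# True where the row above belongs to the same person
same_person = df["name"].eq(df["name"].shift())

# Take the job from the row above, but only within the same person;
# the first row of each person is left as NaN
df["previous_job"] = df["job"].shift().where(same_person)

print(df)
```

If the literal string "None" is wanted instead of NaN, chaining .fillna("None") onto the last expression should do it.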

Alternatively, use groupby + shift (this only assumes the rows are sorted by jobchange_rank):

df.groupby('name')['job'].shift()

0          NaN
1    Developer
2      Analyst
3          NaN
Name: job, dtype: object
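
As a sketch of how groupby + shift behaves when the rows are not in order, the example below shuffles the question's data, sorts by jobchange_rank only for the shift, and assigns the result back. The shuffled row order is made up for illustration; the assignment relies on pandas aligning the shifted Series to df by index.

```python
import pandas as pd

# Same data as in the question, but with the rows deliberately shuffled
df = pd.DataFrame({
    "name": ["Thisguy", "Anotherguy", "Thisguy", "Thisguy"],
    "job": ["Data Scientist", "Developer", "Developer", "Analyst"],
    "jobchange_rank": [3, 1, 1, 2],
    "date": [2015, 2018, 2012, 2014],
})

# Sort by rank so that, within each name, the row above is the previous job
previous = df.sort_values("jobchange_rank").groupby("name")["job"].shift()

# Assignment aligns on the index, so df keeps its original row order
df["previous_job"] = previous

print(df)
```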

Although groupby + shift is more concise, on larger inputs that are already sorted like your example, it may be faster to skip the groupby and use the first solution.
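
One way to check this on your own data is a rough timing comparison like the sketch below. The synthetic frame (1,000 people with 20 jobs each, integer codes standing in for names and jobs) and the helper names with_mask and with_groupby are assumptions for illustration; actual timings will depend on data size and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic, already-sorted data (an assumption, not from the question)
n_people, n_jobs = 1_000, 20
big = pd.DataFrame({
    "name": np.repeat(np.arange(n_people), n_jobs),
    "job": np.tile(np.arange(n_jobs), n_people),
})

def with_mask(df):
    # First solution: shift everything, then mask rows where the name changes
    m = df["name"].eq(df["name"].shift())
    return df["job"].shift().where(m)

def with_groupby(df):
    # Second solution: shift within each name group
    return df.groupby("name")["job"].shift()

# On sorted input both approaches should give the same result
same = with_mask(big).equals(with_groupby(big))

t_mask = timeit.timeit(lambda: with_mask(big), number=10)
t_groupby = timeit.timeit(lambda: with_groupby(big), number=10)
print(f"equal: {same}  mask: {t_mask:.4f}s  groupby: {t_groupby:.4f}s")
```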

Upvotes: 3
