Reputation: 321
I'm trying to move a value up to an earlier row depending on whether another column is not null. The gaps between values in the Description column are inconsistent, so I can't use a fixed .shift().
Here's the original data:
Permit Number  A    Description
1234           NaN  NaN
NaN            NaN  NaN
NaN            NaN  foo
3456           NaN  NaN
NaN            NaN  bar
And this is what I want my result to be:
Permit Number  A    Description
1234           NaN  foo
NaN            NaN  NaN
NaN            NaN  NaN
3456           NaN  bar
NaN            NaN  NaN
Here's the code I used, taken from "Align data in one column with another row, based on the last time some condition was true":
mask = df['Description'].notnull()
fmask = (df['Permit Number'].notnull() & df['Description'].isnull())
df.assign(Description=df.groupby(mask[::-1].cumsum())['Description'].transform(lambda x: x.iloc[-1]).where(fmask))
However, when I run it there are no errors and no changes to the DataFrame.
Upvotes: 0
Views: 94
Reputation: 321
FYI for anyone who sees this and has strings in their column: this solution works for those too.
mask = df['Description'].notnull()
fmask = (df['Permit Number'].notnull() & df['Description'].isnull())
df = df.assign(Description=df.groupby(mask[::-1].cumsum())['Description'].transform(lambda x: x.iloc[-1]).where(fmask))
My mistake was simple: I wasn't assigning the DataFrame returned by assign back to df.
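To make the fix concrete, here is a minimal sketch that rebuilds the sample frame from the question and applies the same grouping. The frame construction and the names group_id and result are assumptions added for illustration; the two masks and the assign call are the ones from the post.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Permit Number": [1234, np.nan, np.nan, 3456, np.nan],
    "A": [np.nan] * 5,
    "Description": [np.nan, np.nan, "foo", np.nan, "bar"],
})

mask = df["Description"].notnull()
fmask = df["Permit Number"].notnull() & df["Description"].isnull()

# Counting non-null descriptions from the bottom up gives each row the same
# label as the nearest non-null Description at or below it.
group_id = mask[::-1].cumsum()

# assign returns a new DataFrame; df itself is left untouched.
result = df.assign(
    Description=df.groupby(group_id)["Description"]
    .transform(lambda x: x.iloc[-1])  # last value in each group
    .where(fmask)                     # keep it only on the permit rows
)

print(df["Description"].tolist())      # original column, unchanged
print(result["Description"].tolist())  # values moved up to the permit rows
```

Writing df = df.assign(...) instead of result = ... is what makes the change show up in df.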
Upvotes: 0
Reputation: 544
If you do not need the NaN rows, you can do the following. I can't test it, so let me know if there's a typo.
This is the original:
Permit Number  A    Description
1234           NaN  NaN
NaN            NaN  NaN
NaN            NaN  foo
3456           NaN  NaN
NaN            NaN  bar
First, forward-fill Permit Number so that every row carries its permit:
df['Permit Number'] = df['Permit Number'].ffill()
Permit Number  A    Description
1234           NaN  NaN
1234           NaN  NaN
1234           NaN  foo
3456           NaN  NaN
3456           NaN  bar
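Then group on the filled key to get the result you asked for. Column A is entirely NaN here and groupby drops NaN keys by default, so pass dropna=False (pandas 1.1+) to keep those groups. first() takes the first non-null Description in each group, which avoids comparing strings with NaN:
df.groupby(['Permit Number', 'A'], dropna=False)['Description'].first().reset_index()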
Permit Number  A    Description
1234           NaN  foo
3456           NaN  bar
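The two steps above can be sketched end to end as follows. This is a variant under stated assumptions, not a tested version of the exact line above: the sample frame is rebuilt by hand, the names filled and collapsed are made up for illustration, and the grouping is on Permit Number alone so that the all-NaN column A does not act as a group key.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Permit Number": [1234, np.nan, np.nan, 3456, np.nan],
    "A": [np.nan] * 5,
    "Description": [np.nan, np.nan, "foo", np.nan, "bar"],
})

# Step 1: carry each permit number down onto the rows that follow it.
filled = df.copy()
filled["Permit Number"] = filled["Permit Number"].ffill()

# Step 2: collapse to one row per permit. first() returns the first
# non-null value of every column within each group (NaN if there is none).
collapsed = filled.groupby("Permit Number", as_index=False).first()

print(collapsed)
```

Unlike the assign approach, this drops the empty rows, so it only fits when one row per permit is the goal.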
Upvotes: 1