Boosted_d16
Boosted_d16

Reputation: 14062

fill values from cell above in a given column

For the column labelled "Category", I want to fill the cells with white spaces with the value from above, see df number 2 in image below.

Here is what I tried but it didn't work:

df[df['Category']==" "] = np.NaN
df = df['Category'].fillna(method='ffill')
col = ['Category']
df.loc[:,col] = df.loc[:,col].ffill()
df.loc[:,['Category']] = df.loc[:,['Category']].ffill()
df = df.set_index(df.columns[0])
col = ['Category']
df.loc[:,col] = df.loc[:,col].ffill()

I want to then keep the 2nd instance of each row (if that makes sense), see df 3 in image below.

enter image description here

Upvotes: 0

Views: 1180

Answers (3)

René
René

Reputation: 4827

You can try:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [['Base', 152],
     ['Male', 98], ['-', .64],
     ['Female', 52], ['-', .34],
     ['Non-binary', '-'], ['-', '-'],
     ['Prefer-not', 2], ['-', .01]],
    columns=('category', 'engagement')
)

df = df.replace('-', np.nan)
df['category'] = df['category'].ffill()

Result:

            engagement
category              
Base            152.00
Female            0.34
Male              0.64
Non-binary         NaN
Prefer-not        0.01

Upvotes: 0

ansev
ansev

Reputation: 30920

We can transform to NaN using Series.mask and then drop duplicated:

df['Category']=df['Category'].mask(df['Category'].eq('')|df['Category'].isnull()).ffill()

Upvotes: 1

Boosted_d16
Boosted_d16

Reputation: 14062

With @ansev's help I have a working version of the code:

#fill whitespaces in 'Category' with values from cell above
df['Category'] = df['Category'].mask(df['Category'].eq('')).ffill()

#drop duplicates in column 'Category' and keep last instance
df = df.drop_duplicates(subset=['Category'], keep='last')

Upvotes: 0

Related Questions