Shyam

Reputation: 1

How to change the value of one column based on another column's value in a Dask DataFrame

I have a huge DataFrame that I'm reading with Dask DataFrame. In pandas, I change the value of one column based on a condition on another column like this:

df.loc[df['Ref'] != 'ABC', 'Ref2'] = np.nan

Then I forward-fill the changed column:

df['Ref2'] = df['Ref2'].fillna(method='ffill')

How can the same be achieved with a Dask DataFrame?

I'm new to Dask DataFrame.

Upvotes: 0

Views: 798

Answers (2)

SultanOrazbayev

Reputation: 16561

A different way to write this (closer to the pandas syntax):

mask = df['Ref'] != 'ABC'
df.loc[mask, 'Ref2'] = np.nan
df['Ref2'] = df['Ref2'].fillna(method='ffill')

Dask closely follows the pandas syntax, so the pandas expression will often work as is.

Upvotes: 0

jezrael

Reputation: 862641

Use dask.dataframe.Series.mask and dask.dataframe.Series.fillna:

df['Ref2'] = df['Ref2'].mask(df['Ref'] != 'ABC').fillna(method='ffill')

Upvotes: 1
