Shawn
Shawn

Reputation: 603

Pandas: replace list of values from list of columns

I've got a many row, many column dataframe with different 'placeholder' values needing substitution (in a subset of columns). I've read many examples in the forum using nested lists or dictionaries, but haven't had luck with variations..

# A test dataframe
df = pd.DataFrame({'Sample':['alpha','beta','gamma','delta','epsilon'],
                  'element1':[1,-0.01,-5000,1,-2000], 
                  'element2':[1,1,1,-5000,2], 
                  'element3':[-5000,1,1,-0.02,2]})

# List of headings containing values to replace
headings = ['element1', 'element2', 'element3']

And I am trying to do something like this (obviously this doesn't work):

 # If any rows have value <-1, NaN
 df[headings].replace(df[headings < -1], np.nan)

 # If a value is between -1 and 0, make a replacement
 df[headings].replace(df[headings < 0 & headings > -1], 0.05)

So, is there possibly a better way to accomplish this using loops or fancy pandas tricks?

Upvotes: 3

Views: 3337

Answers (2)

Shawn
Shawn

Reputation: 603

Here is the successful answer as suggested by @Psidom.

The solution involves taking a slice out of the dataframe, applying the function, then reincorporates the amended slice:

df1 = df.loc[:, headings]
df1[df1 < -1] = np.nan
df1[(df1 < 0)] = 0.05 
df.loc[:, headings] = df1

Upvotes: 2

akuiper
akuiper

Reputation: 215047

You can set the Sample column as index and then replace values on the whole data frame based on conditions:

df = df.set_index('Sample')
df[df < -1] = np.nan
df[(df < 0) & (df > -1)] = 0.05

Which gives:

#           element1    element2    element3
#  Sample           
#   alpha       1.00        1.0          NaN
#    beta       0.05        1.0         1.00
#   gamma        NaN        1.0         1.00
#   delta       1.00        NaN         0.05
# epsilon        NaN        2.0         2.00

Upvotes: 4

Related Questions