Reputation: 603
I've got a many row, many column dataframe with different 'placeholder' values needing substitution (in a subset of columns). I've read many examples in the forum using nested lists or dictionaries, but haven't had luck with variations..
# A test dataframe
df = pd.DataFrame({'Sample':['alpha','beta','gamma','delta','epsilon'],
'element1':[1,-0.01,-5000,1,-2000],
'element2':[1,1,1,-5000,2],
'element3':[-5000,1,1,-0.02,2]})
# List of headings containing values to replace
headings = ['element1', 'element2', 'element3']
And I am trying to do something like this (obviously this doesn't work):
# If any rows have value <-1, NaN
df[headings].replace(df[headings < -1], np.nan)
# If a value is between -1 and 0, make a replacement
df[headings].replace(df[headings < 0 & headings > -1], 0.05)
So, is there possibly a better way to accomplish this using loops or fancy pandas tricks?
Upvotes: 3
Views: 3337
Reputation: 603
Here is the successful answer as suggested by @Psidom.
The solution involves taking a slice out of the dataframe, applying the function, then reincorporates the amended slice:
df1 = df.loc[:, headings]
df1[df1 < -1] = np.nan
df1[(df1 < 0)] = 0.05
df.loc[:, headings] = df1
Upvotes: 2
Reputation: 215047
You can set the Sample
column as index and then replace values on the whole data frame based on conditions:
df = df.set_index('Sample')
df[df < -1] = np.nan
df[(df < 0) & (df > -1)] = 0.05
Which gives:
# element1 element2 element3
# Sample
# alpha 1.00 1.0 NaN
# beta 0.05 1.0 1.00
# gamma NaN 1.0 1.00
# delta 1.00 NaN 0.05
# epsilon NaN 2.0 2.00
Upvotes: 4