ZuzanaTelefony

Reputation: 79

Replacing all values based on specific value in column dataframe

I have a data frame like this containing 2041 columns.

number   error1   error2    ...   error2040
   1        0       0       ...       1
   2        1       1       ...       1
   3        0       1       ...       0
  ...      ...     ...      ...      ...
result     0.5      0.6               0.001

The 'result' row is the probability that the particular error will cause the final error. It was calculated, maybe not very nicely, using pieces, but it works.

Now I want to divide all the numbers into four categories (faulty, probably faulty, not faulty, not enough information), based on the probabilities in 'result'.

So I guess the easiest way is to replace every 1 with the corresponding value from 'result' and then add a new column called 'prediction' based on the numbers in each row, so it would look like:

number   error1   error2    ...   error2040    PREDICTION 
  1        0       0        ...     0.001      not faulty   
  2       0.5     0.6       ...     0.001      FAULTY   
  3        0      0.6       ...       0        probably faulty  
 ...      ...     ...       ...      ... 
result    0.5     0.6                0.001

But I am stuck and cannot figure out how to do the first part: replacing every 1 in each column with that column's value from the 'result' row.

Thank you.

Upvotes: 0

Views: 65

Answers (1)

SeaBean

Reputation: 23217

Building on 1) my initial idea of using multiplication instead of replace, 2) @piRSquared's syntax, and 3) a modification to exclude the first column from the operation, you can use:

df.iloc[:-1, 1:] *= df.iloc[-1, 1:]
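
Why this works: `df.iloc[-1, 1:]` is the 'result' row as a Series indexed by the column names, and multiplying a DataFrame by a Series aligns on column labels, so each column is scaled by its own probability. Because the cells hold only 0 or 1, multiplying has the same effect as replacing every 1. Below is a minimal sketch of the same idea written with label-based selection and `DataFrame.mul`. It assumes the last row is identified by the value 'result' in the 'number' column; the names `error_cols`, `is_result`, and `scaled` are only illustrative. Selecting the error columns before taking the row keeps the values as floats.

```python
import pandas as pd

df = pd.DataFrame({
    "number": ["1", "2", "3", "result"],
    "error1": [0.0, 1.0, 0.0, 0.5],
    "error2": [0.0, 1.0, 1.0, 0.6],
    "error2040": [1.0, 1.0, 0.0, 0.001],
})

# Every column except the first one holds error flags (assumption).
error_cols = df.columns[1:]

# Boolean mask marking the 'result' row (assumes it is labelled this way).
is_result = df["number"] == "result"

# The probabilities as a float Series indexed by column name.
result = df.loc[is_result, error_cols].iloc[0]

# Scale each column by its own probability; alignment is on column labels.
scaled = df.loc[~is_result, error_cols].mul(result, axis="columns")

# Write the scaled values back, leaving the 'result' row untouched.
df.loc[~is_result, error_cols] = scaled

print(df)
```

This variant does not rely on the 'result' row being last or on 'number' being the first column by position only for the row lookup, which may be easier to read with 2041 columns.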

Test run:

import pandas as pd

data = {'number': {0: '1', 1: '2', 2: '3', 3: 'result'},
 'error1': {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.5},
 'error2': {0: 0.0, 1: 1.0, 2: 1.0, 3: 0.6},
 'error2040': {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.001}}

df = pd.DataFrame(data)
print(df)

   number  error1  error2  error2040
0       1     0.0     0.0      1.000
1       2     1.0     1.0      1.000
2       3     0.0     1.0      0.000
3  result     0.5     0.6      0.001


df.iloc[:-1, 1:] *= df.iloc[-1, 1:]

print(df)

   number error1 error2 error2040
0       1    0.0    0.0     0.001
1       2    0.5    0.6     0.001
2       3    0.0    0.6       0.0
3  result    0.5    0.6     0.001
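
For the second part of the question (the 'prediction' column), the rule and the thresholds are not given, so the following is only a sketch under stated assumptions. It assumes the errors are independent, combines the per-row probabilities as 1 - prod(1 - p), and uses made-up cut-offs of 0.75 and 0.25. Treating a row with no errors at all as "not enough information" is also an assumption. Adjust all of these to the real rule.

```python
import numpy as np
import pandas as pd

# The frame after the multiplication step, without the 'result' row.
scaled = pd.DataFrame({
    "number": ["1", "2", "3"],
    "error1": [0.0, 0.5, 0.0],
    "error2": [0.0, 0.6, 0.6],
    "error2040": [0.001, 0.001, 0.0],
})
error_cols = scaled.columns[1:]

# Assumption: errors are independent, so the chance that at least one
# of them causes the final error is 1 - prod(1 - p) across the row.
combined = 1 - (1 - scaled[error_cols]).prod(axis=1)

# Hypothetical rules, checked in order; the first match wins.
conditions = [
    (scaled[error_cols] == 0).all(axis=1),  # no error observed at all
    combined >= 0.75,                       # made-up threshold
    combined >= 0.25,                       # made-up threshold
]
labels = ["not enough information", "faulty", "probably faulty"]

scaled["prediction"] = np.select(conditions, labels, default="not faulty")

print(scaled)
```

With these assumed thresholds the three sample rows come out as not faulty, faulty, and probably faulty, which matches the example in the question.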

Upvotes: 1
