Replacing string in pandas python only if it matches the exact string

Question

I'm having trouble getting a string in pandas to be replaced in the correct manner. I'm not sure if I'm limited to using pandas and there might not be a way to do this with solely using pandas.

This is how my dataframe looks:

 (ID: 10)              247333605                      0.0  
  (ID: 20)               36738870                      0.0  
  (ID: 40)             4668036427                      0.0  
  (ID: 50)             1918647972                      0.0  
  (ID: 60)             4323165902                  44125.0  
  (ID: 80)              145512255                      0.0  
 Assigned (ID: 30)       42050340                      0.0  
 Assigned (ID: 40)   130880371376                      0.0  
 Assigning (ID: 30)    1095844753                      0.0  
 Cancelled (ID: 40)        937280                      0.0  
 Cancelled (ID: 80)   16857720813                      0.0  
 Planned (ID: 20)      9060392597                      0.0  
 Planning (ID: 10)   108484297031                      0.0  
 Processed (ID: 70)  133289880880                      0.0  
 Revoked (ID: 50)      2411903072                      0.0  
 Writing (ID: 50)    146408550024                      0.0  
 Written (ID: 60)    139458227923                1018230.0

For each (ID: x), it should be matched to the assigned (ID: x), cancelled (ID: x), etc with the correct ID.

Using lines similar to this line:

input_data['last_status'] = input_data.last_status.str.replace('(ID: 10)', 'Planning (ID: 10)')

My output is:

(Assigned (ID: 40))                                0.0  
  (Cancelled (ID: 80))                               0.0  
  (Planned (ID: 20))                                 0.0  
  (Planning (ID: 10))                                0.0  
  (Writing (ID: 50))                                 0.0  
  (Written (ID: 60))                             44125.0  
 Assigned (Assigned (ID: 40))                        0.0  
 Assigned (ID: 30)                                   0.0  
 Assigning (ID: 30)                                  0.0  
 Cancelled (Assigned (ID: 40))                       0.0  
 Cancelled (Cancelled (ID: 80))                      0.0  
 Planned (Planned (ID: 20))                          0.0  
 Planning (Planning (ID: 10))                        0.0  
 Processed (ID: 70)                                  0.0  
 Revoked (Writing (ID: 50))                          0.0  
 Writing (Writing (ID: 50))                          0.0  
 Written (Written (ID: 60))                    1018230.0

As you can see, all the (ID: x) got replaced and it still doesn't match the correct term.

My ideal dataframe would look like this:

 Assigned (ID: 30)       42050340                      0.0  
 Assigned (ID: 40)   130880371376                      0.0  
 Assigning (ID: 30)    1095844753                      0.0  
 Cancelled (ID: 40)        937280                      0.0  
 Cancelled (ID: 80)   16857720813                      0.0  
 Planned (ID: 20)      9060392597                      0.0  
 Planning (ID: 10)   108484297031                      0.0  
 Processed (ID: 70)  133289880880                      0.0  
 Revoked (ID: 50)      2411903072                      0.0  
 Writing (ID: 50)    146408550024                      0.0  
 Written (ID: 60)    139458227923                1018230.0

I'm bound to using pandas because the dataset is huge, I have a different implementation but they take me days to run. Is there a way to do this right in pandas?

I've never asked something before on stackoverflow. I hope my question is clear.

Replacing string in pandas python only if it matches the exact string

Answers (1)

Related Questions