Questieme

Reputation: 993

How to update column value based on multiple conditions in Python?

I have the following DataFrame of product names:

                                       Name  
0                            1 Enkelt (35%) 
1                          1 Klasses Bitter  
2               1 minute Urban Protect Mask  
3                       10 Years Tawny Port 
4                             100% Frugtbar  
5                       100% Klementinjuice  
6                            100% Kokosvand
7                    1000 kernerugbrød øko. 

Take the last product, 1000 kernerugbrød øko., as an example. I want to remove the øko. from the end of the name and, following the Danish rules for singular and plural, prepend either "Økologisk" (singular) or "Økologiske" (plural). Here, because kernerugbrød does not end with the letter r, the prefix should be Økologisk.

So basically the idea is like this:

A row has the value 1000 kernerugbrød øko. in the Name column -> I remove the øko., which leaves 1000 kernerugbrød -> I check whether the last letter is r -> I prepend Økologiske if it is, otherwise Økologisk -> the final string should be: Økologisk 1000 kernerugbrød.

My attempt was the following:

text = "Økologisk "
text2 = "Økologiske "

df['test'] = df['Name'].str.contains(",?\søko.") #creating a new column containing 
                                 #booleans to check which Name contains "oko."

df['Name'] = df['Name'].str.replace(r',?\søko.', "") #replacing "oko." with empty string

if df['test']: #if the Name contained "oko."
    if df['Name'].str.contains("r(\s)?$"): #checking for plural
        df['Name'] = text2 + df['Name']
    else:
        df['Name'] = text + df['Name']

However, I get this error at the line if df['test']:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried the suggestions from the error message, but none of them helps with this task. How can I fix my code, or how else should it be written to solve this problem correctly?

Upvotes: 2

Views: 874

Answers (1)

jezrael

Reputation: 862641

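The error occurs because if needs a single True/False value, but df['test'] is a whole Series of booleans. Instead of an if statement, I think you can build boolean masks and apply them element-wise with a nested numpy.where: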

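import numpy as np

pattern = r',?\søko\.'  # raw string; the dot is escaped so it matches a literal "."

# boolean mask: True where Name contains "øko."
m1 = df['Name'].str.contains(pattern)

# remove "øko." (regex=True must be passed explicitly in pandas 2.0+)
df['Name'] = df['Name'].str.replace(pattern, '', regex=True)

# boolean mask: True where the remaining name ends in "r" (plural)
m2 = df['Name'].str.contains(r'r\s*$')

# rows without "øko." stay unchanged; the others get the matching prefix
# (text and text2 are the prefix strings defined in the question)
df['Name'] = np.where(~m1, df['Name'],
             np.where(m2, text2, text) + df['Name'])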
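
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same mask-based idea. It is a variation on the code above, not the exact code: it uses Series.where for the final selection, the names has_oko, stripped, is_plural and prefix are made up for illustration, and the row "Gulerødder øko." is a hypothetical example added to exercise the plural branch. It also strips trailing whitespace from every name, which is an assumption about what is wanted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": [
    "1 Enkelt (35%)",
    "100% Frugtbar",
    "1000 kernerugbrød øko.",
    "Gulerødder øko.",  # hypothetical row, not from the question
]})

# optional comma, whitespace, then a literal "øko." at the end of the name
pattern = r",?\søko\.\s*$"

# element-wise masks instead of a scalar `if`
has_oko = df["Name"].str.contains(pattern, regex=True)
stripped = df["Name"].str.replace(pattern, "", regex=True).str.strip()
is_plural = stripped.str.contains(r"r$", regex=True)

# pick the prefix per row, aligned on the DataFrame index
prefix = pd.Series(
    np.where(is_plural, "Økologiske ", "Økologisk "),
    index=df.index,
)

# keep the prefixed name only where "øko." was found, else the plain name
df["Name"] = (prefix + stripped).where(has_oko, stripped)

print(df)
```

The point is that every condition is evaluated for all rows at once, so no single truth value of a Series is ever needed.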

Upvotes: 2
