Shyryu

Reputation: 149

Problematic values

I have a dataframe with a 'Trousers' column containing many different kinds of trousers. Most values start with their type, for instance Jeans- Replay-blue, Chino- Uniqlo-~, or Smart-Next-~. Others also contain a type, but as part of a longer name (2 or 3 words). I want to loop through that column and change each value to just Jean if jeans is in the cell, Chinos if Chino is in the cell, and so on, so I can easily group them.

How can I achieve that with a for loop?

Upvotes: 1

Views: 80

Answers (1)

jezrael

Reputation: 863226

It seems you need to split on '-' and then select the first value of each list with str[0]:

df['type'] = df['Trousers'].str.split('-').str[0]

Sample:

import pandas as pd

df = pd.DataFrame({'Trousers':['Jeans- Replay-blue','Chino- Uniqlo-~','Smart-Next-~']})
print (df)
             Trousers
0  Jeans- Replay-blue
1     Chino- Uniqlo-~
2        Smart-Next-~

df['type'] = df['Trousers'].str.split('-').str[0]
print (df)
             Trousers   type
0  Jeans- Replay-blue  Jeans
1     Chino- Uniqlo-~  Chino
2        Smart-Next-~  Smart

df['Trousers'] = df['Trousers'].str.split('-').str[0]
print (df)
  Trousers
0    Jeans
1    Chino
2    Smart
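
Since the goal in the question is to group by type, here is a minimal sketch of what that could look like once the type has been extracted. The extra 'Jeans- Levis-black' row and the 'price' column are hypothetical, added only so there is something to aggregate; str.strip() is included in case a value has spaces around the type.

```python
import pandas as pd

# Sample data; the fourth row and the 'price' column are hypothetical.
df = pd.DataFrame({
    'Trousers': ['Jeans- Replay-blue', 'Chino- Uniqlo-~',
                 'Smart-Next-~', 'Jeans- Levis-black'],
    'price': [80, 40, 60, 70],
})

# Keep the text before the first '-' and drop any surrounding spaces.
df['type'] = df['Trousers'].str.split('-').str[0].str.strip()

# Number of rows per type.
counts = df['type'].value_counts()

# Any aggregation per type works the same way, e.g. the mean price.
mean_price = df.groupby('type')['price'].mean()

print(counts)
print(mean_price)
```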

Another solution with extract:

df['Trousers'] = df['Trousers'].str.extract('([a-zA-Z]+)-', expand=False)
print (df)
  Trousers
0    Jeans
1    Chino
2    Smart
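
Both solutions above assume the type comes first and is followed by '-'. For the longer names mentioned in the question, where the type may appear anywhere in the cell, a keyword lookup may be closer to what is wanted. The following is only a sketch: the labels dict and the 'Slim fit jeans' row are assumptions made for illustration, not taken from the real data.

```python
import re
import pandas as pd

# Sample data; the last row is a hypothetical "long name" without a leading type.
df = pd.DataFrame({'Trousers': ['Jeans- Replay-blue', 'Chino- Uniqlo-~',
                                'Smart-Next-~', 'Slim fit jeans']})

# Hypothetical mapping: lowercase keyword to search for -> label to store.
labels = {'jean': 'Jean', 'chino': 'Chinos', 'smart': 'Smart'}

# Build one alternation pattern such as '(jean|chino|smart)'.
pattern = '(' + '|'.join(re.escape(k) for k in labels) + ')'

# Take the first keyword found in each cell, ignoring case,
# then translate it to its label. Cells with no keyword become NaN.
found = df['Trousers'].str.extract(pattern, flags=re.IGNORECASE, expand=False)
df['type'] = found.str.lower().map(labels)

print(df)
```

No explicit for loop is needed here; the .str methods apply the pattern to every row. If a cell could contain more than one keyword, the leftmost match in the text wins, so the result depends on where each keyword appears in the cell, not on its order in the dict.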

Upvotes: 1
