Reputation: 33
Hi, I'm pretty new to coding and I have this question:
I want to iterate through a column and change its values based on a condition. In this case, I want to change every value in column 'a1': if the value contains the word 'Juancito', I want to change it to just 'Juancito'. The for loop runs without errors, but the values are unchanged at the end.
What am I doing wrong?
import pandas as pd
inp = [{'a1':'Juancito 1'}, {'a1':'Juancito 2'}, {'a1':'Juancito 3'}]
df = pd.DataFrame(inp)
for i in df['a1']:
    if 'Juancito' in i:
        i = 'Juancito'
    else:
        pass
df.head()
Upvotes: 3
Views: 226
Reputation: 34056
You don't need a for loop. Just use numpy.where with Series.str.contains:
In [83]: import numpy as np
In [84]: df['a1'] = np.where(df['a1'].str.contains('Juancito'), 'Juancito', df['a1'])
In [85]: df
Out[85]:
a1
0 Juancito
1 Juancito
2 Juancito
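For comparison, here is a minimal sketch of the same vectorized idea written with a boolean mask and `.loc` instead of `numpy.where`. The 'Pedro 2' and missing-value rows are hypothetical additions (not in the question) included only to show that non-matching rows are left alone; `na=False` is an assumption about how missing values should be treated.

```python
import pandas as pd

# Hypothetical sample data: two extra rows beyond the question's input
df = pd.DataFrame({'a1': ['Juancito 1', 'Pedro 2', None, 'Juancito 3']})

# Boolean mask: True where the cell contains 'Juancito'.
# na=False counts missing values as "no match" instead of propagating NaN.
mask = df['a1'].str.contains('Juancito', na=False)

# Overwrite only the matching rows, in a single assignment on df itself
df.loc[mask, 'a1'] = 'Juancito'
```

Either form avoids the Python-level loop; the mask version may read more naturally when only some rows should change.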
Upvotes: 3
Reputation: 9047
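As mentioned in the comments, I have incorporated the changes:
import pandas as pd
inp = [{'a1':'Juancito 1'}, {'a1':'Juancito 2'}, {'a1':'Juancito 3'}]
df = pd.DataFrame(inp)
for index, i in enumerate(df['a1']):
    # Use `in` (substring test); `==` would never match 'Juancito 1'
    if 'Juancito' in i:
        # Write back with a single .loc call on df to avoid chained assignment
        df.loc[index, 'a1'] = 'Juancito'
    else:
        pass
df.head()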
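Note that the loop above relies on the positions from enumerate matching the index labels, which holds for the default 0, 1, 2, ... index. A hedged sketch for a frame with other labels, using `Series.items()` to get each label directly (the 'x', 'y', 'z' labels are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Same values as the question, but with a non-default (hypothetical) index
df = pd.DataFrame({'a1': ['Juancito 1', 'Juancito 2', 'Juancito 3']},
                  index=['x', 'y', 'z'])

# items() yields (index label, value) pairs, so .loc receives a real label
for label, value in df['a1'].items():
    if 'Juancito' in value:
        df.loc[label, 'a1'] = 'Juancito'
```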
Upvotes: 2
Reputation: 445
Your code never writes the new value back to the DataFrame. The assignment i = 'Juancito' only rebinds the loop variable i, which is overwritten on the next iteration. One approach is positional indexing with iloc, where the row position changes with i and the column position is that of the column you want to process.
Example:
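import pandas as pd
inp = [{'a1':'Juancito 1'}, {'a1':'Juancito 2'}, {'a1':'Juancito 3'}]
df = pd.DataFrame(inp)
for i in range(len(df)):
    # Index row and column in one iloc call; df.iloc[i][0] = ... would
    # assign to a temporary copy of the row and may leave df unchanged
    if 'Juancito' in df.iloc[i, 0]:
        df.iloc[i, 0] = 'Juancito'
    else:
        pass
df.head()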
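To make the difference concrete, here is a small sketch that runs both loops on the question's data and records the column after each one. The names `unchanged`, `changed`, and `col` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a1': ['Juancito 1', 'Juancito 2', 'Juancito 3']})

# Rebinding the loop variable: `i` is only a local name for each value,
# so assigning to it never touches df
for i in df['a1']:
    if 'Juancito' in i:
        i = 'Juancito'
unchanged = df['a1'].tolist()

# Writing through the indexer: one .iloc[row, col] call targets df itself
col = df.columns.get_loc('a1')
for row in range(len(df)):
    if 'Juancito' in df.iloc[row, col]:
        df.iloc[row, col] = 'Juancito'
changed = df['a1'].tolist()
```

After the first loop the column still holds the original strings; only the second loop modifies the frame.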
Upvotes: 2
Reputation: 2804
Try this:
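import pandas as pd
inp = [{'a1':'Juancito 1'}, {'a1':'Juancito 2'}, {'a1':'Juancito 3'}]
df = pd.DataFrame(inp)
for i in range(df["a1"].shape[0]):
    print(i)  # show the current row label
    if 'Juancito' in df.loc[i, "a1"]:
        # A single .loc call avoids the chained assignment df["a1"][i] = ...,
        # which raises a warning and may not update df in recent pandas
        df.loc[i, "a1"] = 'Juancito'
    else:
        pass
df.head()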
Upvotes: 2