Reputation: 27
I am trying to replace missing values in a dataset with reasonable data.
The 'Age' column contains NaN values, which I want to replace under these conditions:
If the person's name contains the string "Mrs" and
the Age value is NaN:
replace the NaN value with 40
I am using this code:
c = dftrain[dftrain['Age'].isnull()]
a = c["Name"].str.contains("Mrs.")
c = all rows with NaN in 'Age' (a DataFrame)
a (boolean) = for each row of c, whether 'Name' contains "Mrs."
Please help me :) !!!
Upvotes: 1
Views: 121
Reputation: 11
Use pandas as Hietsh suggested above.
I would change only the condition format as specified below:
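import pandas as pds
# Load the spreadsheet and keep only the columns of interest
data = pds.read_excel('as1.xlsx')
df = pds.DataFrame(data, columns=['Product', 'Title', 'Name', 'Age'])
# Set Age to 40 where it is missing and the title is exactly 'Mrs.'
df.loc[df['Age'].isnull() & (df['Title'] == 'Mrs.'), 'Age'] = 40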
As a good reference, I suggest the pandas website.
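For anyone who wants to try the condition without an Excel file, here is a minimal sketch using a small in-memory DataFrame. The sample rows are made up for illustration, and it assumes a separate 'Title' column exists, as in the snippet above.

```python
import numpy as np
import pandas as pd

# Made-up sample rows standing in for the Excel file (assumption).
df = pd.DataFrame({
    "Title": ["Mrs.", "Mr.", "Mrs.", "Miss."],
    "Age": [21.0, np.nan, np.nan, np.nan],
})

# Rows that are both missing an age and titled exactly "Mrs."
mask = df["Age"].isnull() & (df["Title"] == "Mrs.")

# Assign through .loc so the change is written to df itself, not to a copy.
df.loc[mask, "Age"] = 40

print(df)
```

Only the third row changes here: the first already has an age, and the other two have a different title.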
Upvotes: 0
Reputation: 1329
Hope the lines below work for you...
   Name      Age
0  Mrs XYZ   21
1  Mr Devid  NaN
2  Mrs OPQ   NaN
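# I loaded the data from Excel; load it however suits you
import pandas
df = pandas.read_excel('test.xlsx')
# Plain 'Mrs' is used because str.contains treats the pattern as a regex,
# where the '.' in 'Mrs.' would match any character
df.loc[df['Name'].str.contains('Mrs') & df['Age'].isnull(), 'Age'] = 40
print(df)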
# Output Frame -
       Name   Age
0   Mrs XYZ  21.0
1  Mr Devid   NaN
2   Mrs OPQ  40.0
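One thing to watch with a pattern such as 'Mrs.': str.contains interprets it as a regular expression by default, so the dot matches any single character, and rows with a missing name produce a missing result unless na is set. The sketch below compares that loose match with a word-boundary pattern. The sample names (including the hypothetical "Mrsic ABC") are made up for illustration, and whether the stricter match is what you want depends on your data.

```python
import numpy as np
import pandas as pd

# Made-up sample rows: the third name only resembles "Mrs", the last is missing.
df = pd.DataFrame({
    "Name": ["Mrs XYZ", "Mr Devid", "Mrsic ABC", "Mrs. OPQ", None],
    "Age": [21.0, np.nan, np.nan, np.nan, np.nan],
})

# As a regex, "Mrs." means "Mrs" followed by any one character.
# na=False makes rows with a missing name count as "no match".
loose = df["Name"].str.contains("Mrs.", na=False)

# \b word boundaries require "Mrs" to stand alone as a word.
strict = df["Name"].str.contains(r"\bMrs\b", na=False)

# Fill only rows that match strictly and have no age yet.
df.loc[strict & df["Age"].isnull(), "Age"] = 40

print(df)
```

With these rows, the loose pattern also matches "Mrsic ABC", while the word-boundary pattern matches only "Mrs XYZ" and "Mrs. OPQ".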
Upvotes: 1