Reputation: 113
I have a dataframe filled with several columns. I need to change the values of a column for data normalization like in the following example:
User_id
751730951
751730951
0
163526844
...and so on
I need to replace every value in the column that is not '0' (a string) with something like "is not empty". I have been trying for hours but still cannot change every value that is not 0 into something else. The replace() function does not work well for that. Any good ideas?
EDIT (my solution):
finalResult.loc[finalResult['update_user'] == '0', 'update_user'] = 'empty'
finalResult.loc[finalResult['update_user'] != 'empty', 'update_user'] = 'not empty'
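For anyone who wants to try this out, here is a minimal runnable sketch of the two-step approach. It assumes the sample IDs from the question are stored as strings; the numpy.where line is a single-pass alternative that is not part of the original edit.

```python
import numpy as np
import pandas as pd

# Sample data from the question, stored as strings (an assumption).
finalResult = pd.DataFrame(
    {"update_user": ["751730951", "751730951", "0", "163526844"]}
)

# Single-pass alternative (not from the original edit), computed
# before the column is modified below.
one_pass = np.where(finalResult["update_user"] == "0", "empty", "not empty")

# Two-step version: label the '0' rows first, then everything else.
# The order matters, because the second mask relies on 'empty'
# already being present in the column.
finalResult.loc[finalResult["update_user"] == "0", "update_user"] = "empty"
finalResult.loc[finalResult["update_user"] != "empty", "update_user"] = "not empty"

print(finalResult["update_user"].tolist())
```

Both versions should end up with the same labels; the two-step form would misbehave only if a real ID happened to be the literal string 'empty'.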
Upvotes: 2
Views: 18959
Reputation: 812
Suppose the data from the question is in a Series named user_id. Then a single line does what you need:
user_id.where(user_id == 0).fillna('is not empty')
I am not very fond of loc, since I think it makes the code harder to read.
This can be preferable to replace because it also handles the opposite case:
user_id.where(user_id != 0).fillna('is empty')
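A small sketch of the where idea, assuming the IDs are strings as stated in the question (so the comparison uses '0' rather than 0). Series.where also accepts the replacement value directly as its second argument, and Series.mask is the inverse, so the fillna step is optional.

```python
import pandas as pd

# Sample data from the question, stored as strings (an assumption).
user_id = pd.Series(["751730951", "751730951", "0", "163526844"], name="User_id")

# where keeps values where the condition is True and blanks the rest
# to NaN, which fillna then replaces.
via_fillna = user_id.where(user_id == "0").fillna("is not empty")

# Same result without the NaN round trip: pass the replacement directly.
via_other = user_id.where(user_id == "0", "is not empty")

# mask is the inverse of where: it replaces values where the condition is True.
via_mask = user_id.mask(user_id != "0", "is not empty")

print(via_other.tolist())
```

Passing the replacement directly also avoids a side effect of the fillna route on integer data, where the intermediate NaN values force the kept zeros to become floats.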
Upvotes: 1
Reputation: 3777
df.loc[df['mycolumn'] != '0', 'mycolumn'] = 'not empty'
or if the value is an int,
df.loc[df['mycolumn'] != 0, 'mycolumn'] = 'not empty'
df.loc[rows, cols] allows you to get or set a range of values in your DataFrame. The first parameter is rows; here I'm using a boolean mask to get all rows that don't have a 0 in mycolumn. The second parameter is the column you want to get/set. Since I'm replacing the same column I queried from, it is also mycolumn.
I then simply use the assignment operator to assign the value 'not empty', as you wanted.
If you want a new column to contain the 'not empty' so you're not contaminating your original data in mycolumn, you can do:
df.loc[df['mycolumn'] != 0, 'myNewColumnsName'] = 'not empty'
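One thing worth knowing about the new-column variant: rows that do not match the mask are left as NaN in the new column. A minimal sketch, assuming integer IDs taken from the question; the follow-up fillna with the label 'empty' is an illustrative addition, not part of the answer.

```python
import pandas as pd

# Sample data from the question, stored as integers (an assumption).
df = pd.DataFrame({"mycolumn": [751730951, 751730951, 0, 163526844]})

# Assigning through loc to a column that does not exist yet creates it;
# only the rows selected by the mask receive the value.
df.loc[df["mycolumn"] != 0, "myNewColumnsName"] = "not empty"

# The unmatched rows are NaN at this point.
missing = df["myNewColumnsName"].isna().tolist()

# Optional: give the remaining rows a label too ('empty' is hypothetical).
df["myNewColumnsName"] = df["myNewColumnsName"].fillna("empty")

print(df)
```

The original mycolumn values stay untouched, which is the point of writing to a separate column.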
Upvotes: 5
Reputation: 862481
The simplest way is to use:
df['User_id'] = df['User_id'].replace('0', 'is not empty')
If 0 is an int:
df['User_id'] = df['User_id'].replace(0, 'is not empty')
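Note that replace only touches cells equal to the value passed in, so it relabels the 0 entries and leaves the IDs alone, and the type of the value has to match the column. A minimal sketch, assuming the sample IDs from the question and a hypothetical label 'empty' for the 0 rows:

```python
import pandas as pd

# The same sample IDs, once as strings and once as integers (assumptions).
as_str = pd.Series(["751730951", "0", "163526844"], name="User_id")
as_int = pd.Series([751730951, 0, 163526844], name="User_id")

# String column: '0' matches, only that cell is relabelled.
str_hit = as_str.replace("0", "empty").tolist()

# Integer column: the string '0' matches nothing, the data is unchanged.
int_miss = as_int.replace("0", "empty").tolist()

# Integer column: the integer 0 matches.
int_hit = as_int.replace(0, "empty").tolist()

print(str_hit, int_miss, int_hit)
```

If a replace call appears to do nothing, checking df['User_id'].dtype is a quick way to see which of the two variants applies.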
Upvotes: 4