Reputation: 4158
I have a pandas dataframe with three columns:
Name   Name2   DateTime
               2016-06-10 05:22
               2016-06-10 05:23
               2016-06-10 14:25
Guest  Guest1  2016-06-10 15:32
I have to replace the empty cells with NaN, so Name and Name2 in rows 1, 2, 3 and 4 should be NaN. I tried the statement below:
df3['Name'] = df3['Name'].replace(r'[^\s+]',np.nan, regex=True)
But since I have whitespace after "Guest " in Name, all 5 rows get replaced with NaN.
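For reference, `[^\s+]` is a negated character class: it matches any single character that is neither whitespace nor `+`, so it finds a match in every cell that contains ordinary text. A pattern anchored at both ends, such as `^\s*$`, matches only cells that are empty or entirely whitespace. The sketch below uses a hypothetical sample frame (the values are assumptions, not the asker's real data) to show the idea:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blank or whitespace-only cells plus one padded name.
df3 = pd.DataFrame({
    "Name": ["", " ", "   ", "Guest "],
    "Name2": ["", "", " ", "Guest1"],
})

# ^\s*$ matches only cells that are empty or consist solely of whitespace,
# so "Guest " (text followed by a space) is left untouched.
cleaned = df3.replace(r"^\s*$", np.nan, regex=True)
print(cleaned)
```

The padded value `"Guest "` keeps its trailing space here; stripping it is a separate step.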
Edit:
This is our actual data.
Name Name2 DateTime
\t\t-\r\n\t \t\t-\r\n\t 2016-06-10 05:22
\t\t-\r\n\t \t\t-\r\n\t 2016-06-10 05:23
\t\t-\r\n\t \t\t-\r\n\t 2016-06-10 14:25
\t\tGuest\r\n\t \t\tGuest1\r\n\t 2016-06-10 15:32
I used this to remove those escape characters.
df['Name'] = df['Name'].str.replace('\r','').str.replace('\t','').str.replace('\n','').str.replace('-','')
After removing those characters, I am not sure what gets inserted in that place now.
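A quick way to check what is left is to apply the same chain of replacements to a small Series and look at the result. In this sketch, which assumes the cells look exactly like the sample rows above, nothing is inserted: the placeholder cells end up as empty strings, which can then be swapped for NaN.

```python
import numpy as np
import pandas as pd

# Two cells copied from the sample data: a "-" placeholder and a real name.
s = pd.Series(["\t\t-\r\n\t", "\t\tGuest\r\n\t"])

# Same chain of literal replacements as in the question.
after = (
    s.str.replace("\r", "")
     .str.replace("\t", "")
     .str.replace("\n", "")
     .str.replace("-", "")
)
print(after.map(repr).tolist())  # repr() makes empty strings visible

# An exact (non-regex) match on the empty string turns those cells into NaN.
result = after.replace("", np.nan)
print(result)
```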
Upvotes: 1
Views: 1649
Reputation: 863216
Another solution is to find the length of each value and then use boolean indexing to replace every value of length 0 or 1 with NaN:
print (df.applymap(len))
Name Name2 DateTime
0 0 0 16
1 0 0 16
2 0 0 16
3 5 6 16
df[df.applymap(len) < 2] = np.nan
print (df)
Name Name2 DateTime
0 NaN NaN 2016-06-10 05:22
1 NaN NaN 2016-06-10 05:23
2 NaN NaN 2016-06-10 14:25
3 Guest Guest1 2016-06-10 15:32
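Recent pandas releases deprecate `DataFrame.applymap` in favor of `DataFrame.map`. A variant of the same length-based idea that avoids both names is sketched below; it assumes every column holds strings, as in the sample, and uses per-column `.str.len()` with `DataFrame.mask`:

```python
import pandas as pd

# Sample frame as it looks after the stray characters have been removed.
df = pd.DataFrame({
    "Name": ["", "", "", "Guest"],
    "Name2": ["", "", "", "Guest1"],
    "DateTime": ["2016-06-10 05:22", "2016-06-10 05:23",
                 "2016-06-10 14:25", "2016-06-10 15:32"],
})

# Length of every string, computed column by column.
lengths = df.apply(lambda col: col.str.len())

# mask() replaces values with NaN wherever the condition is True.
result = df.mask(lengths < 2)
print(result)
```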
Upvotes: 1
Reputation: 882
Since you don't need the spaces, this should work.
df3['Name'] = df3['Name'].replace(r'\s+', '', regex=True) # Removes all whitespace characters
df3['Name'] = df3['Name'].replace('', np.nan) # Replaces completely empty cells with NaN (exact match, no regex needed)
Upvotes: 1
Reputation: 1745
If you're sure there won't be spaces between the names, you could use .str.strip()
to remove any whitespace (or anything, really) off the ends of each string, then replace the cells that end up empty.
df3['Name'] = df3['Name'].str.strip().replace('', np.nan)
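To see how stripping only the ends differs from removing every whitespace character, here is a small comparison on a hypothetical Series (the value "Guest User" is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical values: a padded two-word name, a whitespace-only cell, an empty cell.
names = pd.Series(["  Guest User \t", "   ", ""])

# Removing all whitespace also deletes the space inside the name.
no_ws = names.replace(r"\s+", "", regex=True).replace("", np.nan)

# Stripping touches only the ends, so the inner space survives.
stripped = names.str.strip().replace("", np.nan)

print(no_ws.tolist())
print(stripped.tolist())
```

Both variants turn the blank cells into NaN; they differ only in how they treat whitespace inside a value.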
Upvotes: 0