user3447653

Reputation: 4158

Replace spaces with NaN in pandas dataframe

I have a pandas dataframe with three columns:

Name        Name2           DateTime
                            2016-06-10 05:22
                            2016-06-10 05:23
                            2016-06-10 14:25
Guest       Guest1          2016-06-10 15:32

I have to replace the empty (whitespace-only) values with NaN, so Name and Name2 of rows 1, 2, 3 and 4 should be NaN. I tried the statement below:

 df3['Name'] = df3['Name'].replace(r'[^\s+]',np.nan, regex=True)

But since I have whitespace after "Guest " in Name, all 5 rows get replaced with NaN.
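
A note on the pattern itself: `[^\s+]` is a negated character class, so it matches any single character that is neither whitespace nor a literal `+`. As far as I understand, when the replacement is not a string, pandas replaces the whole cell as soon as the pattern matches anywhere in it, so every cell containing a visible character becomes NaN. The sketch below uses only the standard `re` module; the sample strings and the alternative pattern `^\s*$` are illustrative assumptions, not part of the original attempt.

```python
import re

negated = re.compile(r'[^\s+]')      # any character that is neither whitespace nor '+'
blank_only = re.compile(r'^\s*$')    # the whole string is empty or whitespace

# Illustrative sample values (assumed, modelled on the data shown in the question)
samples = ['', '   ', 'Guest ', '\t\t-\r\n\t']

# For each sample: (does the negated class match?, is the string blank?)
results = {s: (bool(negated.search(s)), bool(blank_only.search(s))) for s in samples}

for s, (hits_negated, is_blank) in results.items():
    print(repr(s), hits_negated, is_blank)
```

The negated class hits `'Guest '` and the cell containing `-`, which is the opposite of what is wanted; the anchored pattern only hits the blank strings.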

Edit:

This is our actual data.

Name              Name2                  DateTime
\t\t-\r\n\t      \t\t-\r\n\t            2016-06-10 05:22
\t\t-\r\n\t      \t\t-\r\n\t            2016-06-10 05:23
\t\t-\r\n\t      \t\t-\r\n\t            2016-06-10 14:25
\t\tGuest\r\n\t  \t\tGuest1\r\n\t       2016-06-10 15:32

I used this to remove those escape characters.

df['Name'] = df['Name'].str.replace('\r','').str.replace('\t','').str.replace('\n','').str.replace('-','')

After removing those characters, I am not sure what gets inserted in that place now.

Upvotes: 1

Views: 1649

Answers (3)

jezrael

Reputation: 863216

Another solution: compute the length of every value, then use boolean indexing to replace all values of length 0 or 1 with NaN:

print (df.applymap(len))
   Name  Name2  DateTime
0     0      0        16
1     0      0        16
2     0      0        16
3     5      6        16

df[df.applymap(len) < 2] = np.nan
print (df)
    Name   Name2          DateTime
0    NaN     NaN  2016-06-10 05:22
1    NaN     NaN  2016-06-10 05:23
2    NaN     NaN  2016-06-10 14:25
3  Guest  Guest1  2016-06-10 15:32
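
A self-contained sketch of the same length-based idea, in case it helps to run it end to end. The frame is a made-up copy of the sample data after the escape characters have been removed. It computes the lengths with `str.len()` per column rather than `applymap`, because newer pandas releases (2.1 and later) deprecate `applymap`, and it uses `mask` as an equivalent of the boolean-index assignment; both substitutions are my assumptions about what suits current versions.

```python
import pandas as pd

# Hypothetical frame mimicking the data once the escape characters are gone
df = pd.DataFrame({
    'Name': ['', '', '', 'Guest'],
    'Name2': ['', '', '', 'Guest1'],
    'DateTime': ['2016-06-10 05:22', '2016-06-10 05:23',
                 '2016-06-10 14:25', '2016-06-10 15:32'],
})

# String length of every cell, computed column by column
lengths = df.apply(lambda col: col.str.len())

# Replace every cell shorter than 2 characters with NaN
result = df.mask(lengths < 2)
print(result)
```

Note that the threshold of 2 also wipes out legitimate one-character values, so it only fits data where such values cannot occur.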

Upvotes: 1

Yarnspinner

Reputation: 882

Since you don't need the spaces, this should work.

df3['Name'] = df3['Name'].replace(r'\s+', '', regex=True)  # Remove all whitespace characters
df3['Name'] = df3['Name'].replace('', np.nan)  # Exact match on the empty string, so no regex is needed
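
If names may legitimately contain spaces, a one-pass variant that leaves non-blank cells untouched might be preferable. The sketch below is an assumption on my part rather than the approach above: it anchors the pattern with `^` and `$` so that only cells consisting entirely of whitespace (or nothing) are replaced. The sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blank cells, whitespace-only cells, and real names
df3 = pd.DataFrame({'Name': ['', '   ', '\t\r\n', 'Guest ', 'Guest User']})

# Replace only cells that are empty or whitespace from start to end
df3['Name'] = df3['Name'].replace(r'^\s*$', np.nan, regex=True)
print(df3)
```

Here the trailing space in `'Guest '` and the inner space in `'Guest User'` survive, because those cells are not blank as a whole.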

Upvotes: 1

Michael Zhang

Reputation: 1745

If you're sure there won't be spaces between the names, you could use .str.strip() to remove any whitespace (or any other characters you pass in) from both ends of each string, and then replace the strings that end up empty.

 df3['Name'] = df3['Name'].str.strip().replace('', np.nan)
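
A minimal sketch of the strip-based idea applied to values shaped like the raw data in the question's edit. It goes through the `.str` accessor, since a Series has no `strip` method of its own. Treating a lone `-` as a missing value is my assumption, based on the asker removing `-` in their own cleanup.

```python
import numpy as np
import pandas as pd

# Hypothetical rows modelled on the raw data shown in the question
df = pd.DataFrame({'Name': ['\t\t-\r\n\t', '\t\tGuest\r\n\t']})

# Trim tabs, carriage returns, newlines and spaces from both ends
stripped = df['Name'].str.strip()

# Treat empty strings and the '-' placeholder as missing (assumption)
cleaned = stripped.replace(['', '-'], np.nan)
print(cleaned)
```

Unlike removing every `-` character, this keeps hyphens that occur inside real names.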

Upvotes: 0
