sayan_sen
sayan_sen

Reputation: 345

how to filter out rows in pandas which are just numbers and not fully numeric?

I have a Pandas dataframe column which has data in rows such as below:

col1     
abc
ab23
2345
fgh67@
8980

I need to create 2 more columns col 2 and col 3 as such below:

 col2    col3
 abc      2345
 ab23      8980
 fgh67@   

I have used str.isnumeric(), but thats not helping me in a dataframe column. can someone kindly help?

Upvotes: 5

Views: 11263

Answers (1)

jezrael
jezrael

Reputation: 862741

Use str.isnumeric or to_numeric with check non NaNs for boolean mask and filter by boolean indexing:

m = df['col1'].str.isnumeric()
#alternative
#m = pd.to_numeric(df['col1'], errors='coerce').notnull()
df = pd.concat([df.loc[~m, 'col1'].reset_index(drop=True), 
                df.loc[m, 'col1'].reset_index(drop=True)], axis=1, keys=('col2','col3'))
print (df)
     col2  col3
0     abc  2345
1    ab23  8980
2  fgh67@   NaN

If want add new columns to existed DataFrame with align by indices:

df['col2'] = df.loc[~m, 'col1']
df['col3'] = df.loc[m, 'col1']
print (df)
     col1    col2  col3
0     abc     abc   NaN
1    ab23    ab23   NaN
2    2345     NaN  2345
3  fgh67@  fgh67@   NaN
4    8980     NaN  8980

Or without align:

df['col2'] = df.loc[~m, 'col1'].reset_index(drop=True)
df['col3'] = df.loc[m, 'col1'].reset_index(drop=True)
print (df)
     col1    col2  col3
0     abc     abc  2345
1    ab23    ab23  8980
2    2345  fgh67@   NaN
3  fgh67@     NaN   NaN
4    8980     NaN   NaN

Upvotes: 4

Related Questions