Reputation: 811
I have a pandas.DataFrame with a column that contains mixed values: integers, strings, dates, and so on.
I want to create indicator columns (containing 0 or 1) that tell, for each value in that column, whether it is a string or not, a number or not, a date or not, and so on, in an efficient way.
A
0 Hello
1 Name
2 123
3 456
4 22/03/2019
And the output should be
            A  A_string  A_number  A_date
0       Hello         1         0       0
1        Name         1         0       0
2         123         0         1       0
3         456         0         1       0
4  22/03/2019         0         0       1
Upvotes: 2
Views: 2356
Reputation: 23099
We can use the native pandas to_numeric and to_datetime to test for dates & numbers. Then we can use .loc for assignment and fillna to match your target df.
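import pandas as pd

# Values that parse as a date / number get a 1 in the matching column.
df.loc[~pd.to_datetime(df['A'], errors='coerce').isna(), 'A_Date'] = 1
df.loc[~pd.to_numeric(df['A'], errors='coerce').isna(), 'A_Number'] = 1
# Values that parse as neither are treated as strings.
df.loc[(pd.to_numeric(df['A'], errors='coerce').isna())
       & pd.to_datetime(df['A'], errors='coerce').isna(),
       'A_String'] = 1
# Rows not assigned above are still NaN; replace them with 0.
df = df.fillna(0)
print(df)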
            A  A_Date  A_Number  A_String
0       Hello     0.0       0.0       1.0
1        Name     0.0       0.0       1.0
2         123     0.0       1.0       0.0
3         456     0.0       1.0       0.0
4  22/03/2019     1.0       0.0       0.0
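The same idea can be sketched so that each parse runs only once and the flags come out as integers rather than floats. This is a minimal sketch under stated assumptions, not a definitive version: the sample frame is rebuilt by hand, the names is_number and is_date are only illustrative, and the explicit day-first format "%d/%m/%Y" is an assumption taken from the sample row 22/03/2019.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (everything stored as strings).
df = pd.DataFrame({"A": ["Hello", "Name", "123", "456", "22/03/2019"]})

# Parse once; values that cannot be converted become NaN / NaT.
is_number = pd.to_numeric(df["A"], errors="coerce").notna()
# Assumption: dates are written day-first, as in "22/03/2019".
is_date = pd.to_datetime(df["A"], format="%d/%m/%Y", errors="coerce").notna()

# Boolean masks cast straight to 0/1 integers, so no fillna step is needed.
df["A_string"] = (~is_number & ~is_date).astype(int)
df["A_number"] = is_number.astype(int)
df["A_date"] = is_date.astype(int)
print(df)
```

Passing an explicit format keeps the date test from depending on how pandas guesses the layout of each string; if your column mixes several date layouts, you would need one mask per layout or a looser parser.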
Upvotes: 2
Reputation: 28689
Using pandas str methods to check for the string type could help:
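import pandas as pd

df = pd.read_clipboard()  # reads the sample data copied from the question
df['A_string'] = df.A.str.isalpha().astype(int)
df['A_number'] = df.A.str.isdigit().astype(int)
# naive assumption: anything that is not purely alphanumeric is a date
df['A_Date'] = (~df.A.str.isalnum()).astype(int)
df.filter(['A','A_string','A_number','A_Date'])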
            A  A_string  A_number  A_Date
0       Hello         1         0       0
1        Name         1         0       0
2         123         0         1       0
3         456         0         1       0
4  22/03/2019         0         0       1
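Treating every non-alphanumeric value as a date will also flag things like hyphenated words. A slightly stricter variant, sketched below, checks the shape of the string with a regular expression instead. The extra row "foo-bar" and the column names A_Date_naive and A_Date_pattern are hypothetical, added only to show the difference, and the pattern assumes dates always look like dd/mm/yyyy.

```python
import pandas as pd

# Sample data from the question plus one hypothetical row ("foo-bar").
df = pd.DataFrame({"A": ["Hello", "Name", "123", "456", "22/03/2019", "foo-bar"]})

s = df["A"].astype(str)

# Naive rule: anything that is not purely alphanumeric counts as a date.
df["A_Date_naive"] = (~s.str.isalnum()).astype(int)

# Stricter rule (assumption: dates are written as two digits, two digits,
# four digits, separated by slashes). This checks the shape only, not
# whether the day and month are valid.
df["A_Date_pattern"] = s.str.fullmatch(r"\d{2}/\d{2}/\d{4}").astype(int)
print(df)
```

The pattern check still accepts impossible dates such as 99/99/9999, so if validity matters, pd.to_datetime with errors='coerce' is the safer test.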
Upvotes: 4