I have an array of flags for various types as:
Data Type1 Type2 Type3
12 1 0 0
14 0 1 0
3 0 1 0
45 0 0 1
I want to create the following array:
Data TypeName
12 Type1
14 Type2
3 Type2
45 Type3
I tried creating an empty array of type strings as:
import numpy as np
z1 = np.empty(4, np.string_)
z1[np.where(Type1=1)] = 'Type1'
But this doesn't give me the desired results.
Edit: I can use a pandas DataFrame, and each row has exactly one type set: Type1, Type2, or Type3.
Edit 2: Data, Type1, Type2, and Type3 are column names, as in a pandas DataFrame, but I was using a NumPy array with the names implicit, as shown in the example above.
Upvotes: 2
UPDATE: here is a mix of @Divakar's brilliant idea of using the DataFrame.idxmax(1) method with set_index() and reset_index(), in order to get rid of pd.concat():
In [142]: df.set_index('Data').idxmax(1).reset_index(name='TypeName')
Out[142]:
Data TypeName
0 12 Type1
1 14 Type2
2 3 Type2
3 45 Type3
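For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The frame is rebuilt from the sample in the question, and the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question; exactly one flag is set per row.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

# Move Data into the index so idxmax only sees the flag columns,
# then restore it as a column and name the label column TypeName.
result = df.set_index("Data").idxmax(axis=1).reset_index(name="TypeName")
print(result)
```

Note that idxmax returns the first column holding the row maximum, so a row with no flag set at all would be labelled Type1 rather than flagged as an error.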
OLD answer:
You can do it this way (Pandas solution):
In [132]: df.set_index('Data') \
            .stack() \
            .reset_index(name='val') \
            .query("val == 1") \
            .drop(columns='val')
Out[132]:
Data level_1
0 12 Type1
4 14 Type2
7 3 Type2
11 45 Type3
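The output above keeps the auto-generated column name level_1 and the original stacked row labels (0, 4, 7, 11). If that matters, a possible variation is sketched below; it names the stacked level up front and renumbers the rows. The names `stacked` and `result` are my own, and the frame is rebuilt from the question's sample.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

# Stack the flag columns into long form: one row per (Data, flag column) pair.
stacked = (
    df.set_index("Data")
      .stack()
      .rename_axis(["Data", "TypeName"])  # name the stacked level instead of level_1
      .reset_index(name="val")
)

# Keep only the pairs whose flag is set, drop the helper column, renumber rows.
result = stacked.loc[stacked["val"] == 1, ["Data", "TypeName"]].reset_index(drop=True)
print(result)
```

Unlike idxmax, this form would return several rows for a Data value that had more than one flag set, which may or may not be what you want.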
Upvotes: 2
Here's an approach that exploits the fact that there is exactly one 1 per row in the columns starting from Type1, using idxmax() to get its only occurrence per row -
pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)
Sample run -
In [42]: df
Out[42]:
Data Type1 Type2 Type3
0 12 1 0 0
1 14 0 1 0
2 3 0 1 0
3 45 0 0 1
In [43]: pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)
Out[43]:
Data 0
0 12 Type1
1 14 Type2
2 3 Type2
3 45 Type3
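Since the question started out with plain NumPy arrays, the same "position of the single 1 per row" idea can be sketched with argmax. The array names `data`, `flags`, `names` and `type_names` are assumptions made for this illustration; the values come from the question's sample.

```python
import numpy as np

# Values taken from the question; the array layout is assumed.
data = np.array([12, 14, 3, 45])
flags = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
names = np.array(["Type1", "Type2", "Type3"])

# argmax along axis 1 gives the column position of the 1 in each row;
# indexing names with those positions turns them into labels.
type_names = names[flags.argmax(axis=1)]

for value, label in zip(data, type_names):
    print(value, label)
```

As with idxmax, a row of all zeros would come out as the first name, so this relies on the one-flag-per-row guarantee from the question.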
Upvotes: 2
One way to do this would be through
df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)
For example:
In [6]: df
Out[6]:
Data Type1 Type2 Type3
0 12 1 0 0
1 14 0 1 0
2 3 0 1 0
3 45 0 0 1
In [7]: df['TypeName'] = df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)
In [9]: df.drop(['Type1', 'Type2', 'Type3'], axis=1)
Out[9]:
Data TypeName
0 12 Type1
1 14 Type2
2 3 Type2
3 45 Type3
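One thing to keep in mind with the lambda above is that any row with neither Type1 nor Type2 set falls through to 'Type3', even if Type3 is 0 as well. A possible vectorised variant using np.select is sketched below; the `type_cols` list and the 'unknown' fallback label are assumptions of mine, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "Data": [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

type_cols = ["Type1", "Type2", "Type3"]  # hypothetical helper list

# One boolean condition per flag column; np.select picks the label of the
# first condition that holds and uses the default when none does.
conditions = [df[col].to_numpy(dtype=bool) for col in type_cols]
df["TypeName"] = np.select(conditions, type_cols, default="unknown")

result = df.drop(columns=type_cols)
print(result)
```

This avoids the row-by-row apply call and makes rows without any flag visible instead of silently labelling them Type3.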
Upvotes: 1