Zanam

Reputation: 4807

Creating array of strings

I have an array of flags marking which type each row belongs to:

Data Type1 Type2 Type3
12   1     0     0
14   0     1     0
3    0     1     0
45   0     0     1

I want to create the following array:

Data TypeName
12   Type1   
14   Type2   
3    Type2   
45   Type3   

I tried creating an empty array of strings and filling it in:

import numpy as np
z1 = np.empty(4, np.string_)
z1[np.where(Type1=1)] = 'Type1'

But this doesn't give me the desired result.

Edit: I can use a pandas DataFrame, and each row has exactly one type: Type1, Type2, or Type3.

Edit2: Data, Type1, Type2, and Type3 are column names, as in a pandas DataFrame, but I was using a NumPy array with those names implied, as shown in the example above.

Upvotes: 2

Views: 210

Answers (3)

MaxU - stand with Ukraine

Reputation: 210972

UPDATE: here is a mix of @Divakar's brilliant idea to use the DataFrame.idxmax(1) method with set_index() and reset_index(), which gets rid of pd.concat():

In [142]: df.set_index('Data').idxmax(1).reset_index(name='TypeName')
Out[142]:
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3
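
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the frame is built by hand from the sample data in the question; the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample frame copied from the question; each row has exactly one flag set.
df = pd.DataFrame({
    "Data":  [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

# With 'Data' moved into the index, only the flag columns remain.
# idxmax(axis=1) then returns, per row, the label of the first column
# holding the row maximum, which is the single 1 in this data.
result = df.set_index("Data").idxmax(axis=1).reset_index(name="TypeName")
print(result)
```

Passing `name='TypeName'` to `reset_index()` is what names the new column, so no separate rename step is needed.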

OLD answer:

You can do it this way (Pandas solution):

In [132]: df.set_index('Data') \
            .stack() \
            .reset_index(name='val') \
            .query("val == 1") \
            .drop('val', axis=1)
Out[132]:
    Data level_1
0     12   Type1
4     14   Type2
7      3   Type2
11    45   Type3

Upvotes: 2

Divakar

Reputation: 221704

Here's an approach that exploits the fact that each row has exactly one 1 in the columns from Type1 onward: idxmax() along axis 1 returns the label of the column holding that 1 -

pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)

Sample run -

In [42]: df
Out[42]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [43]: pd.concat((df.Data, df.iloc[:,1:].idxmax(1)),axis=1)
Out[43]: 
   Data      0
0    12  Type1
1    14  Type2
2     3  Type2
3    45  Type3
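
Since the question started from plain NumPy arrays, the same idea can be sketched without pandas by using `argmax` in place of `idxmax`. The array names below (`data`, `flags`, `type_names`) are assumptions for illustration and do not come from the question's code.

```python
import numpy as np

# Hypothetical arrays mirroring the question's table.
data = np.array([12, 14, 3, 45])
flags = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
type_names = np.array(["Type1", "Type2", "Type3"])

# argmax(axis=1) gives the column position of the first maximum in each row,
# i.e. the position of the single 1; indexing type_names with those
# positions maps each row to its type name.
type_name_per_row = type_names[flags.argmax(axis=1)]

for value, name in zip(data, type_name_per_row):
    print(value, name)
```

Note that both `argmax` and `idxmax` return the first column for a row of all zeros, so a check such as `(flags.sum(axis=1) == 1).all()` is worth running first if the one-flag-per-row guarantee is not certain.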

Upvotes: 2

fuglede

Reputation: 18221

One way to do this would be through

df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

For example:

In [6]: df
Out[6]: 
   Data  Type1  Type2  Type3
0    12      1      0      0
1    14      0      1      0
2     3      0      1      0
3    45      0      0      1

In [7]: df['TypeName'] = df.apply(lambda row: 'Type1' if row.Type1 else 'Type2' if row.Type2 else 'Type3', axis=1)

In [9]: df.drop(['Type1', 'Type2', 'Type3'], axis=1)
Out[9]: 
   Data TypeName
0    12    Type1
1    14    Type2
2     3    Type2
3    45    Type3
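
If there are more than a handful of type columns, the chained conditional gets unwieldy. A possible generalization, sketched under the assumption that the flag columns are listed in a hypothetical `type_cols` variable, picks the first column whose flag is set:

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "Data":  [12, 14, 3, 45],
    "Type1": [1, 0, 0, 0],
    "Type2": [0, 1, 1, 0],
    "Type3": [0, 0, 0, 1],
})

# Assumption: these are the flag columns to inspect.
type_cols = ["Type1", "Type2", "Type3"]

def first_set_flag(row):
    # Return the name of the first flag column with a truthy value,
    # or None when no flag is set in this row.
    return next((name for name, flag in row.items() if flag), None)

# Keep only 'Data' and attach the derived type name.
out = df[["Data"]].assign(TypeName=df[type_cols].apply(first_set_flag, axis=1))
print(out)
```

Like the original lambda, this runs a Python function once per row, so it will be slower than the idxmax-based answers on large frames.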

Upvotes: 1
