I have a dataset where I need to convert some columns into single categorical variables & concatenating multi true values

Question

I have a dataset with unique ID with columns which have several bool values for each ID. Therefore, I need to convert these columns into a single categorical variables concatenating multiple true bool values. And if out of mentioned bool ID have 3 true values we assign to categories as "Win"

ID	BoolCol_1	BoolCol_2	BoolCol_3	BoolCol_4	Other Col 1	Other Col 2
1	1	2	2	1	x	Y
2	2	1	1	1	A	b

1 -> True 2 -> False

ID are unique.

I am not able to think in my head how to solve this puzzle

RomanPerekhrest · Accepted Answer

Use the following approach:

bool_cols = ['BoolCol_1', 'BoolCol_2', 'BoolCol_3', 'BoolCol_4']
cnts = df[bool_cols].stack().groupby(level=0).value_counts().unstack()[1]
df['cat_col'] = pd.Series(np.where(cnts >= 3, 'W', 'L'), dtype='category')

Now cat_col is categorical column with fixed values W (win), L (lose)

In [229]: df
Out[229]: 
   ID  BoolCol_1  BoolCol_2  BoolCol_3  BoolCol_4 Other Col 1 Other Col 2 cat_col
0   1          1          2          2          1          x            Y       L
1   2          2          1          1          1          A            b       W

I have a dataset where I need to convert some columns into single categorical variables & concatenating multi true values

Answers (2)

Related Questions

I have a dataset where I need to convert some columns into single categorical variables &amp; concatenating multi true values

Answers (2)

Related Questions

I have a dataset where I need to convert some columns into single categorical variables & concatenating multi true values