Pandas - Spreading different values in a column on many columns

Question

I have the following table:

Option 1	Option 2	Option 3
A	X	1
B	X	1
A	Y	1
C	X	1
B	Y	1

I need to split the values of each option into different columns according to a translation table. For example, Option 1 values A and B go to a new column Opt1000 with values opt1000-A and opt1000-B, Option 1 value C goes to a new column Opt1001 with value opt1001-C, and so on, as shown below:

Opt1000	Opt1001	Opt2000	Opt2001	Opt3000
opt1000-A	NO-opt1001	opt2000-X	NO-opt2001	1
opt1000-B	NO-opt1001	opt2000-X	NO-opt2001	1
opt1000-A	NO-opt1001	NO-opt2000	opt2001-Y	1
NO-opt1000	opt1001-C	opt2000-X	NO-opt2001	1
opt1000-B	NO-opt1001	NO-opt2000	opt2001-Y	1

I have a translation file as follows:

Old option name	Old value	New option name	New value
Option 1	A	Opt1000	opt1000-A
Option 1	B	Opt1000	opt1000-B
Option 1	C	Opt1001	opt1001-C
Option 2	X	Opt2000	opt2000-X
Option 2	Y	Opt2001	opt2001-Y

Does anyone have an idea of how I could automate this efficiently? (My table has 20 columns and 1,000 rows.)

I was thinking of first extracting the unique new option names from the translation table (by adding each name to a set) to create a new dataframe with the new column names. I would then fill that dataframe with 0 (or NaN) and place the new values one by one by scanning each value in the original dataframe. Once all values are translated, the remaining 0 or NaN entries would be replaced by NO-<new option name>.
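For reference, the same idea can probably be written without scanning values one by one. The following is only a minimal sketch, assuming the translation table is loaded into a dataframe whose column names (old_option, old_value, new_option, new_value) are hypothetical labels chosen here, and assuming the placeholder is "NO-" followed by the lowercased new option name, as in the expected output. Option 3 has no rows in the translation file shown, so the sketch leaves it out.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Option 1": ["A", "B", "A", "C", "B"],
    "Option 2": ["X", "X", "Y", "X", "Y"],
    "Option 3": [1, 1, 1, 1, 1],
})

# Translation table; the column names are hypothetical short labels.
translation = pd.DataFrame({
    "old_option": ["Option 1", "Option 1", "Option 1", "Option 2", "Option 2"],
    "old_value": ["A", "B", "C", "X", "Y"],
    "new_option": ["Opt1000", "Opt1000", "Opt1001", "Opt2000", "Opt2001"],
    "new_value": ["opt1000-A", "opt1000-B", "opt1001-C", "opt2000-X", "opt2001-Y"],
})


def spread_options(df, translation):
    """Build one new column per (old option, new option) pair."""
    columns = {}
    groups = translation.groupby(["old_option", "new_option"], sort=False)
    for (old_option, new_option), group in groups:
        # Old value -> new value, restricted to this new column.
        mapping = dict(zip(group["old_value"], group["new_value"]))
        # Series.map yields NaN for values missing from the mapping;
        # those become the "NO-..." placeholder (assumed naming rule).
        placeholder = f"NO-{new_option.lower()}"
        columns[new_option] = df[old_option].map(mapping).fillna(placeholder)
    return pd.DataFrame(columns, index=df.index)


result = spread_options(df, translation)
print(result)
```

Each new column is filled with a single vectorized Series.map call, so the loop runs once per new column rather than once per cell, which should be more than fast enough for 20 columns and 1,000 rows.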

Let me know if this is clear or if you need more details.
