Reputation: 288
My sample data looks like this:
data = {'index': ['001', '002', '003'],
'A' : ['red', 'green', 'blue'],
'B' : ['blue', 'yellow', 'green'],
'C' : ['green', 'blue', 'red'],
'A_new' : [2, 1, 3], 'B_new' : [0, 1, 2], 'C_new' : [0, 0, 1],
'A_old' : [1, 0, 1], 'B_old' : [1, 0, 0], 'C_old' : [0, 0, 2],
'A_other_new' : [2, 0, 1], 'A_other_old' : [1, 1, 0]}
df = pd.DataFrame (data, columns = ['index', 'A', 'B', 'C', 'A_new', 'B_new', 'C_new',
'A_old', 'B_old', 'C_old', 'A_other_new', 'A_other_old'])
df
output:
index A B C A_new B_new C_new A_old B_old C_old A_other_new A_other_old
0 001 red blue green 2 0 0 1 1 0 2 1
1 002 green yellow blue 1 1 0 0 0 0 0 1
2 003 blue green red 3 2 1 1 0 2 1 0
I am trying to move columns after column 'C'
to rows, based on index and other columns, but it has been very hard as I am just a beginner and learning Pandas. The point is to move each +1 value in columns after column 'C'
to a new row, connected via columns 'index'
, 'color'
, 'letter'
and 'reference'
to the original dataframe. Under column 'reference'
, if the original data columns 'A_other_new'
or 'A_other_old'
are > 0, I need to have a mark like 'OX'
(O is a letter, X is a number) for each +1 value. This part is important, I cannot skip it, for other steps I need to make.
Does someone have an idea on how to solve this?
index color letter reference age
00 001 red A A1 new
01 001 red A A2 new
02 001 red A A3 old
03 001 red A O1 new
04 001 red A O2 new
05 001 red A O3 old
06 001 blue B B1 old
07 001 green C C1 0
08 002 green A A1 new
09 002 green A O1 old
10 002 yellow B B1 new
11 002 blue C C1 0
12 003 blue A A1 new
13 003 blue A A2 new
14 003 blue A A3 new
15 003 blue A A4 old
16 003 blue A O1 new
17 003 green B B1 new
18 003 green B B2 new
19 003 red C C1 new
20 003 red C C2 old
21 003 red C C3 old
Upvotes: 2
Views: 517
Reputation: 7594
Not sure i understand the other
column in your desired output, but you can use melt
here. If you can explain what other means it will be better:
df = df.melt(id_vars=['index', 'A', 'B', 'C'],
var_name='other')
df['age'] = df['other'].str.split('_').str[1]
df['letter'] = df['other'].str.split('_').str[0]
print(df)
index A B C other value age letter
0 001 red blue green A_new 2 new A
1 002 green yellow blue A_new 1 new A
2 003 blue green red A_new 3 new A
3 001 red blue green B_new 0 new B
4 002 green yellow blue B_new 1 new B
5 003 blue green red B_new 2 new B
6 001 red blue green C_new 0 new C
7 002 green yellow blue C_new 0 new C
8 003 blue green red C_new 1 new C
9 001 red blue green A_old 1 old A
10 002 green yellow blue A_old 0 old A
11 003 blue green red A_old 1 old A
12 001 red blue green B_old 1 old B
13 002 green yellow blue B_old 0 old B
14 003 blue green red B_old 0 old B
15 001 red blue green C_old 0 old C
16 002 green yellow blue C_old 0 old C
17 003 blue green red C_old 2 old C
18 001 red blue green A_other_new 2 other A
19 002 green yellow blue A_other_new 0 other A
20 003 blue green red A_other_new 1 other A
21 001 red blue green A_other_old 1 other A
22 002 green yellow blue A_other_old 1 other A
23 003 blue green red A_other_old 0 other A
Upvotes: 1