Reputation: 2048
I have a dataframe with a large number of columns that I would like to consolidate into more rows and less columns it has a similar structure to the example below:
| 1_a | 1_b | 1_c | 2_a | 2_b | 2_c | d |
|-----|-----|-----|-----|-----|-----|-----|
| 1 | 2 | 3 | 1 | 2 | 6 | z |
| 2 | 2 | 2 | 3 | 2 | 5 | z |
| 3 | 2 | 1 | 4 | 1 | 4 | z |
I want to combine some of the rows so they look like below:
| 1 | 2 | letter | d |
|---|---|--------|---|
| 1 | 1 | a | z |
| 2 | 3 | a | z |
| 3 | 4 | a | z |
| 2 | 2 | b | z |
| 2 | 2 | b | z |
| 2 | 1 | b | z |
| 3 | 6 | c | z |
| 2 | 5 | c | z |
| 1 | 4 | c | z |
I have created a new dataframe with the new headings, but am unsure how to map my original headings to the new headings when appending.
Thanks
Upvotes: 1
Views: 61
Reputation: 38415
Try
df = df.set_index('d')
df.columns = pd.MultiIndex.from_tuples([tuple(c.split('_')) for c in df.columns])
df = df.stack().reset_index().rename(columns = {'level_1' : 'letter'})
d letter 1 2
0 z a 1 1
1 z b 2 2
2 z c 3 6
3 z a 2 3
4 z b 2 2
5 z c 2 5
6 z a 3 4
7 z b 2 1
8 z c 1 4
Upvotes: 3
Reputation: 2528
For the most part, if you need to dynamically select column names you probably need to just write a Python loop. Just run through each letter manually then concat them together:
dfs = []
for letter in ('a', 'b', 'c'):
group = df[['d']]
group['1'] = df['1_' + letter]
group['2'] = df['2_' + letter]
group['letter'] = letter
dfs.append(group)
result = pd.concat(dfs)
Upvotes: 0