Reputation: 1750
I have a table where two different types of columns have been stacked into the field
column - attributes and questions.
+-------+------------+-------+
| id | field | value |
+-------+------------+-------+
| 52394 | gender | M |
| 52394 | age | 24 |
| 52394 | question_1 | 2 |
| 52394 | question_2 | 1 |
+-------+------------+-------+
I want to reshape it so that gender and age become columns while question_1 and question_2 remain stacked.
+-------+--------+-----+------------+-------+
| id | gender | age | field | value |
+-------+--------+-----+------------+-------+
| 52394 | M | 24 | question_1 | 2 |
| 52394 | M | 24 | question_2 | 1 |
+-------+--------+-----+------------+-------+
Any ideas on how to do this?
Upvotes: 2
Views: 324
Reputation: 5109
This would be my strategy:
Apply pivot to your df where field is gender or age, save as df1. Select the df where field is not gender or age, save as df2. Then merge the two (df1 and df2) on id. Here is my full code:
import pandas as pd
import sys
if sys.version_info[0] < 3:
from StringIO import StringIO
else:
from io import StringIO
# Create df
rawText = StringIO("""
id field value
52394 gender M
52394 age 24
52394 question_1 2
52394 question_2 1
""")
df = pd.read_csv(rawText, sep = "\s+")
df1 = df[df['field'].isin(['gender','age'])]
df1 = df1.pivot(index = 'id', columns = 'field', values = 'value').reset_index()
df2 = df[~df['field'].isin(['gender','age'])]
df1.merge(df2)
The result is:
id age gender field value
0 52394 24 M question_1 2
1 52394 24 M question_2 1
Upvotes: 3