Reputation: 87
I have a Dataframe like:
matrix = [(222, ['A','B','C'], [1,2,3]),
(333, ['A','B','D'], [1,3,5])]
df = pd.DataFrame(matrix, columns=['timestamp', 'variable', 'value'])
timestamp variable value
222 ['A','B','C'] [1,2,3]
333 ['A','B','D'] [1,3,5]
and would like to pivot it so that the timestamp
value is kept, the unique values in the variable
column become additional columns, and values from value
are sorted in the respective columns.
The output should look as follows:
timestamp A B C D
222 1 2 3 nan
333 1 3 nan 5
any help would be greatly appreciated! :)
Upvotes: 4
Views: 173
Reputation: 863166
Create dictionary with zip, pass to DataFrame
constructor:
a = [dict(zip(*x)) for x in zip(df['variable'], df['value'])]
print (a)
[{'A': 1, 'B': 2, 'C': 3}, {'A': 1, 'B': 3, 'D': 5}]
df = df[['timestamp']].join(pd.DataFrame(a, index=df.index))
print (df)
timestamp A B C D
0 222 1 2 3.0 NaN
1 333 1 3 NaN 5.0
If many another columns use DataFrame.pop
for extract columns:
a = [dict(zip(*x)) for x in zip(df.pop('variable'), df.pop('value'))]
df = df.join(pd.DataFrame(a, index=df.index))
print (df)
timestamp A B C D
0 222 1 2 3.0 NaN
1 333 1 3 NaN 5.0
Upvotes: 3
Reputation: 323316
Using unnest first , then just pivot
unnesting(df,['variable','value']).pivot(*df.columns)
Out[79]:
variable A B C D
timestamp
222 1.0 2.0 3.0 NaN
333 1.0 3.0 NaN 5.0
def unnesting(df, explode):
idx = df.index.repeat(df[explode[0]].str.len())
df1 = pd.concat([
pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
df1.index = idx
return df1.join(df.drop(explode, 1), how='left')
Upvotes: 3
Reputation: 28303
You can pass the values & column names to a the pd.Series constructor. This will automatically expand the values in your desired shape.
df.set_index('timestamp').apply(lambda row: pd.Series(row.value, index=row.variable), axis=1)
# outputs:
A B C D
timestamp
222 1.0 2.0 3.0 NaN
333 1.0 3.0 NaN 5.0
Upvotes: 2