Reputation: 25
I have a pandas DataFrame that has three fields: 'EventDate', 'DataField', and 'DataValue'.
'DataField' has three values, i.e. Oxygen, HeartRate, and HeartRateVariability.
How do I change the above format into the following format for analysis?
Upvotes: 0
Views: 207
Reputation: 2061
As others have suggested, you can use the pivot_table() function. For your case, try this:
pivot_df = df.pivot_table(index='EventDate', columns='DataField', values='DataValue')
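To make the reshaping concrete, here is a minimal, self-contained sketch. The sample dates and readings are made up for illustration; only the column names come from the question. It assumes one reading per field per date.

```python
import pandas as pd

# Hypothetical sample data in the long format described in the question
df = pd.DataFrame({
    'EventDate': ['2021-01-01', '2021-01-01', '2021-01-01',
                  '2021-01-02', '2021-01-02', '2021-01-02'],
    'DataField': ['Oxygen', 'HeartRate', 'HeartRateVariability'] * 2,
    'DataValue': [98.0, 72.0, 45.0, 97.0, 75.0, 50.0],
})

# One row per EventDate, one column per distinct DataField value
pivot_df = df.pivot_table(index='EventDate', columns='DataField', values='DataValue')

# Optional: turn EventDate back into a regular column and drop the
# leftover 'DataField' label on the column axis
pivot_df = pivot_df.reset_index()
pivot_df.columns.name = None

print(pivot_df)
```

The new columns come out in alphabetical order (HeartRate, HeartRateVariability, Oxygen), so reorder them afterwards if the analysis needs a specific layout.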
Upvotes: 0
Reputation: 643
Any time you want to take one attribute in your dataset and group some other attributes by it, you should think about using pandas groupby or pivot_table functionality.
I'm personally a fan of pivot tables, so here is how to do it with a pivot table:
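# Pivot the data: one row per EventDate, one column per DataField value.
# Oxygen, HeartRate and HeartRateVariability are values of 'DataField',
# not columns, so they are selected through columns= rather than values=.
pivot_table = df.pivot_table(
    index='EventDate',
    columns='DataField',
    values='DataValue',
    aggfunc='mean'
)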
Because aggfunc is set to mean, any EventDate that has multiple records for the same field will show the mean of those records in the resulting pivot table.
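As a rough illustration of that averaging, and of the groupby route mentioned earlier, the sketch below builds a small made-up frame (column names from the question, values invented) with two HeartRate readings on the same date and reshapes it both ways.

```python
import pandas as pd

# Hypothetical data: two HeartRate readings share the same EventDate
df = pd.DataFrame({
    'EventDate': ['2021-01-01', '2021-01-01', '2021-01-01'],
    'DataField': ['HeartRate', 'HeartRate', 'Oxygen'],
    'DataValue': [70.0, 74.0, 98.0],
})

# pivot_table collapses the duplicate readings with the chosen aggfunc
wide = df.pivot_table(
    index='EventDate',
    columns='DataField',
    values='DataValue',
    aggfunc='mean',
)

# The same reshaping via groupby: average per (date, field), then
# move the DataField level of the index into the columns
grouped = (
    df.groupby(['EventDate', 'DataField'])['DataValue']
      .mean()
      .unstack('DataField')
)

print(wide)
print(grouped)
```

Both results hold a single HeartRate value for the date, the average of the two readings.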
If creating pivot tables is something you do often, you could also check out some pandas pivot table GUIs. I'm the creator of one called Mito. Mito is an extension to JupyterLab that lets you create pivot tables (and other spreadsheet-style analyses) in a point-and-click way. Each time you make an edit in the Mito spreadsheet, it automatically generates the equivalent pandas code for you.
Upvotes: 1
Reputation: 14949
You can try pivot:
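# index, columns and values are passed by keyword, since pandas 2.0 and later
# no longer accept positional arguments to pivot (so pivot(*df) fails there)
df = df.pivot(index='EventDate', columns='DataField', values='DataValue').fillna('')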
For more information, see the pandas documentation for pivot and pivot_table.
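One practical difference between the two functions is worth a quick sketch: pivot does no aggregation, so it raises a ValueError when the same EventDate and DataField pair appears more than once, whereas pivot_table aggregates the duplicates. The data below is invented for illustration.

```python
import pandas as pd

# Hypothetical data with a duplicated (EventDate, DataField) pair
df = pd.DataFrame({
    'EventDate': ['2021-01-01', '2021-01-01', '2021-01-01'],
    'DataField': ['HeartRate', 'HeartRate', 'Oxygen'],
    'DataValue': [70.0, 74.0, 98.0],
})

# pivot cannot decide which of the two HeartRate readings to keep
try:
    df.pivot(index='EventDate', columns='DataField', values='DataValue')
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f"pivot failed: {exc}")

# pivot_table resolves the duplicates by aggregating them
wide = df.pivot_table(
    index='EventDate',
    columns='DataField',
    values='DataValue',
    aggfunc='mean',
)
print(wide)
```

If every date has at most one reading per field, pivot is enough; otherwise pivot_table with an explicit aggfunc is the safer choice.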
Upvotes: 1