Reputation: 23
I'm having trouble loading data into a dataframe in the correct format.
The data I'm working with has the form:
Sensor Name,Variable,Units,Timestamp,Value,Flagged as Suspect Reading
PER_EMOTE_2202,CO,ugm -3,2019-02-15 15:46:16,476.634102929914,False
PER_EMOTE_2202,Humidity,%,2019-02-15 15:46:16,49.8,False
PER_EMOTE_2202,NO,ugm -3,2019-02-15 15:46:16,68.5581902781,False
PER_EMOTE_2202,NO2,ugm -3,2019-02-15 15:46:16,80.220623065752,False
PER_EMOTE_2202,Sound,db,2019-02-15 15:46:16,69.0,False
PER_EMOTE_2202,Temperature,Celsius,2019-02-15 15:46:16,13.2,False
PER_EMOTE_2202,CO,ugm -3,2019-02-15 15:47:16,475.363796635848,False
PER_EMOTE_2202,NO,ugm -3,2019-02-15 15:47:16,84.7920567981415,False
PER_EMOTE_2202,NO2,ugm -3,2019-02-15 15:47:16,82.3021250062142,False
PER_EMOTE_2202,Sound,db,2019-02-15 15:47:16,69.0,False
I'd like to load this into columns representing the value for each variable, indexed by timestamp.
For example: Desired Format
So far I have the following:
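import pandas as pd

def load_data(path):
    # Read only the columns of interest
    df = pd.read_csv(path, usecols=['Timestamp', 'Variable', 'Units', 'Value'])
    # Index by timestamp; the data is still in long format (one row per reading)
    df = df.set_index('Timestamp')
    return df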
This produces the following result: Result
Upvotes: 2
Views: 48
Reputation: 325
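Try using a pivot table. pivot_table reshapes the long-format rows so that each distinct Variable becomes its own column, with one row per Timestamp.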
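import pandas as pd

def load_data(path):
    df = pd.read_csv(path, usecols=['Timestamp', 'Variable', 'Units', 'Value'])
    # One column per Variable, one row per Timestamp.
    # Duplicate (Timestamp, Variable) pairs are averaged (the default aggfunc is mean).
    df = df.pivot_table(index='Timestamp', columns='Variable', values='Value')
    return df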
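To check the reshaping without a file on disk, here is a minimal sketch that runs the same pivot on a few rows copied from the question. The SAMPLE_CSV name, the use of io.StringIO in place of a path, dropping Units from usecols, and the parse_dates argument are my own assumptions for illustration, not part of the original code.

```python
import io

import pandas as pd

# A subset of the rows from the question, inlined for illustration (assumption).
SAMPLE_CSV = """Sensor Name,Variable,Units,Timestamp,Value,Flagged as Suspect Reading
PER_EMOTE_2202,CO,ugm -3,2019-02-15 15:46:16,476.634102929914,False
PER_EMOTE_2202,Humidity,%,2019-02-15 15:46:16,49.8,False
PER_EMOTE_2202,Sound,db,2019-02-15 15:46:16,69.0,False
PER_EMOTE_2202,CO,ugm -3,2019-02-15 15:47:16,475.363796635848,False
PER_EMOTE_2202,Sound,db,2019-02-15 15:47:16,69.0,False
"""


def load_data(source):
    # Units is not needed for the pivot, so it is left out here (assumption).
    df = pd.read_csv(
        source,
        usecols=['Timestamp', 'Variable', 'Value'],
        parse_dates=['Timestamp'],  # gives a DatetimeIndex after the pivot
    )
    # Long to wide: one column per Variable, one row per Timestamp.
    return df.pivot_table(index='Timestamp', columns='Variable', values='Value')


wide = load_data(io.StringIO(SAMPLE_CSV))
print(wide)
```

A variable with no reading at a given timestamp (Humidity at 15:47:16 in this sample) shows up as NaN in the wide frame, which is probably what you want for sensor data with gaps.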
Upvotes: 1