Simon

Reputation: 333

How to draw 'historical' data in Python

Good afternoon,

I can't find a way to draw/graph historical data (a time series) and would appreciate your feedback on my problem.

I have a csv file containing a history of pids based on some criteria. The file looks like this:

|2019-12-13 14:00:00| 123456 |
|2019-12-13 14:00:00| 345678 |
|2019-12-13 14:00:20| 123456 |
|2019-12-13 14:00:20| 345678 |
|2019-12-13 14:00:40| 123456 |
|2019-12-13 14:00:40| 345678 |
|2019-12-13 14:00:40| 678123 |
|2019-12-13 14:01:00| 123456 |
|2019-12-13 14:01:00| 678123 |

so we have:

I'd like to draw a line graph with my timestamps on the X-axis and my pids on the Y-axis, to see the creation/death of my processes within my timeframe.

I'd start by storing my data in a pandas DataFrame, but I don't know how to move forward from there.

Any recommendations to help me continue?
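A minimal sketch of that first step, assuming the file is laid out exactly as shown above (pipe-delimited, no header row). The column names `timestamp` and `pid` and the helper `load_pid_history` are hypothetical names chosen for illustration:

```python
import io
import pandas as pd

# Sample rows copied from the listing above; a real script would pass the
# path of the csv file instead of this in-memory buffer.
SAMPLE = """|2019-12-13 14:00:00| 123456 |
|2019-12-13 14:00:00| 345678 |
|2019-12-13 14:00:20| 123456 |
|2019-12-13 14:00:20| 345678 |
|2019-12-13 14:00:40| 123456 |
|2019-12-13 14:00:40| 345678 |
|2019-12-13 14:00:40| 678123 |
|2019-12-13 14:01:00| 123456 |
|2019-12-13 14:01:00| 678123 |
"""


def load_pid_history(source):
    """Read '|timestamp| pid |' rows into a DataFrame (hypothetical helper)."""
    df = pd.read_csv(
        source,
        sep="|",
        header=None,
        # The leading and trailing '|' produce one empty field on each side.
        names=["lead", "timestamp", "pid", "trail"],
        usecols=["timestamp", "pid"],
        dtype=str,
    )
    # Strip the padding around each field, then convert to proper types.
    df["timestamp"] = pd.to_datetime(df["timestamp"].str.strip())
    df["pid"] = df["pid"].str.strip().astype(int)
    return df


df = load_pid_history(io.StringIO(SAMPLE))
print(df.dtypes)
```

Reading both fields as strings first and converting afterwards keeps the whitespace handling explicit rather than relying on the parser's defaults.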

Thanks in advance

Upvotes: 1

Views: 1035

Answers (2)

kantal

Reputation: 2407

Group by 'pid'. Within each group, set the time as the index, replace the pid values with the group's running number (1, 2, 3, ...), and rename the column to the pid itself. Then concatenate the resulting data frames:

  r=[ grp.set_index("time") \
         .assign(pid=idx) \
         .rename(columns={"pid":pid}) \
      for idx,(pid,grp) in enumerate(df.groupby("pid"),1) ]

e.g.: r[0]
                     123456
time                       
2019-12-13 14:00:00       1
2019-12-13 14:00:20       1
2019-12-13 14:00:40       1
2019-12-13 14:01:00       1

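# Variant shown in the table below: absent pids become 0 and values are cast to int.
#rslt=pd.concat(r,axis=1).fillna(0).astype(int)
# Without fillna, absent pids stay NaN and the line is simply not drawn there.
rslt=pd.concat(r,axis=1)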



                     123456  345678  678123
time                                       
2019-12-13 14:00:00       1       2       0
2019-12-13 14:00:20       1       2       0
2019-12-13 14:00:40       1       2       3
2019-12-13 14:01:00       1       0       3

from matplotlib import pyplot as plt

rslt.plot()
plt.show()

Upvotes: 2

Quang Hoang

Reputation: 150805

Here's what I would do:

# for shifting and naming lines
codes, names = df['pid'].factorize()

ax = (df.assign(pid_name=codes)
   .pivot(index='timestamp', columns='pid_name', values='pid')
   .plot()
)

# rename legend
h,l = ax.get_legend_handles_labels()
ax.legend(h, names)
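
To see why each line starts and stops where it does, it can help to look at the table that `.plot()` receives. The following is only a sketch: it rebuilds `df` from the sample rows in the question, assumes the columns are named `timestamp` and `pid`, and uses `wide` as a hypothetical name for the intermediate result.

```python
import pandas as pd

# Sample rows from the question; the column names are assumptions.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2019-12-13 14:00:00", "2019-12-13 14:00:00",
        "2019-12-13 14:00:20", "2019-12-13 14:00:20",
        "2019-12-13 14:00:40", "2019-12-13 14:00:40", "2019-12-13 14:00:40",
        "2019-12-13 14:01:00", "2019-12-13 14:01:00",
    ]),
    "pid": [123456, 345678, 123456, 345678, 123456,
            345678, 678123, 123456, 678123],
})

# factorize() numbers the pids in order of first appearance (0, 1, 2, ...)
# and returns the distinct pids in that same order.
codes, names = df["pid"].factorize()

# One row per timestamp, one column per pid code; cells hold the pid itself.
wide = df.assign(pid_name=codes).pivot(
    index="timestamp", columns="pid_name", values="pid"
)
print(wide)
print(list(names))
```

Cells where a pid was not running at a given timestamp come out as NaN, and pandas leaves those points undrawn, so each line spans only the interval in which its process was seen.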

Upvotes: 3
