Reputation: 1008
My data looks like this:
             900324492            900405679            900472531
1  2017-04-03 08:04:09  2017-04-03 07:49:53  2017-04-03 07:52:39
2  2017-04-03 08:05:36  2017-04-03 07:54:36  2017-04-03 07:52:19
3  2017-04-03 08:05:28  2017-04-03 07:43:00  2017-04-03 07:50:52
4  2017-04-03 08:06:05  2017-04-03 07:49:42  2017-04-03 07:53:55
So, for each column, I have a set of timestamps (datetime objects, to be exact). I would like to make a scatter plot where x is the df index or row number (i.e. x=[1,2,3,4,...]) and y is a time point. For example, if there are 4 rows and 10 columns in df, the x axis should be 1, 2, 3, 4, and for x=1 there should be one point per entry in the first row.
It seemed like a simple task, but I'm struggling a bit. My code so far:
df = pd.read_csv('test.csv')
df2 = df.apply(lambda x : pd.to_datetime(x))
fig = plt.figure()
ax = fig.add_subplot(111)
y = df2.ix[:, 1]
x = df2.index.values
# returns nonsense
ax.plot(x,y)
# TypeError: invalid type promotion
ax.scatter(x=x, y = df2.ix[:,1])
# TypeError: Empty 'DataFrame': no numeric data to plot
df2.ix[:,1].plot()
Test file link : test.csv
Upvotes: 0
Views: 542
Reputation: 7316
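Please check my example, adapted from yours. The key pieces are to_pydatetime(), date2num(), and np.nan: convert each timestamp to a matplotlib date number, and map anything that cannot be converted to NaN. (Finally, you have to label the y axis ticks in datetime format.)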
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
df = pd.read_csv('test.csv', header=None)
df2 = df.apply(lambda x : pd.to_datetime(x))
fig = plt.figure()
ax = fig.add_subplot(111)
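y = df2.iloc[:, 1]  # .ix was removed in pandas 1.0; .iloc selects by position
x = df2.index.values
def fix(x):
    # Convert a Timestamp to a matplotlib date number; anything unconvertible becomes NaN
    try:
        return dates.date2num(x.to_pydatetime())
    except Exception:
        return np.nan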
y_lab = [str(e) for e in y]
y_ = [fix(e) for e in y]
ax.scatter(x=x, y=y_)
plt.yticks(y_, y_lab, rotation=0, size=6)
plt.show()
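The snippet above plots a single column. To get one point per entry in every row, as the question asks, the same date2num() idea can be applied to the whole frame at once. The following is only a sketch under some assumptions: the inline CSV text stands in for test.csv and assumes the first column is the row index and the first line holds the column IDs; one value in row 3 is blanked on purpose to show the NaN handling; the names csv_text, y_num and mask are made up for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; remove it to get a window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for test.csv (assumed layout: index column + one column per ID).
# The last value of row 3 is deliberately left empty to exercise missing data.
csv_text = """,900324492,900405679,900472531
1,2017-04-03 08:04:09,2017-04-03 07:49:53,2017-04-03 07:52:39
2,2017-04-03 08:05:36,2017-04-03 07:54:36,2017-04-03 07:52:19
3,2017-04-03 08:05:28,2017-04-03 07:43:00,
4,2017-04-03 08:06:05,2017-04-03 07:49:42,2017-04-03 07:53:55
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)
# Parse every column; unparseable or missing entries become NaT.
df = df.apply(lambda col: pd.to_datetime(col, errors="coerce"))

# date2num accepts datetime64 arrays and maps NaT to NaN.
y_num = mdates.date2num(df.to_numpy())

# Repeat each row label once per column so x lines up with the flattened y.
x = np.repeat(df.index.to_numpy(), df.shape[1])
y = y_num.ravel()
mask = ~np.isnan(y)  # drop missing timestamps explicitly

fig, ax = plt.subplots()
ax.scatter(x[mask], y[mask])
ax.set_xticks(df.index.to_numpy())
# Show the float date numbers on the y axis as clock times.
ax.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
ax.set_xlabel("row")
ax.set_ylabel("time")
plt.show()
```

Using a DateFormatter on the y axis avoids placing one tick label per data point, which gets crowded quickly once every column is plotted.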
Upvotes: 1