Reputation: 2141
I have found solutions to similar questions, but they all produce odd results.
I have a plot that looks like this:
generated using this code:
ax1 = dft.plot(kind='scatter',x='end_date',y='pct',c='fte_grade',colormap='Reds',colorbar=False,edgecolors='red',vmin=4,vmax=10)
ax1.set_xticklabels([datetime.datetime.fromtimestamp(ts / 1e9).strftime('%Y-%m-%d') for ts in ax1.get_xticks()])
dfb.plot(kind='scatter',x='end_date',y='pct',c='fte_grade',colormap='Blues',title='%s Polls'%state,ax=ax1,colorbar=False,edgecolors='blue',vmin=4,vmax=10)
plt.ylim(30,70)
plt.axhline(50,ls='--',alpha=0.5,color='grey')
plt.xticks(rotation=20)
Now, whenever I try to plot a line on top of this, I get something like the following:
import matplotlib.pyplot as plt
import numpy as np
x = dft['pct']
u = dft['Trump Odds']
t = list(pd.to_datetime(dft['end_date']))
plt.hold(True)
plt.subplot2grid((1, 1), (0, 0))
plt.plot(t,x)
plt.scatter(t, u)
plt.show()
If it's not clear, this is not what I want. These dots represent individual polls, and I have data representing a line that aggregates the individual polls. I think this has something to do with datetimes and the possibility of multiple polls for a particular date in the polling. I think that the plotter is getting confused because I have duplicate values for the same date, so it assumes this is not a time series, and when I plot a line, it maintains the assumption that we don't need any continuity.
There must be something within Python that can handle drawing a time series on top of a scatter plot with a time x-axis, right?
dft data:
end_date pct fte_grade Trump Odds
0 1598054400000000000 32.0 6 32.000000
1 1588550400000000000 32.0 7 32.000000
2 1582156800000000000 39.0 8 34.666667
3 1585180800000000000 33.0 8 34.206897
4 1587600000000000000 29.0 8 33.081081
5 1590019200000000000 32.0 8 33.025641
6 1559779200000000000 36.0 8 33.800000
7 1593043200000000000 32.0 8 32.400000
Upvotes: 0
Views: 841
Reputation: 2819
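Isn't your strange line due to the fact that you didn't sort the DataFrame before plotting it? plt.plot connects the points in row order, so with unsorted dates the line jumps back and forth in time. Try sorting first: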
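import pandas as pd
import matplotlib.pyplot as plt

# Sort chronologically so the line connects points in date order
dft = dft.sort_values(by=['end_date'])
x = dft['pct']
u = dft['Trump Odds']
t = list(pd.to_datetime(dft['end_date']))
# plt.hold was removed in Matplotlib 3.0; successive calls draw on the same axes by default
plt.subplot2grid((1, 1), (0, 0))
plt.plot(t, x)
plt.scatter(t, u)
plt.show()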
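For reference, here is a self-contained sketch of the same idea using the sample rows from the question. It assumes that the dots should be the individual polls (pct) and the line should be the aggregate (Trump Odds), which is my reading of the description rather than something the posted code does. The function name plot_polls_with_trend is hypothetical, and the styling values are copied from the question's scatter call.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows copied from the dft table in the question
dft = pd.DataFrame({
    "end_date": [1598054400000000000, 1588550400000000000, 1582156800000000000,
                 1585180800000000000, 1587600000000000000, 1590019200000000000,
                 1559779200000000000, 1593043200000000000],
    "pct": [32.0, 32.0, 39.0, 33.0, 29.0, 32.0, 36.0, 32.0],
    "fte_grade": [6, 7, 8, 8, 8, 8, 8, 8],
    "Trump Odds": [32.0, 32.0, 34.666667, 34.206897,
                   33.081081, 33.025641, 33.8, 32.4],
})


def plot_polls_with_trend(df):
    """Scatter the individual polls and draw the aggregate as a line (hypothetical helper)."""
    # Convert nanosecond epoch integers to datetimes, then sort chronologically
    # so that the line is drawn from the earliest date to the latest.
    ordered = df.assign(
        end_date=pd.to_datetime(df["end_date"], unit="ns")
    ).sort_values("end_date")

    fig, ax = plt.subplots()
    # Individual polls as dots, shaded by pollster grade
    ax.scatter(ordered["end_date"], ordered["pct"], c=ordered["fte_grade"],
               cmap="Reds", edgecolors="red", vmin=4, vmax=10)
    # Aggregate as a continuous line on the same Axes
    ax.plot(ordered["end_date"], ordered["Trump Odds"], color="red")
    ax.axhline(50, ls="--", alpha=0.5, color="grey")
    ax.set_ylim(30, 70)
    ax.tick_params(axis="x", labelrotation=20)
    return ordered, ax


ordered, ax = plot_polls_with_trend(dft)
```

Call plt.show() afterwards to display the figure. Because both artists are drawn on one Axes with real datetime values, Matplotlib formats the date ticks itself, so the manual set_xticklabels conversion from the question should not be needed here.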
Upvotes: 1