Reputation: 135
This is a very straightforward question. I have and x axis of years and a y axis of numbers increasing linearly by 100. When plotting this with pandas and matplotlib I am given a graph that does not represent the data whatsoever. I need some help to figure this out because it is such a small amount of code:
The CSV is as follows:
A,B
2012,100
2013,200
2014,300
2015,400
2016,500
2017,600
2018,700
2012,800
2013,900
2014,1000
2015,1100
2016,1200
2017,1300
2018,1400
The Code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data = pd.read_csv("CSV/DSNY.csv")
data.set_index("A", inplace=True)
data.plot()
plt.show()
The graph this yields is:
It is clearly very inconsistent with the data - any suggestions?
Upvotes: 1
Views: 109
Reputation: 7416
All you need is sort A before plotting.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data = pd.read_csv("CSV/DSNY.csv").reset_index()
data = data.sort_values('A')
data.set_index("A", inplace=True)
data.plot()
plt.show()
Upvotes: 1
Reputation: 11105
The default behaviour of matplotlib/pandas is to draw a line between successive data points, and not to mark each data point with a symbol.
Fix: change data.plot()
to data.plot(style='o')
, or df.plot(marker='o', linewidth=0)
.
Upvotes: 4