Tom Crews

Reputation: 135

Simple Graph Does Not Represent Data

This is a very straightforward question. I have an x axis of years and a y axis of numbers increasing linearly by 100. When I plot this with pandas and matplotlib, I get a graph that does not represent the data at all. I need some help figuring this out, since it is such a small amount of code:

The CSV is as follows:

A,B
2012,100
2013,200
2014,300
2015,400
2016,500
2017,600
2018,700
2012,800
2013,900
2014,1000
2015,1100
2016,1200
2017,1300
2018,1400

The Code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = pd.read_csv("CSV/DSNY.csv")

data.set_index("A", inplace=True)


data.plot()
plt.show()

The graph this yields is:

[Image: graph from CSV data]

It is clearly very inconsistent with the data - any suggestions?

Upvotes: 1

Views: 109

Answers (2)

Hari_pb

Reputation: 7416

All you need to do is sort by column A before plotting.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

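# reset_index() is not needed here; it would add an extra "index" column to the plot
data = pd.read_csv("CSV/DSNY.csv")
data = data.sort_values('A')  # order the rows by year before plotting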
data.set_index("A", inplace=True)


data.plot()
plt.show()
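
To see what the sort changes, here is a minimal sketch that loads the CSV from the question out of an in-memory string instead of CSV/DSNY.csv (the names csv_text and sorted_data are only for illustration). It assumes a stable sort is wanted, so rows with the same year keep their original order.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example runs without the file.
csv_text = """A,B
2012,100
2013,200
2014,300
2015,400
2016,500
2017,600
2018,700
2012,800
2013,900
2014,1000
2015,1100
2016,1200
2017,1300
2018,1400
"""

data = pd.read_csv(io.StringIO(csv_text))

# Every year occurs twice, which is why the unsorted line doubles back on itself.
has_repeated_years = data["A"].duplicated().any()

# kind="stable" keeps rows with equal years in their original order.
sorted_data = data.sort_values("A", kind="stable").set_index("A")
print(sorted_data.head(4))
```

Note that after sorting, each year still carries two B values, so a line plot will alternate between the low and the high value within each year.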

Upvotes: 1

Peter Leimbigler

Reputation: 11105

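The default behaviour of matplotlib/pandas is to draw a line between successive data points, and not to mark each data point with a symbol. Because the years 2012 to 2018 appear twice in column A, the line runs forward to 2018, jumps back to 2012, and runs forward again, which hides the individual points.

Fix: change data.plot() to data.plot(style='o'), or data.plot(marker='o', linewidth=0).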

Result: [Image: result of style='o']
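
For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch. It assumes the CSV content shown in the question, read from an in-memory string, and it uses the non-interactive Agg backend and a made-up output file name (dsny_points.png) so that it also runs on a machine without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window instead
import matplotlib.pyplot as plt
import pandas as pd

# The CSV from the question, inlined so the example does not need CSV/DSNY.csv.
csv_text = """A,B
2012,100
2013,200
2014,300
2015,400
2016,500
2017,600
2018,700
2012,800
2013,900
2014,1000
2015,1100
2016,1200
2017,1300
2018,1400
"""

data = pd.read_csv(io.StringIO(csv_text)).set_index("A")

# style="o" draws one marker per row and no connecting line.
ax = data.plot(style="o")
ax.set_xlabel("A")
ax.set_ylabel("B")

# Hypothetical output file name; use plt.show() in an interactive session.
ax.figure.savefig("dsny_points.png")
```

With markers only, each year shows two points (for example 100 and 800 for 2012), which matches the rows in the CSV.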

Upvotes: 4
