Reputation: 43
I was wondering why the x-axis plots the dates in the wrong order: it begins at 05/02 when it should start at 30/01. I'm not sure where I went wrong.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
cols = ['Time','Water Usage']
A = pd.read_csv("CSVFile", names=cols, parse_dates=[0])
plt.ylabel = "Time"
plt.xlabel = "Water Usage"
A.plot(x='Time',y='Water Usage')
plt.show()
The file is in the format
Date+Time | Usage
30/01/2018 | 50091
05/02/2018 | 50890
So ideally it should plot 30/01 first, followed by 05/02, whereas currently it's doing the opposite.
Upvotes: 0
Views: 1447
Reputation: 126
To make sure your program plots the x-values chronologically, you should convert the Date+Time column into a datetime object. I see you used parse_dates in your read_csv call, but the docs say it might not be 100% effective:
If a column or index cannot be represented as an array of datetimes, say because of an unparseable value or a mixture of timezones, the column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more.
So I would try the following (to_datetime), passing dayfirst=True because your dates are written day/month/year:
A['Time'] = pd.to_datetime(A['Time'], dayfirst=True)
A.sort_values(by='Time', inplace=True)
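To illustrate why the day/month order matters, here is a minimal sketch. It assumes a small hand-built DataFrame that mirrors the two rows from the question (the sample data and the variable names month_first and day_first are mine, not from the original file). By default pandas reads an ambiguous string such as 05/02/2018 month-first, so spelling out the format (or passing dayfirst=True) avoids the mix-up.

```python
import pandas as pd

# Hypothetical sample mirroring the two rows shown in the question,
# deliberately listed out of chronological order.
A = pd.DataFrame({
    "Time": ["05/02/2018", "30/01/2018"],
    "Water Usage": [50890, 50091],
})

# By default an ambiguous date string is read month-first (May 2).
month_first = pd.to_datetime("05/02/2018")
# With dayfirst=True the same string is read as February 5.
day_first = pd.to_datetime("05/02/2018", dayfirst=True)

# An explicit format removes the ambiguity for the whole column.
A["Time"] = pd.to_datetime(A["Time"], format="%d/%m/%Y")

# Sort chronologically so the plot runs left to right in date order.
A = A.sort_values(by="Time").reset_index(drop=True)
```

After this, A.plot(x='Time', y='Water Usage') should start at 30/01 and end at 05/02.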
I hope it helps!
Upvotes: 2
Reputation: 7604
Just made a few changes to your code, mainly pd.to_datetime, and it works fine:
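import pandas as pd
import matplotlib.pyplot as plt

cols = ['Time','Water Usage']
df = pd.read_csv('test.csv', sep='|')
df.columns = cols
# Parse the dates explicitly as day/month/year
df['Time'] = pd.to_datetime(df['Time'], format='%d/%m/%Y')
df.plot(x='Time',y='Water Usage')
# xlabel/ylabel are functions: call them instead of assigning to them
plt.xlabel("Time")
plt.ylabel("Water Usage")
plt.show()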
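One thing to watch for: if the file really has spaces around the | separator, as in the sample in the question, the date strings may carry trailing whitespace that an exact format will not accept. A possible sketch, assuming the file looks exactly like that sample (the in-memory sample text below stands in for the real file and is my assumption), is to split on a regular expression that swallows the spaces:

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file, copied from the sample in the question.
sample = io.StringIO(
    "Date+Time | Usage\n"
    "30/01/2018 | 50091\n"
    "05/02/2018 | 50890\n"
)

cols = ["Time", "Water Usage"]

# A regex separator strips the spaces around '|'; it needs the python engine.
# header=0 together with names=cols replaces the file's own header row.
df = pd.read_csv(sample, sep=r"\s*\|\s*", engine="python", header=0, names=cols)

# Parse as day/month/year, then sort so the x-axis is chronological.
df["Time"] = pd.to_datetime(df["Time"], format="%d/%m/%Y")
df = df.sort_values(by="Time").reset_index(drop=True)
```

With the real file you would pass its path instead of the StringIO object, then plot with df.plot(x='Time', y='Water Usage') as before.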
Upvotes: 0