Reputation: 1002
I have a date in one column and a time in another, both retrieved from a database through pandas read_sql. The dataframe looks like the one below (there are 30-40 rows in my dataframe). I want to plot them as a time series graph, and I should also be able to turn that into a histogram if needed.
   COB         CALV14
1  2019-10-04  07:04
2  2019-10-04  05:03
3  2019-10-03  16:03
4  2019-10-03  05:15
At first I got various errors, such as there being no numeric field to plot. After searching a lot, the closest post I could find is: Matplotlib date on y axis
I followed it and got a result. However, there are two problems:
First, I have to go through a number of steps (convert to str, then to a list, and then to the matplotlib date format) before I can plot them (please refer to the code I am using). There must be a smarter, more concise way to do this.
Second, the axis does not show the times exactly the way they appear in the dataframe (e.g. it should show 07:04, 05:03, etc.).
I am new to Python and would appreciate any help on this.
Code
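import matplotlib.pyplot as plt
from matplotlib import dates as mdates

# ob_frame is the dataframe returned by pd.read_sql
# Convert both columns to strings, then to plain lists
ob_frame['COB'] = ob_frame.COB.astype(str)
ob_frame['CALV14'] = ob_frame.CALV14.astype(str)
date = ob_frame.COB.tolist()
time = ob_frame.CALV14.tolist()
# Convert the strings to matplotlib date numbers
y = mdates.datestr2num(date)
x = mdates.datestr2num(time)
fig, ax = plt.subplots(figsize=(9,9))
ax.plot(x, y)
# Treat both axes as dates
ax.yaxis_date()
ax.xaxis_date()
fig.autofmt_xdate()
plt.show()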
Upvotes: 2
Views: 2062
Reputation: 1002
I found the answer. I did not need to convert the data retrieved from the DB to string type. The rest of the problem came from not using the right formatting for the tick labels. Here is the complete code, posted in case it helps anyone. In this code I have swapped the axes, i.e. I plotted dates on the x axis and times on the y axis, as it looked better.
###### Import all the libraries and modules needed ######
import IN_OUT_SQL as IS ## IN_OUT_SQL.py is the file where the SQL is stored
import cx_Oracle as co
import numpy as np
import Credential as cd # Credential.py is the file where you store the DB credentials
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
%matplotlib inline
# The line above is a Jupyter/IPython magic; remove it when running as a plain script
###### Connect to DB, make the dataframe and prepare the x and y values to be plotted ######
def extract_data(query):
    '''
    This function takes the given query as input, connects to the database, executes the SQL and
    returns the result in a dataframe.
    '''
    cred = cd.POLN_CONSTR #POLN_CONSTR in the credential file stores the credential in '''USERNAME/PASSWORD@DB_NAME''' format
    conn = co.connect(cred)
    frame = pd.read_sql(query, con = conn)
    return frame
query = IS.OUT_SQL
ob_frame = extract_data(query)
ob_frame.dropna(inplace = True) # Drop the rows that have NaN in any column
x = mdates.datestr2num(ob_frame['COB']) #COB is a date in "01-MAR-2020" format - convert it to matplotlib date numbers
y = mdates.datestr2num(ob_frame['CALV14']) #CALV14 is a time in "21:04" format - convert it to matplotlib date numbers
###### Make the Timeseries plot of delivery time in y axis vs delivery date in x axis ######
fig, ax = plt.subplots(figsize=(15,8))
ax.clear() # Clear the axes
ax.plot(x, y, 'o-', color = 'dodgerblue') #Plot the data ('o-' = markers joined by lines; the color is set only once, by the keyword)
##Below two lines draw horizontal lines at the 05 AM and 07 AM positions
plt.axhline(y = mdates.date2num(pd.to_datetime('07:00')), color = 'red', linestyle = '--', linewidth = 0.75)
plt.axhline(y = mdates.date2num(pd.to_datetime('05:00')), color = 'green', linestyle = '--', linewidth = 0.75)
plt.xticks(x, rotation = 75) # rotation is given as a number of degrees, not a string
ax.yaxis_date()
ax.xaxis_date()
#The 6 lines below set the format in which I want my x or y ticks and their labels to be displayed
yfmt = mdates.DateFormatter('%H:%M')
xfmt = mdates.DateFormatter('%d-%b-%y')
ax.yaxis.set_major_formatter(yfmt)
ax.xaxis.set_major_formatter(xfmt)
ax.yaxis.set_major_locator(mdates.HourLocator(interval=1)) # Every 1 Hour
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1)) # Every 1 Day
####### Name the x,y labels, titles and beautify the plot #######
plt.style.use('bmh') # Styles apply to figures created afterwards; move this above plt.subplots() to style this figure
plt.xlabel('\nCOB Dates')
plt.ylabel('Time of Delivery (GMT/BST as applicable)\n')
plt.title(" Data readiness time against COBs (Last 3 months)\n")
plt.rcParams["font.size"] = "12" #Set the default font size (applies to text created after this line)
# plt.rcParams["font.family"] = "Times New Roman" # Set the font type if needed
plt.tick_params(left = False, bottom = False, labelsize = 10) #Remove ticks, make tick labelsize 10
plt.box(False)
plt.show()
Output:
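For anyone after a shorter route, the conversion can probably be left to pandas instead of going through mdates.datestr2num. The following is only a sketch under some assumptions: the sample rows are copied from the question, COB and CALV14 are assumed to arrive as strings (or as values pd.to_datetime can parse), and the helper names prepare_frame and plot_delivery_times and the column delivery_time are made up for illustration.

```python
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates


def prepare_frame(frame):
    """Return a sorted copy with COB parsed as dates and CALV14 parsed as times."""
    out = frame.copy()
    out["COB"] = pd.to_datetime(out["COB"])
    # A bare time gets pandas' default date (1900-01-01); only HH:MM is displayed later.
    out["delivery_time"] = pd.to_datetime(out["CALV14"], format="%H:%M")
    return out.sort_values(["COB", "delivery_time"]).reset_index(drop=True)


def plot_delivery_times(frame):
    """Plot delivery time (y axis) against COB date (x axis)."""
    fig, ax = plt.subplots(figsize=(9, 5))
    ax.plot(frame["COB"], frame["delivery_time"], "o-")
    # Show the tick labels the way the values look in the dataframe.
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%b-%y"))
    ax.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    fig.autofmt_xdate()
    return fig, ax


# Sample rows from the question; a real frame would come from pd.read_sql.
sample = pd.DataFrame({
    "COB": ["2019-10-04", "2019-10-04", "2019-10-03", "2019-10-03"],
    "CALV14": ["07:04", "05:03", "16:03", "05:15"],
})
prepared = prepare_frame(sample)
fig, ax = plot_delivery_times(prepared)
```

In a script, finish with plt.show(); in a notebook the figure is rendered on its own. The tick positions are still chosen automatically here, so the HourLocator and DayLocator lines from the code above can be added if fixed spacing is wanted.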
Upvotes: 2
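The question also mentions turning the data into a histogram, which the code above does not cover. One possible way, sketched here under assumptions, is to convert each time to a fractional hour of the day and bin by hour. The helper names hours_of_day and plot_time_histogram are hypothetical, the sample times are the four rows from the question, and the times are assumed to be "HH:MM" strings.

```python
import pandas as pd
from matplotlib import pyplot as plt


def hours_of_day(times):
    """Convert 'HH:MM' strings to fractional hours, e.g. '07:30' becomes 7.5."""
    parsed = pd.to_datetime(pd.Series(times), format="%H:%M")
    return parsed.dt.hour + parsed.dt.minute / 60.0


def plot_time_histogram(times):
    """Draw a histogram of delivery times using one bin per hour of the day."""
    hours = hours_of_day(times)
    fig, ax = plt.subplots(figsize=(9, 5))
    # Bin edges at 0, 1, ..., 24 give 24 one-hour bins.
    counts, edges, _ = ax.hist(hours, bins=range(0, 25), edgecolor="white")
    ax.set_xticks(range(0, 25, 2))
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Number of deliveries")
    return fig, ax, counts


# Sample times from the question; a real list would be ob_frame['CALV14'].
sample_times = ["07:04", "05:03", "16:03", "05:15"]
fig, ax, counts = plot_time_histogram(sample_times)
```

As with any matplotlib figure, call plt.show() at the end of a script to display it. One-hour bins are only a starting point; with 30-40 rows, wider bins may read better.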