VB_WA

Reputation: 13

Multiple timeseries plots from Pandas Dataframe

I am attempting to write my first Python script using pandas. I have 10 years of wind data (1-minute readings) and need to create monthly plots, with both speed and direction shown on each plot.

The input csv data looks like this:

Date,Speed,Dir,
2014-01-01 00:00:00, 13, 179,
2014-01-01 00:01:00, 13, 178,
2014-01-01 00:02:00, 11, 169,
2014-01-01 00:03:00, 11, 178,
2014-01-01 00:04:00, 11, 181,

So far I have written the code below, which creates a plot for the single month set in the date range. I am generally happy with how this plot looks, except that I need to fix the x-axis labels.

I would like to loop through the whole dataset and create a PDF plot for each month. Any help with doing this would be appreciated!

import pandas as pd
import matplotlib.pyplot as plt

wind = pd.read_csv('2014.csv')

wind['Date'] = pd.to_datetime(wind['Date'])
wind = wind.set_index('Date')

dates = pd.date_range('2014-01', '2014-02', freq='1min')

#Reindex onto the full one-minute range for the month; missing readings become NaN
janwin = pd.Series(wind['Speed'], index=dates)
jandir = pd.Series(wind['Dir'], index=dates)

plt.figure(1)
plt.subplot(211)
plt.plot(dates, janwin)

plt.ylabel("Km/hr")
plt.rcParams.update({'font.size': 4})
plt.grid(which='major', alpha=0.5)


plt.subplot(212)
plt.plot(dates, jandir)
plt.ylabel("Degrees")
plt.rcParams.update({'font.size': 4})
plt.grid(which='major', alpha=0.5)   #alpha must be between 0 and 1
plt.ylim(0, 360)
plt.minorticks_on()   #plt.axis() does not accept a 'minor' argument

plt.savefig('test.pdf', dpi=900)

Sample Plot

Upvotes: 0

Views: 1803

Answers (2)

VB_WA

Reputation: 13

Many thanks to flyingmeatball for showing me how to loop through the data. I learnt a lot working through my first script; hopefully it's a useful reference for somebody else.

The code below reads in a CSV containing 1-minute averaged wind speed and direction data with a date/time field, and for each month plots a figure containing a time series of both speed and direction.

Edit: Since posting this I have noticed that the code below only plots data up to the first timestamp of the last day of the month (missing ~24 hours of data). This is because currMoEnd is a date only, so it is treated as midnight at the start of that day.

#Plots monthly wind speed data from 1min average recordings to PDF
import os
import calendar
import datetime

import pandas as pd
import matplotlib.pyplot as plt
from dateutil.relativedelta import relativedelta


data = pd.read_csv('data.csv')

data['Date'] = pd.to_datetime(data['Date'])

rawDf = pd.DataFrame(data, columns=['Date', 'Speed', 'Dir'])

#Set the font size once, before any figure is drawn
plt.rcParams.update({'font.size': 4})

#Define beginning and end of loop - start at first month, end at last month
currDate = datetime.date(rawDf['Date'].min().year, rawDf['Date'].min().month, 1)
endDate = datetime.date(rawDf['Date'].max().year, rawDf['Date'].max().month, 1)


#loop through and plot each month of data
while currDate <= endDate:

    currMoEnd = datetime.date(currDate.year, currDate.month, calendar.monthrange(currDate.year, currDate.month)[1])
    #Wrap the dates in pd.Timestamp so they compare cleanly with the datetime64 column
    wind = rawDf[(rawDf['Date'] >= pd.Timestamp(currDate)) & (rawDf['Date'] <= pd.Timestamp(currMoEnd))]
    wind = wind.set_index('Date')

    dates = pd.date_range(currDate, currMoEnd, freq='1min')

    #Reindex onto the full one-minute range; missing readings become NaN
    win = pd.Series(wind['Speed'], index=dates)
    dirc = pd.Series(wind['Dir'], index=dates)

    #Set figure size roughly to A4 paper size
    plt.figure(1, figsize=(11.3, 8))

    plt.subplot(211)
    plt.plot(dates, win, lw=0.15)
    plt.ylabel("Km/hr")
    plt.grid(which='major')

    plt.subplot(212)
    plt.plot(dates, dirc, lw=0.15)
    plt.ylabel("Degrees")
    plt.grid(which='major')
    plt.yticks([0, 45, 90, 135, 180, 225, 270, 315, 360])
    plt.ylim(0, 360)
    plt.minorticks_on()   #plt.axis() does not accept a 'minor' argument

    #Plot PDF to current directory/year/month Output.pdf (create the year folder if needed)
    os.makedirs(str(currDate.year), exist_ok=True)
    plt.savefig("{}/{} Output.pdf".format(currDate.year, currDate.month), dpi=900)
    plt.show()
    #Close the figure so the next month starts from a blank canvas
    plt.close(1)

    #increment current date
    currDate = currDate + relativedelta(months=1)
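
On the Edit note above: one possible way to keep the last day of each month is to filter on a half-open window, from the first of the month up to (but not including) the first of the next month. This is only a sketch under assumptions, not the code used in this answer; `month_window` is a made-up helper name and the data frame is synthetic.

```python
import pandas as pd


def month_window(first_day):
    """Return (start, next_start) timestamps for the month containing first_day."""
    start = pd.Timestamp(first_day).normalize().replace(day=1)
    next_start = start + pd.DateOffset(months=1)
    return start, next_start


# Synthetic one-minute readings that straddle the end of January 2014
idx = pd.date_range("2014-01-30 00:00", "2014-02-01 23:59", freq="1min")
df = pd.DataFrame({"Date": idx, "Speed": 10, "Dir": 180})

start, next_start = month_window("2014-01-01")

# '<' on the upper bound keeps every reading on the last day of the month
jan = df[(df["Date"] >= start) & (df["Date"] < next_start)]

# Matching one-minute index for the whole month, ending at 23:59 on the last day
dates = pd.date_range(start, next_start - pd.Timedelta(minutes=1), freq="1min")
```

In the loop, `start` and `next_start` would stand in for `currDate` and `currMoEnd`, with `<=` changed to `<` on the upper bound.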

Upvotes: 1

flyingmeatball

Reputation: 7997

Welcome to Stack Overflow. Typically, when you're asking for assistance with this kind of problem, it's best to work until you get stuck on a particular issue and then ask for help with that. It's very hard to tell you how to do something this broad, and often you won't get a good response, as it seems like you're just being lazy and asking for help instead of working all the way through to a specific problem. I see a number of issues you need to tackle, but broadly you need to set up a loop, figure out how to start and stop it, and work out how to plot only the data for the month you're currently interested in.

Below is some sample code I wrote quickly from memory (it hasn't been run). I'm sure there is a better way to do this, but hopefully it will get you on the right track. In the future, you'll get the best responses if you can distill your post down to its basic parts. In this case, a sample dataframe of two months of daily data would have been helpful for getting the iteration and plotting right. You can then take the working code and adjust it to minute data.

If this is helpful, please give it a thumbs up and work to make sure the final code listed here is useful to those who follow you.

import pandas as pd
import matplotlib.pyplot as plt
import datetime
from dateutil.relativedelta import relativedelta
import calendar

#wind = pd.read_csv('2014.csv')
data = [['2014-01-01 00:00:00', 13, 179],
        ['2014-01-01 00:01:00', 13, 178],
        ['2014-01-01 00:02:00', 11, 169],
        ['2014-01-01 00:03:00', 11, 178],
        ['2014-01-01 00:04:00', 11, 181]]

rawDf = pd.DataFrame(data, columns=['Date', 'Speed', 'Dir'])

rawDf['Date'] = pd.to_datetime(rawDf['Date'])

#Define beginning and end of loop - start at first month, end at last month
currDate = datetime.date(rawDf['Date'].min().year, rawDf['Date'].min().month, 1)
endDate = datetime.date(rawDf['Date'].max().year, rawDf['Date'].max().month, 1)


#loop
while currDate <= endDate:

    currMoEnd = datetime.date(currDate.year, currDate.month, calendar.monthrange(currDate.year, currDate.month)[1])
    #Wrap the dates in pd.Timestamp so they compare cleanly with the datetime64 column
    wind = rawDf[(rawDf['Date'] >= pd.Timestamp(currDate)) & (rawDf['Date'] <= pd.Timestamp(currMoEnd))]
    wind = wind.set_index('Date')

    dates = pd.date_range(currDate, currMoEnd, freq='1min')

    janwin = pd.Series(wind['Speed'], index=dates)
    jandir = pd.Series(wind['Dir'], index=dates)

    plt.figure(1)
    plt.subplot(211)
    plt.plot(dates, janwin)

    plt.ylabel("Km/hr")
    plt.rcParams.update({'font.size': 4})
    plt.grid(which='major', alpha=0.5)


    plt.subplot(212)
    plt.plot(dates, jandir)
    plt.ylabel("Degrees")
    plt.rcParams.update({'font.size': 4})
    plt.grid(which='major', alpha=0.5)   #alpha must be between 0 and 1
    plt.ylim(0, 360)
    plt.minorticks_on()   #plt.axis() does not accept a 'minor' argument

    #Save before show(); once the window is closed the saved figure can come out blank
    plt.savefig('{0}_output.pdf'.format(currDate.strftime('%Y-%m')), dpi=900)
    plt.show()
    #Close the figure so the next month starts from a blank canvas
    plt.close(1)

    currDate = currDate + relativedelta(months=1)
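
On the "better way" mentioned above: one alternative is to let pandas split the data by calendar month with `groupby`, which removes the manual start/end bookkeeping and includes the whole last day of each month. The following is a minimal sketch, not the method used in either answer; `plot_months` is a hypothetical helper, the hourly data is randomly generated, and the per-day tick labels are just one guess at what "fixing the x-axis labels" might mean.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are only written to files
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_months(df, out_dir):
    """Write one PDF per calendar month; df needs a DatetimeIndex plus 'Speed' and 'Dir'."""
    paths = []
    # to_period("M") labels every row with its year-month, e.g. 2014-01
    for period, month in df.groupby(df.index.to_period("M")):
        fig, (ax_speed, ax_dir) = plt.subplots(2, 1, sharex=True, figsize=(11.3, 8))
        ax_speed.plot(month.index, month["Speed"], lw=0.15)
        ax_speed.set_ylabel("Km/hr")
        ax_speed.grid(which="major", alpha=0.5)
        ax_dir.plot(month.index, month["Dir"], lw=0.15)
        ax_dir.set_ylabel("Degrees")
        ax_dir.set_ylim(0, 360)
        ax_dir.grid(which="major", alpha=0.5)
        # One major tick per day, labelled with the day of the month
        ax_dir.xaxis.set_major_locator(mdates.DayLocator())
        ax_dir.xaxis.set_major_formatter(mdates.DateFormatter("%d"))
        path = os.path.join(out_dir, "{}_output.pdf".format(str(period)))
        fig.savefig(path)
        plt.close(fig)  # free the figure before moving to the next month
        paths.append(path)
    return paths


# Synthetic two-month sample (hourly, to keep the demo small)
idx = pd.date_range("2014-01-01", "2014-02-28 23:00", freq="60min")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"Speed": rng.integers(0, 40, len(idx)),
                     "Dir": rng.integers(0, 360, len(idx))}, index=idx)

paths = plot_months(demo, tempfile.mkdtemp())
```

With the real file, the frame would come from `pd.read_csv(...)` followed by `set_index('Date')` after converting the column with `pd.to_datetime`, and `out_dir` would be whatever folder the PDFs should go in.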

Upvotes: 1
