programmernoob

Reputation: 47

Using pandas dataframe and matplotlib to manipulate data from a csv file into a plot

Here is what I'm trying to do: build a dataframe with a datetime index created from column 0. Then use the resample function over a quarterly period to create a plot showing the quarterly precipitation totals over the 14-year period.

For the second plot, show the average monthly precipitation and the monthly standard deviation, with both values on the same axes.

Here's my code so far:

    %matplotlib inline
    import pandas as pd
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    plt.style.use('seaborn-whitegrid')
    matplotlib.rcParams['figure.figsize'] = (10.0, 4.0)

    df = pd.read_csv("ColumbusPrecipData.csv")
    df.set_index("date", inplace = True)
    #df['date'] = pd.to_datetime(df[['']])

    print(df)
    #build plots
    #axes = plt.subplot()

    #start = pd.to_datetime
    #end = pd.to_datetime
    #axes.set_xlim(start, end)
    #axes.set_title("")
    #axes.set_ylabel("")
    #axes.tick_params(axis='x', rotation=45)
    #axes.legend(loc='best')

Here's what the dataframe looks like:

                        Unnamed: 0  Precip
       0       2000-01-01 01:00:00     0.0
       1       2000-01-01 02:00:00     0.0
       2       2000-01-01 03:00:00     0.0
       3       2000-01-01 04:00:00     0.0
       4       2000-01-01 05:00:00     0.0
       5       2000-01-01 06:00:00     0.0
       6       2000-01-01 07:00:00     0.0
       7       2000-01-01 08:00:00     0.0
       8       2000-01-01 09:00:00     0.0
       9       2000-01-01 10:00:00     0.0
       10      2000-01-01 11:00:00     0.0
       11      2000-01-01 12:00:00     0.0
       12      2000-01-01 13:00:00     0.0
       13      2000-01-01 14:00:00     0.0
       14      2000-01-01 15:00:00     0.0
       15      2000-01-01 16:00:00     0.0
       16      2000-01-01 17:00:00     0.0
       17      2000-01-01 18:00:00     0.0
       18      2000-01-01 19:00:00     0.0
       19      2000-01-01 20:00:00     0.0
       20      2000-01-01 21:00:00     0.0
       21      2000-01-01 22:00:00     0.0
       22      2000-01-01 23:00:00     0.0
       23      2000-01-02 00:00:00     0.0
       24      2000-01-02 01:00:00     0.0
       25      2000-01-02 02:00:00     0.0
       26      2000-01-02 03:00:00     0.0
       27      2000-01-02 04:00:00     0.0
       28      2000-01-02 05:00:00     0.0
       29      2000-01-02 06:00:00     0.0
       ...                     ...     ...
       122696  2013-12-30 09:00:00     0.0
       122697  2013-12-30 10:00:00     0.0
       122698  2013-12-30 11:00:00     0.0
       122699  2013-12-30 12:00:00     0.0
       122700  2013-12-30 13:00:00     0.0
       122701  2013-12-30 14:00:00     0.0
       122702  2013-12-30 15:00:00     0.0
       122703  2013-12-30 16:00:00     0.0
       122704  2013-12-30 17:00:00     0.0
       122705  2013-12-30 18:00:00     0.0
       122706  2013-12-30 19:00:00     0.0
       122707  2013-12-30 20:00:00     0.0
       122708  2013-12-30 21:00:00     0.0
       122709  2013-12-30 22:00:00     0.0
       122710  2013-12-30 23:00:00     0.0
       122711  2013-12-31 00:00:00     0.0
       122712  2013-12-31 01:00:00     0.0
       122713  2013-12-31 02:00:00     0.0
       122714  2013-12-31 03:00:00     0.0
       122715  2013-12-31 04:00:00     0.0
       122716  2013-12-31 05:00:00     0.0
       122717  2013-12-31 06:00:00     0.0
       122718  2013-12-31 07:00:00     0.0
       122719  2013-12-31 08:00:00     0.0
       122720  2013-12-31 09:00:00     0.0
       122721  2013-12-31 10:00:00     0.0
       122722  2013-12-31 11:00:00     0.0
       122723  2013-12-31 12:00:00     0.0
       122724  2013-12-31 13:00:00     0.0
       122725  2013-12-31 14:00:00     0.0

       [122726 rows x 2 columns]

Upvotes: 0

Views: 63

Answers (1)

oreopot

Reputation: 3450

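The timestamps are in the column that pandas labelled "Unnamed: 0", so there is no "date" column to set as the index yet. Rename that column first, then build a DatetimeIndex from it:

    df = df.rename(columns={"Unnamed: 0": "date"})
    df = df.set_index(pd.DatetimeIndex(df["date"]))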
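
As an alternative, the datetime index can probably be built while reading the file. This is a minimal sketch, assuming the first CSV column holds the timestamps under an empty header (which is what the "Unnamed: 0" label suggests). The inline sample data is made up for illustration and stands in for ColumbusPrecipData.csv:

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for ColumbusPrecipData.csv: an unnamed
# first column of timestamps followed by a "Precip" column.
sample_csv = io.StringIO(
    ",Precip\n"
    "2000-01-01 01:00:00,0.0\n"
    "2000-01-01 02:00:00,0.1\n"
    "2000-01-01 03:00:00,0.2\n"
)

# index_col=0 uses the first column as the index;
# parse_dates=True asks pandas to parse that index as datetimes.
df = pd.read_csv(sample_csv, index_col=0, parse_dates=True)
df.index.name = "date"

print(type(df.index).__name__)
print(df)
```

With the real file, the StringIO object would be replaced by the file path.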

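Then group by month and plot the monthly mean. Selecting the Precip column keeps any leftover text column out of the average:

    # 'M' bins by calendar month end; pandas 2.2+ prefers the alias 'ME'
    df1 = df["Precip"].groupby(pd.Grouper(freq="M")).mean()
    plt.plot(df1)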
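
For the two plots the question asks about (quarterly totals, then monthly mean and standard deviation on shared axes), a rough sketch might look like the following. It runs on synthetic hourly data, which is an assumption made so the example is self-contained. It also reads "monthly average" as the mean of the hourly values within each month, which is only one possible interpretation. Offset objects are used instead of string aliases because the alias spelling differs between pandas versions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic hourly data standing in for the real file (illustration only).
rng = np.random.default_rng(0)
index = pd.date_range(
    "2000-01-01 01:00", "2001-12-31 23:00", freq=pd.offsets.Hour()
)
df = pd.DataFrame({"Precip": rng.random(len(index)) * 0.05}, index=index)

# Quarterly totals: one bin per calendar quarter, summed.
quarterly_total = df["Precip"].resample(
    pd.offsets.QuarterEnd(startingMonth=12)
).sum()

# Monthly mean and standard deviation of the hourly values.
monthly = df["Precip"].resample(pd.offsets.MonthEnd()).agg(["mean", "std"])

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))

# First plot: quarterly precipitation totals.
ax1.plot(quarterly_total.index, quarterly_total.values, marker="o")
ax1.set_title("Quarterly precipitation total")
ax1.set_ylabel("Precip")

# Second plot: monthly mean and standard deviation on the same axes.
ax2.plot(monthly.index, monthly["mean"], label="monthly mean")
ax2.plot(monthly.index, monthly["std"], label="monthly std dev")
ax2.set_title("Monthly mean and standard deviation")
ax2.set_ylabel("Precip")
ax2.tick_params(axis="x", rotation=45)
ax2.legend(loc="best")

fig.tight_layout()
```

With the real data, df would be the frame indexed by date from the steps above, and the synthetic block at the top would be dropped.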

Upvotes: 1
