Ratnesh

Reputation: 1700

Plotting a histogram for a column by grouping two columns in pandas

I am new to pandas and matplotlib. I have a CSV file that covers the years 2012 to 2018, with rainfall data for each month of every year. I want to use a histogram to analyze which month of each year has the maximum rainfall. Here is my dataset:

year    month  Temp Rain
2012    1       10  100
2012    2       20  200
2012    3       30  300
..      ..      ..  ..
2012    12      40  400
2013    1       50  300
2013    2       60  200
..      ..      ..  ..
2018    12      70  400

I could not plot this as a histogram. I tried a bar plot instead, but did not get the desired result. Here is what I tried:

import pandas as pd
import numpy as npy
import matplotlib.pyplot as plt
df2=pd.read_csv('Monthly.csv')
df2.groupby(['year','month'])['Rain'].count().plot(kind="bar",figsize=(20,10))

Here is the output I got: [image]

Please suggest an approach for plotting a histogram that shows which month has the maximum rainfall, grouped by year.

Upvotes: 0

Views: 1253

Answers (3)

DataBach

Reputation: 1633

First group by year and month, as you already did, but keep the maximum rainfall instead of the count.

series_df2 = df2.groupby(['year','month'], sort=False)['Rain'].max()

Then unstack the series, transpose it and plot it.

series_df2.unstack().T.plot(kind='bar', subplots=False, layout=(2,2))

This will give you an output that looks like this for your sample data:

[image]
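To make the unstack-and-transpose step concrete, here is a minimal sketch on a small made-up sample in the same shape as the question's CSV. The values, the `by_year`/`by_month` names and the `Agg` backend are assumptions for illustration, not part of the original data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample in the same shape as the question's CSV (values invented).
df2 = pd.DataFrame({
    "year":  [2012, 2012, 2012, 2013, 2013, 2013],
    "month": [1, 2, 3, 1, 2, 3],
    "Rain":  [100, 200, 300, 300, 200, 150],
})

# One value per (year, month); unstack() moves the inner level (month) to columns,
# so rows are years and columns are months.
by_year = df2.groupby(["year", "month"])["Rain"].max().unstack()

# Transpose so months are the rows (x axis) and each year becomes its own bar series.
by_month = by_year.T
ax = by_month.plot(kind="bar", figsize=(10, 5))
ax.set_ylabel("Rain")
```

With months on the x axis and one bar per year inside each month, the tallest bar of each colour marks that year's wettest month.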

Upvotes: 0

JoergVanAken

Reputation: 1286

You probably don't want to plot the count per group, but rather the value itself:

df2.groupby(['year','month'])['Rain'].first().plot(kind="bar",figsize=(20,10))

or maybe the total per month across all years:

df2.groupby(['month'])['Rain'].sum().plot(kind="bar",figsize=(20,10))
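If the goal is to read off the wettest month directly rather than from a chart, `idxmax()` may help. This is only a sketch on made-up values; the names `wettest_rows`, `monthly_total` and `wettest_month_overall` are hypothetical.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's CSV (values invented).
df2 = pd.DataFrame({
    "year":  [2012, 2012, 2012, 2013, 2013, 2013],
    "month": [1, 2, 3, 1, 2, 3],
    "Rain":  [100, 200, 300, 300, 200, 150],
})

# idxmax() returns, for each year, the row label of the largest Rain value;
# .loc then pulls those rows out of the original frame.
wettest_rows = df2.loc[df2.groupby("year")["Rain"].idxmax(), ["year", "month", "Rain"]]

# Total rain per calendar month across all years, and the month with the largest total.
monthly_total = df2.groupby("month")["Rain"].sum()
wettest_month_overall = monthly_total.idxmax()
```

`wettest_rows` has one row per year, and `monthly_total.plot(kind="bar")` would draw the same chart as the second snippet above.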

Upvotes: 1

Frenchy

Reputation: 17007

You are close to the solution. Use max() instead of count():

df2.groupby(['year','month'])['Rain'].max().plot(kind="bar",figsize=(20,10))
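The reason count() is not useful here can be checked on a small sample. The sketch below assumes one row per (year, month), as in the question's excerpt, and uses invented values.

```python
import pandas as pd

# Hypothetical sample with exactly one row per (year, month) (values invented).
df2 = pd.DataFrame({
    "year":  [2012, 2012, 2012, 2013, 2013, 2013],
    "month": [1, 2, 3, 1, 2, 3],
    "Rain":  [100, 200, 300, 300, 200, 150],
})

grouped = df2.groupby(["year", "month"])["Rain"]
counts = grouped.count()  # number of rows in each (year, month) group
values = grouped.max()    # the Rain reading in each group
```

Under that assumption every entry of `counts` is 1, so a bar plot of it shows bars of equal height, while `values` carries the actual rainfall. With a single row per group, max(), first() and sum() all return the same number.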

Upvotes: 1
