Saa17

Reputation: 265

Using numpy, how do you calculate snowfall per month?

I have a data set with daily snowfall records for one year. The Date variable is in YYYYMMDD form.

Date      Snow
20010101  0
20010102  10
20010103  5
20010104  3
20010105  0
...
20011231  0

The actual data is here

https://github.com/emily737373/emily737373/blob/master/COX_SNOW-1.csv

I want to calculate the number of days it snowed each month. I know how to do this with pandas, but for a school project I need to do it using only numpy. I cannot import datetime either; it must be done with numpy alone.

The output should be in this form

    Month     # days snowed
    January   13   
    February  19
    March     20
    ...
    December  15

My question is: how do I count only the days it snowed (that is, the days when the Snow variable is not 0) without having to do it separately for each month?

Upvotes: 0

Views: 146

Answers (1)

Ralubrusto

Reputation: 1501

I hope you can use built-in packages such as datetime, because it is useful when working with dates.

import numpy as np
import datetime as dt

# Load every column as strings so the Date column can be parsed with strptime
data = np.genfromtxt('test_files/COX_SNOW-1.csv', delimiter=',', skip_header=1, dtype=str)

# Month number (1-12) of each record
months = np.array([dt.datetime.strptime(d, "%Y%m%d").month for d in data[:, 0]])
snow = data[:, 1].astype(np.int32)

has_snowed = snow > 0

for month in range(1, 13):
    month_str = dt.datetime(year=1, month=month, day=1).strftime('%B')
    # Number of days in this month with nonzero snowfall
    days_snowed = np.count_nonzero(has_snowed & (months == month))
    print(month_str, ':', days_snowed)

I loaded the data as str so that the Date column can be parsed as dates later on. That is also why the snow column has to be converted explicitly to int32; otherwise the > comparison against 0 won't work.

The output is as follows:

January : 13
February : 19
March : 20
April : 13
May : 8
June : 9
July : 2
August : 7
September : 9
October : 19
November : 16
December : 15
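
If datetime really is off the table, the month can also be pulled out of a YYYYMMDD integer with plain arithmetic, and the per-month tally can be done in one call instead of a loop over months. The following is only a minimal sketch under some assumptions: the helper name snow_days_per_month and the tiny sample arrays are made up for illustration (they are not the real CSV), and the dates are assumed to be valid integers in YYYYMMDD form.

```python
import numpy as np

MONTH_NAMES = np.array([
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
])

def snow_days_per_month(dates, snow):
    """Count days with nonzero snowfall per month (index 0 = January).

    Hypothetical helper; `dates` are integers in YYYYMMDD form.
    """
    dates = np.asarray(dates, dtype=np.int64)
    snow = np.asarray(snow)
    # YYYYMMDD -> MM: drop the two day digits, then keep the last two digits.
    months = (dates // 100) % 100
    # Keep the month numbers of snowy days only, then tally them.
    # minlength=13 reserves a slot for every month; slot 0 is unused.
    counts = np.bincount(months[snow > 0], minlength=13)
    return counts[1:]

# Made-up sample data for illustration, not the real file.
sample_dates = np.array([20010101, 20010102, 20010103, 20010201, 20010202, 20011231])
sample_snow = np.array([0, 10, 5, 3, 0, 0])

sample_counts = snow_days_per_month(sample_dates, sample_snow)
for name, n in zip(MONTH_NAMES, sample_counts):
    print(name, ":", n)
```

For the real file, the two columns would presumably come from the same np.genfromtxt call as above, loaded with an integer dtype instead of str, and then be passed to the helper.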

Let me know if this worked for you or if you have any further questions.

Upvotes: 2
