Reputation: 59
I have a csv file filled with 10 years of data, and I would like to extract two specific values from the year 2018 that is contained in that large file.
df = pd.read_csv(fn, parse_dates=[0])
df = df.set_index('date')
This was done to extract the last trading day of each month in 2018: (only the dates came out as output, with no other data)
start_date = '2018-01-01'
end_date = '2018-12-31'
months_of_2018 = pd.date_range(start_date, end_date, freq='BM')
print(months_of_2018)
I want to extract the data of the june and september values belonging to s_001. How can I achieve that? Below is an example of how a part of my csv file looks like.
date s_001 s_002
2018-01-31 0.0031 0.0025
2018-02-28 0.0024 0.0012
2018-03-30 0.0021 -0.0011
2018-04-30 -0.0010 0.0023
2018-05-31 0.0012 0.0028
2018-06-29 0.0015 0.0020
2018-07-31 0.0025 0.0015
2018-08-31 0.0017 0.0003
2018-09-28 -0.0025 0.0009
2018-10-31 0.0030 0.0013
2018-11-30 0.0018 0.0005
2018-12-31 0.0019 0.0010
Upvotes: 0
Views: 911
Reputation: 862431
Use DatetimeIndex.month
with Index.isin
and DataFrame.loc
for filtering by multiple month
s:
df = pd.read_csv(fn, index_col=[0], parse_dates=[0])
s = df.loc[df.index.month.isin([6,9]) & (df.index.year==2008), 's_001']
print (s)
date
2018-06-29 0.0015
2018-09-28 -0.0025
Name: s_001, dtype: float64
Or:
s = df.loc[df.index.strftime('%m-%Y').isin(['06-2018','09-2018']), 's_001']
Upvotes: 2