Reputation: 375
So i have a xyz.csv like:
Date |value |
2013-01-03 | 13 |
2013-02-22 | 14 |
2014-03-03 | 16 |
2011-04-03 | 15 |
2013-05-03 | 13 |
2013-06-06 | 14 |
2014-08-03 | 16 |
2011-09-03 | 15 |
2013-10-03 | 13 |
2013-11-06 | 14 |
2014-12-03 | 16 |
I want to split data from oct-march in one data frame, and april-sept in another (irrespective of the year).
My code:
import pandas as pd
df = pd.read_csv("xyz.csv")
pd.set_option('display.max_rows', None)
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
oct_march = df.loc[(pd.DatetimeIndex(df['Date']).month<4) | (pd.DatetimeIndex(df['Date']).month>9)]
print(oct_march)
apr_sep = df.loc[(pd.DatetimeIndex(df['Date']).month<10) | (pd.DatetimeIndex(df['Date']).month>3)]
# print(apr_sep)
# oct_march.to_csv('oct_march.csv', index=False, encoding='utf-8')
# apr_sep.to_csv('apr_sep.csv', index=False, encoding='utf-8')
On printing 'oct_march' i get an empty dataframe. althought there are values in 'df' for the respective months.
My Output:
Empty DataFrame
Columns: [Date, value]
Index: []
Upvotes: 1
Views: 1034
Reputation: 3653
You can try this:
df = pd.read_csv("file.csv", sep="|")
print(df)
dfAprSep = pd.DataFrame()
dfAprSep = df.loc[pd.DatetimeIndex(df['Date']).month.isin([4,5,6,7,8,9])]
print(dfAprSep)
dfOctMar = pd.DataFrame()
dfOctMar = df.loc[pd.DatetimeIndex(df['Date']).month.isin([10,11,12,1,2,3])]
print(dfOctMar)
Upvotes: 0
Reputation: 62403
'Date'
column is a datetime
format, by using pd.to_datetime
.dt
accessor with .month
, to extract the month number from a datetime column.pd.DatetimeIndex
is not needed to extract month from a datetime column..isin
.
isin
~
is a logical not
..loc
is not needed with Boolean indexing.import pandas as pd
data = {'Date': ['2013-01-03', '2013-02-22', '2014-03-03', '2011-04-03', '2013-05-03', '2013-06-06', '2014-08-03', '2011-09-03', '2013-10-03', '2013-11-06', '2014-12-03'],
'value': [13, 14, 16, 15, 13, 14, 16, 15, 13, 14, 16]}
df = pd.DataFrame(data)
# make sure Date is a datetime
df.Date= pd.to_datetime(df.Date)
# create a Boolean mask for Oct through March with .isin
# use the .dt accessor with a datetime column
om_mask = df.Date.dt.month.isin([1, 2, 3, 10, 11, 12])
# select only True from the mask
oct_mar = df[om_mask]
# select only False from the mask
apr_sep = df[~om_mask]
oct_mar
Date value
0 2013-01-03 13
1 2013-02-22 14
2 2014-03-03 16
8 2013-10-03 13
9 2013-11-06 14
10 2014-12-03 16
apr_sep
Date value
3 2011-04-03 15
4 2013-05-03 13
5 2013-06-06 14
6 2014-08-03 16
7 2011-09-03 15
Upvotes: 1