Hira Tanveer
Hira Tanveer

Reputation: 375

Why is df.loc giving empty dataframe?

So i have a xyz.csv like:

Date       |value | 
2013-01-03 | 13   | 
2013-02-22 | 14   | 
2014-03-03 | 16   | 
2011-04-03 | 15   | 
2013-05-03 | 13   | 
2013-06-06 | 14   | 
2014-08-03 | 16   | 
2011-09-03 | 15   | 
2013-10-03 | 13   | 
2013-11-06 | 14   |
2014-12-03 | 16   |

I want to split data from oct-march in one data frame, and april-sept in another (irrespective of the year).

My code:

import pandas as pd

df = pd.read_csv("xyz.csv")
pd.set_option('display.max_rows', None)
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')

oct_march = df.loc[(pd.DatetimeIndex(df['Date']).month<4) | (pd.DatetimeIndex(df['Date']).month>9)]
print(oct_march)

apr_sep = df.loc[(pd.DatetimeIndex(df['Date']).month<10) | (pd.DatetimeIndex(df['Date']).month>3)]
# print(apr_sep)

# oct_march.to_csv('oct_march.csv', index=False, encoding='utf-8')
# apr_sep.to_csv('apr_sep.csv', index=False, encoding='utf-8')

On printing 'oct_march' i get an empty dataframe. althought there are values in 'df' for the respective months.

My Output:

Empty DataFrame
Columns: [Date, value]
Index: []

Upvotes: 1

Views: 1034

Answers (2)

Soumendra Mishra
Soumendra Mishra

Reputation: 3653

You can try this:

df = pd.read_csv("file.csv", sep="|")
print(df)

dfAprSep = pd.DataFrame()
dfAprSep = df.loc[pd.DatetimeIndex(df['Date']).month.isin([4,5,6,7,8,9])]
print(dfAprSep)

dfOctMar = pd.DataFrame()
dfOctMar = df.loc[pd.DatetimeIndex(df['Date']).month.isin([10,11,12,1,2,3])]
print(dfOctMar)

Upvotes: 0

Trenton McKinney
Trenton McKinney

Reputation: 62403

  • Make certain the 'Date' column is a datetime format, by using pd.to_datetime
    • Use the .dt accessor with .month, to extract the month number from a datetime column.
    • pd.DatetimeIndex is not needed to extract month from a datetime column.
  • Use Boolean Indexing to mask and select the correct data by using .isin.
import pandas as pd

data = {'Date': ['2013-01-03', '2013-02-22', '2014-03-03', '2011-04-03', '2013-05-03', '2013-06-06', '2014-08-03', '2011-09-03', '2013-10-03', '2013-11-06', '2014-12-03'],
        'value': [13, 14, 16, 15, 13, 14, 16, 15, 13, 14, 16]}

df = pd.DataFrame(data)

# make sure Date is a datetime
df.Date= pd.to_datetime(df.Date)

# create a Boolean mask for Oct through March with .isin
# use the .dt accessor with a datetime column
om_mask = df.Date.dt.month.isin([1, 2, 3, 10, 11, 12])

# select only True from the mask
oct_mar = df[om_mask]

# select only False from the mask
apr_sep = df[~om_mask]

oct_mar

         Date  value
0  2013-01-03     13
1  2013-02-22     14
2  2014-03-03     16
8  2013-10-03     13
9  2013-11-06     14
10 2014-12-03     16

apr_sep

        Date  value
3 2011-04-03     15
4 2013-05-03     13
5 2013-06-06     14
6 2014-08-03     16
7 2011-09-03     15

Upvotes: 1

Related Questions