Reputation: 73
I am new to pandas. I have the script below, which pulls time series data from an Excel file and sets the dates as the index. I then want to perform various calculations on the data, referencing rows by date. Script:
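import pandas as pd
df = pd.read_excel("myfile.xls")
# set_index moves the "Date" column into the index and drops it from the columns
df = df.set_index("Date")
# remove the index label so the header row prints cleanly
df.index.name = None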
df.head()
The output of that (to give you a sense of the data) is:
Px1 Px2 Px3 Px4 Px5 Px6 Px7
2015-08-12 19.850000 10.25 7.88 10.90 109.349998 106.650002 208.830002
2015-08-11 19.549999 10.16 7.81 10.88 109.419998 106.690002 208.660004
2015-08-10 19.260000 10.07 7.73 10.79 109.059998 105.989998 210.630005
2015-08-07 19.240000 10.08 7.69 10.92 109.199997 106.430000 207.919998
2015-08-06 19.250000 10.09 7.76 10.96 109.010002 106.010002 208.350006
When I retrieve data for a single date, e.g. df.loc['20150806'], it works. But when I try to retrieve a slice, e.g. df.loc['20150806':'20150812'], I get an empty DataFrame.
For reference, the index is a DatetimeIndex with dtype='datetime64[ns]', length=1412, freq=None, tz=None.
My ultimate goal is to group the data by day, month, year, and other periods, and to perform calculations on each group. I mention that only for context; I don't want to get into it here, since I'm clearly stuck on something more basic, perhaps a misunderstanding of how to work with a DatetimeIndex.
Thank you.
EDIT: I meant to also include this: I think the indexing problem has something to do with freq=None, because when I tried simpler examples with a contiguous date series, I did not have this problem.
Upvotes: 5
Views: 9557
Reputation: 73
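The index was sorted in descending order (newest date first), which is why a slice with ascending bounds came back empty. Both df.loc['2015-08-12':'2015-08-10'] and df.loc['2015-08-10':'2015-08-12':-1] work. Alternatively, after df = df.sort_index(), slicing the way I was originally trying also works. Thank you all. I was missing the forest for the trees there, I think.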
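For anyone landing here later, this is a minimal sketch of the sort-then-slice approach. The sample frame is hypothetical (five rows of Px1 copied from the output in the question, rounded), and the variable names are just for illustration. It only shows slicing after sort_index(), since slicing an unsorted or descending index may behave differently depending on the pandas version.

```python
import pandas as pd

# Hypothetical sample mirroring the question: dates stored newest first.
dates = pd.to_datetime(
    ["2015-08-12", "2015-08-11", "2015-08-10", "2015-08-07", "2015-08-06"]
)
df = pd.DataFrame({"Px1": [19.85, 19.55, 19.26, 19.24, 19.25]}, index=dates)

# The index is descending, so it is not monotonic increasing.
was_ascending = df.index.is_monotonic_increasing

# Sort the index into ascending (oldest first) order.
df_sorted = df.sort_index()

# With an ascending index, start:end label slices work and include both ends.
full_window = df_sorted.loc["2015-08-06":"2015-08-12"]
partial_window = df_sorted.loc["2015-08-07":"2015-08-11"]

print(full_window)
print(partial_window)
```

Checking df.index.is_monotonic_increasing before slicing is a cheap way to catch this; if it is False, calling sort_index() first avoids the surprise.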
Upvotes: 1