user1234440
user1234440

Reputation: 23567

Subset Pandas Dataframe Everything up to a Date

I have a dataframe with the following data:

            ACCOCI_ARQ  ASSETSAVG_ART  ASSETSC_ARQ  ASSETSNC_ARQ  ASSETS_ARQ  
Date                                                                           
2004-02-10   -31000000     6647000000   6029000000     942000000  6971000000   
2004-03-27         NaN            NaN          NaN           NaN         NaN   
2004-05-06   -10000000     6740500000   5784000000     951000000  6735000000   
2004-06-26         NaN            NaN          NaN           NaN         NaN   
2004-08-05   -18000000     6936000000   6286000000     937000000  7223000000  

I am given a date in pandas own Timestamp object. This date may or may not be inside the dataframe. How do you go ahead and subset the dataframe to get everything up to that specific date? (also how to do you so up to but not including that date?

I tried a whole sort of different ways like .ix, .iloc, etc but wasn't able to get one working.

Upvotes: 3

Views: 101

Answers (1)

Anand S Kumar
Anand S Kumar

Reputation: 90929

You can use boolean indexing with .loc . Example for taking upto a particular timestamp (not including that timestamp) -

df.loc[df.index < ts] #ts is the timestamp

If you want to include the timestamp ,you can do -

df.loc[df.index <= ts]

Demo -

In [14]: df
Out[14]:
        Date  ACCOCI_ARQ  ASSETSAVG_ART  ASSETSC_ARQ  ASSETSNC_ARQ  ASSETS_ARQ
1 2004-02-10   -31000000     6647000000   6029000000     942000000  6971000000
2 2004-03-27         NaN            NaN          NaN           NaN         NaN
3 2004-05-06   -10000000     6740500000   5784000000     951000000  6735000000
4 2004-06-26         NaN            NaN          NaN           NaN         NaN
5 2004-08-05   -18000000     6936000000   6286000000     937000000  7223000000

In [15]: ts = pd.to_datetime('2004-05-06')

In [19]: df = df.set_index('Date')

In [20]: df
Out[20]:
            ACCOCI_ARQ  ASSETSAVG_ART  ASSETSC_ARQ  ASSETSNC_ARQ  ASSETS_ARQ
Date
2004-02-10   -31000000     6647000000   6029000000     942000000  6971000000
2004-03-27         NaN            NaN          NaN           NaN         NaN
2004-05-06   -10000000     6740500000   5784000000     951000000  6735000000
2004-06-26         NaN            NaN          NaN           NaN         NaN
2004-08-05   -18000000     6936000000   6286000000     937000000  7223000000

In [21]: df.loc[df.index < ts]
Out[21]:
            ACCOCI_ARQ  ASSETSAVG_ART  ASSETSC_ARQ  ASSETSNC_ARQ  ASSETS_ARQ
Date
2004-02-10   -31000000     6647000000   6029000000     942000000  6971000000
2004-03-27         NaN            NaN          NaN           NaN         NaN

In [22]: df.loc[df.index <= ts]
Out[22]:
            ACCOCI_ARQ  ASSETSAVG_ART  ASSETSC_ARQ  ASSETSNC_ARQ  ASSETS_ARQ
Date
2004-02-10   -31000000     6647000000   6029000000     942000000  6971000000
2004-03-27         NaN            NaN          NaN           NaN         NaN
2004-05-06   -10000000     6740500000   5784000000     951000000  6735000000

Upvotes: 2

Related Questions