Drop zero values at the beginning of a time series in pandas

Question

I'm using Python 3.6. I have this kind of time series with class 'pandas.core.frame.DataFrame':

              value
 index
2019-01-01    0
2019-02-01    0
2019-03-01    1577
2019-04-01    1715
2019-05-01    1787
2019-06-01    0
2019-07-01    1787

I want to delete the first two rows but not the one corresponding to June 2019. The expected output is:

              value
 index
2019-03-01    1577
2019-04-01    1715
2019-05-01    1787
2019-06-01    0
2019-07-01    1787

I can't use iterrows() because the data is a pandas time series.

rafaelc · Accepted Answer

If you only need to remove these two specific rows, you can simply do df.iloc[2:]

For a generalized solution, you can use cumprod:

df.loc[~df.value.eq(0).cumprod().astype(bool)]

           value
2019-03-01   1577
2019-04-01   1715
2019-05-01   1787
2019-06-01      0
2019-07-01   1787

Detail: eq(0) marks each zero as True, and cumprod stays at 1 only while every value so far has been zero:

>>> df.value.eq(0).cumprod()

2019-01-01    1
2019-02-01    1
2019-03-01    0
2019-04-01    0
2019-05-01    0
2019-06-01    0
2019-07-01    0
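
To make the steps above reproducible, here is a minimal sketch that rebuilds the frame from the question and applies the cumprod mask one step at a time. The monthly index built with pd.date_range and the intermediate names (is_zero, leading, trimmed) are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed month-start index).
df = pd.DataFrame(
    {"value": [0, 0, 1577, 1715, 1787, 0, 1787]},
    index=pd.date_range("2019-01-01", periods=7, freq="MS"),
)

# True wherever the value is exactly 0.
is_zero = df["value"].eq(0)

# The running product stays at 1 only while every value so far is zero,
# so this flags just the leading run of zeros.
leading = is_zero.cumprod().astype(bool)

# Keep every row that is not part of the leading run.
trimmed = df.loc[~leading]

print(trimmed)
```

The zero in June survives because the running product has already dropped to 0 by March and can never return to 1.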

An alternative (likely nicer) solution, as suggested by @user3483203:

df.loc[df['value'].ne(0).idxmax():]

           value
2019-03-01   1577
2019-04-01   1715
2019-05-01   1787
2019-06-01      0
2019-07-01   1787
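
One edge case may be worth guarding: if every value is zero, ne(0) is False everywhere and idxmax returns the first label, so the slice keeps all rows instead of dropping them. The sketch below wraps the idxmax approach in a hypothetical helper, drop_leading_zeros (a name invented here, not from the answer), and assumes a sorted index with unique labels so that label slicing with .loc is unambiguous.

```python
import pandas as pd

def drop_leading_zeros(frame, column="value"):
    """Hypothetical helper: drop rows before the first non-zero entry in `column`."""
    nonzero = frame[column].ne(0)
    if not nonzero.any():
        # idxmax would return the first label here and keep every row,
        # so return an empty frame with the same columns instead.
        return frame.iloc[0:0]
    # idxmax gives the label of the first True, i.e. the first non-zero value.
    return frame.loc[nonzero.idxmax():]

# Assumed reconstruction of the frame from the question.
df = pd.DataFrame(
    {"value": [0, 0, 1577, 1715, 1787, 0, 1787]},
    index=pd.date_range("2019-01-01", periods=7, freq="MS"),
)
result = drop_leading_zeros(df)

# An all-zero frame to exercise the guard.
all_zero = pd.DataFrame(
    {"value": [0, 0, 0]},
    index=pd.date_range("2019-01-01", periods=3, freq="MS"),
)
empty = drop_leading_zeros(all_zero)
```

With the guard in place, an all-zero series yields an empty frame, which matches what the cumprod version returns in the same situation.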
