Reputation: 8291
From the given table of inflation rates below, I want to obtain the countries with negative inflation rates for two consecutive years.
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Argentina NaN 33.0 56.0 -22.0 15.0 8.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Lebanon NaN 2.0 -36.0 2826.0 82.0 39.0
Sudan NaN 96.0 -19.0 220.0 19.0 34.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0
I have seen some solutions in SO that use list comprehension or loops. I wonder if this task is possible without them.
I attempted to convert the dataframe into 1s and 0s, in which 1.0 indicates a negative inflation.
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 0.0 1.0 1.0 0.0 0.0
Argentina NaN 0.0 0.0 1.0 0.0 0.0
Suriname NaN 1.0 1.0 0.0 0.0 0.0
Zimbabwe NaN 0.0 0.0 0.0 1.0 1.0
Lebanon NaN 0.0 1.0 0.0 0.0 0.0
Sudan NaN 0.0 1.0 0.0 0.0 0.0
Venezuela NaN 0.0 1.0 1.0 0.0 1.0
However, I am stuck at this point. I tried to use np.prod
function but this returns 0 if at least one column as 0.0 data.
Any ideas about how to solve this problem?
Upvotes: 1
Views: 162
Reputation: 765
def function1(ss:pd.Series):
ss.loc['col1']=ss.rolling(2).apply(lambda ss1:ss1.iloc[0]<0 and ss1.iloc[1]<0).eq(1).any()
return ss
df1.set_index('Country').apply(function1,axis=1).query('col1')
out
2017 2018 2019 2020 2021 2022 col1
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0 True
Suriname NaN -68.0 -37.0 695.0 56.0 13.0 True
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0 True
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0 True
Upvotes: 1
Reputation: 260580
You can first set an integer mask for the negative values (1 means negative). Then compute a rolling min on the axis 1, of the min is 1 all values are. This is generalizable to any number of consecutive columns.
N = 2
m1 = df.lt(0).astype(int)
m2 = m.rolling(N, axis=1).min().eq(1).any(axis=1)
df[m2]
Output:
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0
NB. One needs to work with integers as rolling is currently limited to numeric types
Alternative with a single mask for N=2
m = df.lt(0)
df[(m&m.shift(axis=1)).any(axis=1)]
Upvotes: 3
Reputation: 323226
Try with rolling
out = df[df.le(0).T.rolling(window=2).sum().ge(2).any()]
Out[15]:
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0
Upvotes: 2
Reputation: 1137
Given your dataframe, this is what would work for me:
So it would look like:
df.set_index('Country',inplace=True)
df['TwoNegatives'] = ((df.values < 0) & ((df.shift(axis=1)).values <0)).any(axis=1)
Upvotes: 2
Reputation: 93161
Try this:
match = (df.lt(0) & df.shift(axis=1).lt(0)).any(axis=1)
df[match]
How it works:
df.lt(0)
: current year inflation is less than 0df.shift(axis=1).lt(0)
: previous year inflation is less than 0.any(axis=1)
: any such occurrence in the country.Upvotes: 2