Create Column with different conditions

Question

I have the following extract from a larger Data Frame:

df = pd.DataFrame({'Tipe': ['DM','DM','DM','DS','DS','DI','DI','DM','DI','DS','DM','DM','DM','DM','DI','DM','DM','DS','DS','DS','DM']})

The objective is to create a column "D" where, for an index greater than "X", the program must search the column "Type" for the last n values and count only those that are identified as 'DM'.

for example, if "x" and "n" were 5, I expect something like this:

    Tipe             D
0    DM             NaN
1    DM             NaN
2    DM             NaN
3    DS             NaN
4    DS             NaN
5    DI             NaN
6    DI             2.0
7    DM             1.0
8    DI             1.0
9    DS             1.0
10   DM             1.0
11   DM             2.0
12   DM             3.0
13   DM             3.0
14   DI             4.0
15   DM             4.0
16   DM             4.0
17   DS             4.0
18   DS             3.0
19   DS             2.0
20   DM             2.0

I try with ".tail" but the existing 'DM' values throughout the entire column, not just the ones in the last n values.

Chris · Accepted Answer

Use pandas.Series.shift and rolling with where:

x = 5
n = 5

s = df["Type"].eq("DM").shift().rolling(n).sum()
df["D"] = s.where(s.index > x)

Output:

   Type    D
0    DM  NaN
1    DM  NaN
2    DM  NaN
3    DS  NaN
4    DS  NaN
5    DI  NaN
6    DI  2.0
7    DM  1.0
8    DI  1.0
9    DS  1.0
10   DM  1.0
11   DM  2.0
12   DM  3.0
13   DM  3.0
14   DI  4.0
15   DM  4.0
16   DM  4.0
17   DS  4.0
18   DS  3.0
19   DS  2.0
20   DM  2.0

Create Column with different conditions

Answers (1)

Related Questions