steff

Reputation: 906

Vectorised solution to change values in a pandas DataFrame column based on a boolean condition

My df looks like this:

              code       date type  strike  settlement
0   CBT_21_G2015_S 2015-01-02    C   126.2    1.343750
1   CBT_21_G2015_S 2015-01-02    P   131.7    4.359375
2   CBT_21_G2015_S 2015-01-02    C   102.5   24.671875
3   CBT_21_G2015_S 2015-01-02    P   110.5    0.015625
4   CBT_21_G2015_S 2015-01-02    P   101.2    0.015625
5   CBT_21_G2015_S 2015-01-02    C   140.5    0.015625

I am looking to change the strikes to quarter strikes with something like this: if df['strike'] % 0.25 != 0, add 0.05.

Desired output:

              code       date type  strike  settlement
0   CBT_21_G2015_S 2015-01-02    C  126.25    1.343750
1   CBT_21_G2015_S 2015-01-02    P  131.75    4.359375
2   CBT_21_G2015_S 2015-01-02    C  102.50   24.671875
3   CBT_21_G2015_S 2015-01-02    P  110.50    0.015625
4   CBT_21_G2015_S 2015-01-02    P  101.25    0.015625
5   CBT_21_G2015_S 2015-01-02    C  140.50    0.015625

What's the easiest/fastest way to do this, please?

Upvotes: 1

Views: 27

Answers (2)

BENY

Reputation: 323226

You need np.where: keep the strikes that are already multiples of 0.25 and add 0.05 to the rest.

df.strike = np.where(df.strike % 0.25 == 0, df.strike, df.strike + 0.05)
df

             code        date type  strike  settlement
0  CBT_21_G2015_S  2015-01-02    C  126.25    1.343750
1  CBT_21_G2015_S  2015-01-02    P  131.75    4.359375
2  CBT_21_G2015_S  2015-01-02    C  102.50   24.671875
3  CBT_21_G2015_S  2015-01-02    P  110.50    0.015625
4  CBT_21_G2015_S  2015-01-02    P  101.25    0.015625
5  CBT_21_G2015_S  2015-01-02    C  140.50    0.015625
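For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds only the strike column from the question (the other columns are left out for brevity), and the final np.isclose check is my own addition, since adding 0.05 in floating point is not guaranteed to land exactly on a quarter.

```python
import numpy as np
import pandas as pd

# Strike values copied from the question; other columns omitted for brevity.
df = pd.DataFrame({"strike": [126.2, 131.7, 102.5, 110.5, 101.2, 140.5]})

# True where the strike is already a multiple of 0.25.
on_quarter = df["strike"] % 0.25 == 0

# Keep quarter strikes as they are, add 0.05 to everything else.
df["strike"] = np.where(on_quarter, df["strike"], df["strike"] + 0.05)

# Added check (not in the original answer): compare with a tolerance,
# because 126.2 + 0.05 may differ from 126.25 by a rounding error.
expected = [126.25, 131.75, 102.5, 110.5, 101.25, 140.5]
all_close = bool(np.isclose(df["strike"], expected).all())
print(df)
print(all_close)
```

Note that this only gives quarter strikes when the off-quarter values end in .2 or .7, as they do in the sample data.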

Upvotes: 3

cs95

Reputation: 402373

A little mathemagic with np.ceil: multiply by 4, round up to the next integer, then divide by 4.

df['strike'] = np.ceil(df.strike * 4) / 4

df
             code        date type  strike  settlement
0  CBT_21_G2015_S  2015-01-02    C  126.25    1.343750
1  CBT_21_G2015_S  2015-01-02    P  131.75    4.359375
2  CBT_21_G2015_S  2015-01-02    C  102.50   24.671875
3  CBT_21_G2015_S  2015-01-02    P  110.50    0.015625
4  CBT_21_G2015_S  2015-01-02    P  101.25    0.015625
5  CBT_21_G2015_S  2015-01-02    C  140.50    0.015625

It's really fast, as the timings on 600,000 rows show.

df = pd.concat([df] * 100000, ignore_index=True)

%timeit np.ceil(df.strike.values * 4) / 4
5.1 ms ± 60.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
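One thing worth knowing before picking between np.ceil and an "add 0.05" rule: they agree on the sample data but not in general. The sketch below compares the two on a small Series; the value 126.1 is a hypothetical strike I made up for illustration and does not appear in the question.

```python
import numpy as np
import pandas as pd

# 126.2 and 102.5 are from the question; 126.1 is a hypothetical extra value.
strikes = pd.Series([126.2, 126.1, 102.5])

# Rule from the question: add a fixed 0.05 when not already on a quarter.
fixed_step = np.where(strikes % 0.25 == 0, strikes, strikes + 0.05)

# np.ceil approach: round up to the next multiple of 0.25.
rounded_up = np.ceil(strikes * 4) / 4

print(fixed_step)
print(rounded_up.tolist())
```

With np.ceil, 126.1 goes up to 126.25, while the fixed step gives roughly 126.15, which is not a quarter strike. If the data can contain strikes other than .2 and .7, rounding up is probably the safer choice, assuming "next quarter above" is what you want.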

Upvotes: 3
