Reputation: 73
def encoder(expiry_dt, expiry1, expiry2, expiry3):
    if expiry_dt == expiry1:
        return 1
    if expiry_dt == expiry2:
        return 2
    if expiry_dt == expiry3:
        return 3

FINAL['Expiry_encodings'] = FINAL.apply(
    lambda row: '{0}_{1}_{2}_{3}_{4}'.format(
        row['SYMBOL'], row['INSTRUMENT'], row['STRIKE_PR'], row['OPTION_TYP'],
        encoder(row['EXPIRY_DT'], row['Expiry1'], row['Expiry2'], row['Expiry3'])),
    axis=1)
The code runs fine, but it's too slow. Is there a faster way to achieve the same result?
Upvotes: 0
Views: 106
Reputation: 12221
Give the following a try:
FINAL['expiry_number'] = '0'
for c in '321':
    FINAL.loc[FINAL['EXPIRY_DT'] == FINAL['Expiry'+c], 'expiry_number'] = c
FINAL['Expiry_encodings'] = FINAL['SYMBOL'].astype(str) + '_' + \
FINAL['INSTRUMENT'].astype(str) + '_' + FINAL['STRIKE_PR'].astype(str) + \
'_' + FINAL['OPTION_TYP'].astype(str) + '_' + FINAL['expiry_number']
This avoids the three if statements, provides a default value ('0') when none of the if statements evaluates to True, and avoids all the string formatting. On top of that, it avoids the apply method with a lambda, which is the main source of the slowdown because it calls a Python function once per row.
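As a quick sanity check, here is a minimal sketch that runs both versions on a toy frame and compares the results. The column names come from the question; the values are made up, and every row is assumed to match one of the three expiries (for a non-matching row the original would produce 'None' where the loop version produces '0').

```python
import pandas as pd

# Toy data: made-up values, column names taken from the question.
FINAL = pd.DataFrame({
    'SYMBOL': ['NIFTY', 'NIFTY', 'BANKNIFTY'],
    'INSTRUMENT': ['OPTIDX', 'OPTIDX', 'OPTIDX'],
    'STRIKE_PR': [11000, 11100, 30000],
    'OPTION_TYP': ['CE', 'PE', 'CE'],
    'EXPIRY_DT': ['2020-01-30', '2020-02-27', '2020-03-26'],
    'Expiry1': ['2020-01-30'] * 3,
    'Expiry2': ['2020-02-27'] * 3,
    'Expiry3': ['2020-03-26'] * 3,
})

def encoder(expiry_dt, expiry1, expiry2, expiry3):
    if expiry_dt == expiry1:
        return 1
    if expiry_dt == expiry2:
        return 2
    if expiry_dt == expiry3:
        return 3

# Row-wise version from the question.
slow = FINAL.apply(
    lambda row: '{0}_{1}_{2}_{3}_{4}'.format(
        row['SYMBOL'], row['INSTRUMENT'], row['STRIKE_PR'], row['OPTION_TYP'],
        encoder(row['EXPIRY_DT'], row['Expiry1'], row['Expiry2'], row['Expiry3'])),
    axis=1)

# Column-wise version: three boolean masks, then string concatenation.
expiry_number = pd.Series('0', index=FINAL.index)
for c in '321':
    expiry_number.loc[FINAL['EXPIRY_DT'] == FINAL['Expiry' + c]] = c

fast = (FINAL['SYMBOL'].astype(str) + '_' + FINAL['INSTRUMENT'].astype(str) + '_'
        + FINAL['STRIKE_PR'].astype(str) + '_' + FINAL['OPTION_TYP'].astype(str)
        + '_' + expiry_number)

print(fast.tolist())
print(slow.tolist() == fast.tolist())
```

On a real frame, timing both versions (for example with `%timeit` in IPython) should show how much the column-wise version gains; the speedup depends on the number of rows.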
Note on the '321' order: this reflects the order in which the if-chain in the original code is evaluated. 'Expiry3' has the lowest priority, so in my code it is assigned first and then overridden by #2 and finally by #1. The original if-chain short-circuits at #1, which gives it the highest priority. For example, if 'Expiry1' and 'Expiry3' have the same value (equal to 'EXPIRY_DT'), the assigned value is 1, not 3.
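The same priority rule can also be written without the reversed loop. The sketch below uses `numpy.select`, which is not what the code above does but a swapped-in alternative: it picks the first condition that is True, so listing the conditions in '123' order mirrors the original if-chain directly. The toy values are made up; row 0 is the tie case where 'Expiry1' and 'Expiry3' are equal.

```python
import numpy as np
import pandas as pd

# Toy data: made-up values. Row 0 matches both Expiry1 and Expiry3,
# row 3 matches none of the three.
FINAL = pd.DataFrame({
    'EXPIRY_DT': ['2020-01-30', '2020-02-27', '2020-03-26', '2020-04-30'],
    'Expiry1': ['2020-01-30'] * 4,
    'Expiry2': ['2020-02-27'] * 4,
    'Expiry3': ['2020-01-30', '2020-03-26', '2020-03-26', '2020-03-26'],
})

# Reversed-loop version: later assignments override earlier ones,
# so '1' is written last and wins ties.
loop_result = pd.Series('0', index=FINAL.index)
for c in '321':
    loop_result.loc[FINAL['EXPIRY_DT'] == FINAL['Expiry' + c]] = c

# np.select version: the first True condition wins, so the natural
# '123' order gives the same priority as the if-chain.
conditions = [(FINAL['EXPIRY_DT'] == FINAL['Expiry' + c]).to_numpy() for c in '123']
select_result = np.select(conditions, ['1', '2', '3'], default='0')

print(loop_result.tolist())
print(select_result.tolist())
```

Both approaches should give '1' for the tie in row 0 and the default '0' for row 3; which one reads better is a matter of taste.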
Upvotes: 3
Reputation: 73
Same solution as above, with a slight change:
FINAL['expiry_number'] = '0'
for c in '321':
    FINAL.loc[FINAL['EXPIRY_DT'] == FINAL['Expiry'+c], 'expiry_number'] = c
FINAL['Expiry_encodings'] = FINAL['SYMBOL'].astype(str) + '_' + \
FINAL['INSTRUMENT'].astype(str) + '_' + FINAL['STRIKE_PR'].astype(str) + \
'_' + FINAL['OPTION_TYP'].astype(str) +' _' + FINAL['expiry_number']
Upvotes: 0