Reputation: 877
I have a list called reassembly organized like this:
['AFLT', 228468.0, 'B'],
['TATN', 1108.6, 'B'],
['TATN', 4434.4, 'B'],
['MOEX', 3480.0, 'S'],
['YNDX', 5934.0, 'B'],
['MTSS', 36003.0, 'S'],
['SBERP', 33837.1, 'S'],
['SBERP', 1780.8, 'S'],
['MTSS', 3273.0, 'S'],
['AFLT', 124356.0, 'B'],
['AFLT', 20244.0, 'B'],
['MGNT', 72990.0, 'B'],
['NLMK', 230917.0, 'B'],
['NLMK', 156050.0, 'B'],
['NLMK', 31220.0, 'B'],
['MGNT', 36450.0, 'S'],
['TCSG', 14045.2, 'S'],
['TCSG', 2160.4, 'S'],
There is also a dictionary called medians with this data:
{'TATNP': 11968.05, 'TCSG': 8647.2, 'TRNFP': 130250.0, 'UPRO': 7941.0, 'VTBR': 3828.28, 'YNDX': 17660.4}
The keys of the dictionary correspond to the first element of each list entry ('AFLT', 'VTBR' and others).
I convert reassembly to a pandas DataFrame:
df = pd.DataFrame(reassembly, columns=['ticker','vol','operation'])
Now I want to do something like this:
df = df[df['vol'] < median['ticker']]
I mean that if vol is below the median for that row's ticker, the script should ignore the row.
Please help me write this code correctly.
Upvotes: 3
Views: 255
Reputation: 1
I suggest solving this with a list comprehension and passing the result to pandas instead.
import pandas as pd
reassembly = [['AFLT', 228468.0, 'B'],
['TATN', 1108.6, 'B'],
['TATN', 4434.4, 'B'],
['MOEX', 3480.0, 'S'],
['YNDX', 5934.0, 'B'],
['MTSS', 36003.0, 'S'],
['SBERP', 33837.1, 'S'],
['SBERP', 1780.8, 'S'],
['MTSS', 3273.0, 'S'],
['AFLT', 124356.0, 'B'],
['AFLT', 20244.0, 'B'],
['MGNT', 72990.0, 'B'],
['NLMK', 230917.0, 'B'],
['NLMK', 156050.0, 'B'],
['NLMK', 31220.0, 'B'],
['MGNT', 36450.0, 'S'],
['TCSG', 14045.2, 'S'],
['TCSG', 2160.4, 'S']]
medians = {'TATNP': 11968.05, 'TCSG': 8647.2, 'TRNFP': 130250.0, 'UPRO': 7941.0, 'VTBR': 3828.28, 'YNDX': 17660.4}
ready_for_panda = [x for x in reassembly if x[0] in medians and x[1] > medians[x[0]]]
pd.DataFrame(ready_for_panda, columns=["ticker", "vol", "operation"])
  ticker      vol operation
0   TCSG  14045.2         S
I have assumed that you want to filter out every element of reassembly whose volume is below the median for its ticker. Rows whose ticker has no entry in medians are dropped as well.
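If tickers that have no entry in medians should be kept rather than dropped, one possible variation is to look up the threshold with dict.get and a default of negative infinity. This is only a sketch on a subset of the question's data; the NO_MEDIAN name and the keep-unknown-tickers policy are assumptions, not something the question specifies.

```python
# Subset of the rows and medians from the question.
reassembly = [
    ['AFLT', 228468.0, 'B'],
    ['YNDX', 5934.0, 'B'],
    ['TCSG', 14045.2, 'S'],
    ['TCSG', 2160.4, 'S'],
]
medians = {'TCSG': 8647.2, 'YNDX': 17660.4}

# Assumption: a ticker without a median is kept, so its threshold is
# negative infinity, which every volume exceeds.
NO_MEDIAN = float('-inf')

kept = [row for row in reassembly
        if row[1] > medians.get(row[0], NO_MEDIAN)]
```

With this data, kept holds the AFLT row (no median available) and the larger of the two TCSG rows.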
Upvotes: 0
Reputation: 150765
You want map:
high_volumes = df[df['vol'] > df['ticker'].map(medians)]
# do stuff with the high-volume transactions
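To see what the comparison is working with, here is a small self-contained sketch on a subset of the question's data. The per_row_median name is only illustrative.

```python
import pandas as pd

# Subset of the rows and medians from the question.
reassembly = [
    ['AFLT', 228468.0, 'B'],
    ['YNDX', 5934.0, 'B'],
    ['TCSG', 14045.2, 'S'],
    ['TCSG', 2160.4, 'S'],
]
medians = {'TCSG': 8647.2, 'YNDX': 17660.4}

df = pd.DataFrame(reassembly, columns=['ticker', 'vol', 'operation'])

# map looks up each row's ticker in the dict and returns a Series
# aligned with df; AFLT has no entry here, so its value is NaN.
per_row_median = df['ticker'].map(medians)

# Element-wise comparison of each volume with its own ticker's median.
high_volumes = df[df['vol'] > per_row_median]
```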
Note that the above silently drops rows whose ticker is missing from medians: map returns NaN for them, and any comparison with NaN is False. If you want to keep the tickers that are not in medians instead:
meds = df['ticker'].map(medians)
high_volumes = df[(df['vol'] > meds) | meds.isna()]
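As a quick check of the snippet above on a subset of the question's data, the sketch below also shows an alternative spelling that fills missing medians with negative infinity. The fillna variant and the keep_missing and alt names are my own assumptions, not something the question asked for.

```python
import pandas as pd

# Subset of the rows and medians from the question.
reassembly = [
    ['AFLT', 228468.0, 'B'],
    ['YNDX', 5934.0, 'B'],
    ['TCSG', 14045.2, 'S'],
    ['TCSG', 2160.4, 'S'],
]
medians = {'TCSG': 8647.2, 'YNDX': 17660.4}

df = pd.DataFrame(reassembly, columns=['ticker', 'vol', 'operation'])
meds = df['ticker'].map(medians)

# Keep rows above their median, plus rows that have no median at all.
keep_missing = df[(df['vol'] > meds) | meds.isna()]

# Alternative sketch: treat a missing median as -inf so that every
# volume passes the comparison.
alt = df[df['vol'] > meds.fillna(float('-inf'))]
```

Both selections keep the AFLT row and the TCSG row with volume 14045.2.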
Upvotes: 4