Reputation: 87
I need to fetch the rows per code (where serial is 2) where value of new (where serial is 2) >= value of sold (where serial is 1)
and
value of new (where serial is 2) < value of sold (where serial is 2)
Formula: new[2] >= sold[1] and new[2] < sold[2]
, [2]/[1] are the serial (and thats why I tried to put the index on serial).
Sample Dataframe (data): data.set_index('serial')
code | serial | date | new | sold | |
---|---|---|---|---|---|
0 | 20113 | 1 | 2019-01-30 | 1344.4 | 1344.95 |
1 | 20113 | 2 | 2019-02-30 | 1345.35 | 1344.9 |
2 | 20113 | 3 | 2019-03-30 | 1347.4 | 1345.35 |
3 | 20286 | 1 | 2019-01-30 | 590.55 | 590.15 |
4 | 20286 | 2 | 2019-02-30 | 590.15 | 590.55 |
5 | 20286 | 3 | 2019-03-30 | 590.4 | 590.15 |
It gives an error:
File "C:#########\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc return self._engine.get_loc(casted_key) File "pandas_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: True
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "F:\python_projects#########\testing.py", line 12, in status = globals()[f"strategy_{row[0]}"].pre_check_condition_panda(5,now)#(row[2],now) File "F:\python_projects#########\strategy.py", line 88, in pre_check_condition_panda data1 = data[(data.open[2] >= data.close[1]) & (data.open[2] > data.close[2])] File "C:#########\Python39\lib\site-packages\pandas\core\frame.py", line 3455, in getitem indexer = self.columns.get_loc(key) File "C:#########\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc raise KeyError(key) from err KeyError: True
My code:
data1 = data[(data.new[2] >= data.sold[1]) & (data.new[2] < data.sold[2])]
print(data1)
Expected Result:
code | serial | date | new | sold | |
---|---|---|---|---|---|
4 | 20286 | 2 | 2019-02-30 | 590.15 | 590.55 |
Upvotes: 0
Views: 778
Reputation: 262304
Using stack/unstack you can get the "code" that matches the requirement, but the condition to get only serial == 2 is unclear:
df2 = df.set_index(['code', 'serial']).unstack()
(df2.loc[df2[('new', 2)].ge(df2[('sold', 1)])
&df2[('new', 2)].lt(df2[('sold', 2)])
]
.stack(level=1)
.reset_index()
)
output:
code serial date new sold
0 20286 1 2019-01-30 590.55 590.15
1 20286 2 2019-02-30 590.15 590.55
2 20286 3 2019-03-30 590.40 590.15
If you only want serial == 2, you can add .query('serial == 2')
:
df2 = df.set_index(['code', 'serial']).unstack()
(df2.loc[df2[('new', 2)].ge(df2[('sold', 1)])
&df2[('new', 2)].lt(df2[('sold', 2)])
]
.stack(level=1)
.reset_index()
.query('serial == 2')
)
output:
code serial date new sold
1 20286 2 2019-02-30 590.15 590.55
Upvotes: 2