Reputation: 1993
I am trying to remove a dataframes rows if the index is between two timestamps.
I have the following dataframe
pandas_data[c['key']]
Price
Timestamp
2021-08-03 14:02:44.284 19.8753
2021-08-03 14:02:44.994 19.8756
2021-08-03 14:02:45.703 19.8750
2021-08-03 14:02:46.721 19.8716
2021-08-03 14:02:47.453 19.8709
2021-08-03 14:02:48.241 19.8707
2021-08-03 14:02:48.698 19.8710
2021-08-03 14:02:49.735 19.8711
2021-08-03 14:02:52.180 19.8711
2021-08-03 14:02:52.960 19.8711
2021-08-03 14:02:55.177 19.8715
2021-08-03 14:02:56.209 19.8711
2021-08-03 14:02:56.698 19.8695
2021-08-03 14:02:57.724 19.8696
2021-08-03 14:02:59.221 19.8694
2021-08-03 14:03:00.188 19.8687
and a second dataframe
df0
Price
open high low close
Timestamp
2021-08-03 14:02:40 19.8753 19.8756 19.8753 19.8756
2021-08-03 14:02:45 19.8750 19.8750 19.8707 19.8711
2021-08-03 14:02:50 19.8711 19.8711 19.8711 19.8711
2021-08-03 14:02:55 19.8715 19.8715 19.8694 19.8694
2021-08-03 14:03:00 19.8687 19.8687 19.8687 19.8687
I get the last two timestamps from df0 as follows:
oldest_timestamp = pd.to_datetime(df0.iloc[0].name.timestamp(), unit='s')
second_oldest_timestamp = oldest_timestamp = d.to_datetime(df0.iloc[1].name.timestamp(), unit='s')
and I want to remove any rows in pandas_data[c['key']] where the timestamp falls in between the oldest_timestamp and the second_oldest_timestamp.
I can do this and it seems to work for a single comparison:
pandas_data[c['key']].loc[pandas_data[c['key']].index > oldest_timestamp]
Price
Timestamp
2021-08-03 14:02:45.703 19.8750
2021-08-03 14:02:46.721 19.8716
2021-08-03 14:02:47.453 19.8709
2021-08-03 14:02:48.241 19.8707
2021-08-03 14:02:48.698 19.8710
2021-08-03 14:02:49.735 19.8711
2021-08-03 14:02:52.180 19.8711
2021-08-03 14:02:52.960 19.8711
2021-08-03 14:02:55.177 19.8715
2021-08-03 14:02:56.209 19.8711
2021-08-03 14:02:56.698 19.8695
2021-08-03 14:02:57.724 19.8696
2021-08-03 14:02:59.221 19.8694
2021-08-03 14:03:00.188 19.8687
But when I try to do multiple comparisons I am getting an error.
pandas_data[c['key']].loc[pandas_data[c['key']].index > oldest_timestamp and pandas_data[c['key']].index < second_oldest_timestamp]
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Upvotes: 2
Views: 71
Reputation: 405765
Use the &
operator to combine conditions inside loc
.
pandas_data[c['key']].loc[(pandas_data[c['key']].index > oldest_timestamp) & (pandas_data[c['key']].index < second_oldest_timestamp)]
Note also that you have to put the two separate conditions in parentheses because &
has higher precedence than <
and >
.
See: Logical operators for boolean indexing in Pandas
Upvotes: 1