Delete specific rows based on conditions across rows in a pandas DataFrame

Question

I want to delete specific rows from a pandas DataFrame based on conditions that involve other rows.

For example, several currency pairs can share the same timestamp, and I want to keep only one pair per timestamp, chosen by its second currency.

This is the priority: EUR, USD, GBP, CHF.

currency  timebuy              buyprice
CNHUSD    2021-01-05 08:30:00  0,00005073
CNHGBP    2021-01-05 08:30:00  1,588
ZARGBP    2021-01-07 05:15:00  0,2727
ZARUSD    2021-01-07 05:15:00  300
ZAREUR    2021-01-07 13:00:00  0,1936
ZARCHF    2021-01-07 13:00:00  0,0000052
JPYCHF    2021-01-13 06:00:00  0,0002222
JPYUSD    2021-01-13 06:00:00  8
JPYGBP    2021-01-13 06:00:00  8

should become:

currency  timebuy              buyprice
CNHUSD    2021-01-05 08:30:00  0,00005073
ZAREUR    2021-01-07 13:00:00  0,1936
JPYUSD    2021-01-13 06:00:00  8

Peter Leimbigler · Accepted Answer

Using groupby and reindex:

# Hard-code your priority for the second currency in each pair
pri = ['EUR', 'USD', 'GBP', 'CHF']

# Create a new column for the second currency of each pair
df['2ndcurr'] = df['currency'].str[-3:]


# Move time and second currency into a MultiIndex,
# reorder the inner level (1) of that MultiIndex to match the priority list,
# group by the outer level (0, timebuy),
# take the first row of each group,
# and reset timebuy from the index back into its own column

(df.set_index(['timebuy', '2ndcurr'])
   .reindex(pri, level=1)
   .groupby(level=0)
   .first()
   .reset_index())

               timebuy currency    buyprice
0  2021-01-05 08:30:00   CNHUSD  0,00005073
1  2021-01-07 05:15:00   ZARUSD         300
2  2021-01-07 13:00:00   ZAREUR      0,1936
3  2021-01-13 06:00:00   JPYUSD           8
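
For comparison, here is a minimal sketch of a different way to get the same rows, using an ordered Categorical with sort_values and drop_duplicates instead of reindex. This is not the method from the answer above. The sample frame is rebuilt from the question's data, the helper column name `_pri` is made up for illustration, and timebuy is assumed to be an ISO-formatted string (so string order equals time order).

```python
import pandas as pd

# Sample data copied from the question; buyprice is kept as text on purpose
df = pd.DataFrame({
    "currency": ["CNHUSD", "CNHGBP", "ZARGBP", "ZARUSD", "ZAREUR",
                 "ZARCHF", "JPYCHF", "JPYUSD", "JPYGBP"],
    "timebuy": ["2021-01-05 08:30:00", "2021-01-05 08:30:00",
                "2021-01-07 05:15:00", "2021-01-07 05:15:00",
                "2021-01-07 13:00:00", "2021-01-07 13:00:00",
                "2021-01-13 06:00:00", "2021-01-13 06:00:00",
                "2021-01-13 06:00:00"],
    "buyprice": ["0,00005073", "1,588", "0,2727", "300", "0,1936",
                 "0,0000052", "0,0002222", "8", "8"],
})

# Priority for the second currency of each pair, highest first
pri = ["EUR", "USD", "GBP", "CHF"]

# Ordered categorical: sorting follows the order of `pri`, not the alphabet.
# `_pri` is a hypothetical helper column, dropped again below.
second = pd.Categorical(df["currency"].str[-3:], categories=pri, ordered=True)

result = (
    df.assign(_pri=second)
      .sort_values(["timebuy", "_pri"])          # best-priority row first per time
      .drop_duplicates("timebuy", keep="first")  # keep one row per timestamp
      .drop(columns="_pri")
      .reset_index(drop=True)
)

print(result)
```

In this sketch a pair whose second currency is not in pri gets a missing category and sorts last within its timestamp, so it is kept only when no prioritized pair exists for that time.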
