Reputation: 25
I would like to delete the trailing rows of a dataframe based on a condition. For example, I have the following columns:
  | voltage | Current
0 | 10      | 0.8
1 | 12      | 0.7
3 | 14      | 0.6
4 | 0       | -0.0001
5 | 10      | 0.8
6 | 12      | 0.7
7 | 14      | 0.6
8 | 0       | -0.0001
9 | 0       | -0.0001
In this case, I want to remove the last two rows (where voltage = 0) without removing the 4th row, which also has voltage = 0.
I was thinking about a while loop that starts from the end of the dataframe and deletes every row with voltage = 0, stopping as soon as it reaches a row where voltage is different from 0.
Any ideas?
Upvotes: 1
Views: 518
Reputation: 118
There are some good answers already, but at the time of posting each of them seemed to have a potential flaw. Here is a solution that is a simple, fast filter.
Here we are saying:
Keep every row whose index is at most INDEX_FILTER, regardless of value, to capture any rows where you want to keep voltage 0.
Then filter out every remaining row (index greater than INDEX_FILTER) where voltage == 0.
import pandas as pd
data = [[10, 0.8], [0, -0.0001], [12, 0.7], [0, -0.0001], [0, -0.0001]]
df = pd.DataFrame(columns=['voltage', 'Current'], data=data)
INDEX_FILTER = 1 # remember, index starts at 0 and we include this in our filter
df = df.loc[((df.index <= INDEX_FILTER) | (df['voltage'] != 0))]
I have used something similar in my electrical engineering labs to filter out many op amp circuits :).
Thanks
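As a rough illustration of the filter above, here is a minimal sketch that applies it to the sample data from the question. Setting INDEX_FILTER to 3 is an assumption: it is the label of the last zero-voltage row that should survive. The names keep_head, keep_nonzero and filtered are made up for this example.

```python
import pandas as pd

# Sample data copied from the question (default 0..8 index).
df = pd.DataFrame({
    "voltage": [10, 12, 14, 0, 10, 12, 14, 0, 0],
    "Current": [0.8, 0.7, 0.6, -0.0001, 0.8, 0.7, 0.6, -0.0001, -0.0001],
})

# Assumption: rows up to and including label 3 are always kept.
INDEX_FILTER = 3

keep_head = df.index <= INDEX_FILTER   # rows kept regardless of value
keep_nonzero = df["voltage"] != 0      # rows kept because voltage is nonzero
filtered = df.loc[keep_head | keep_nonzero]
print(filtered)
```

Note that this drops every zero-voltage row after INDEX_FILTER, not only the trailing ones. It gives the wanted result here because the only zeros after row 3 are at the end.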
Upvotes: 2
Reputation: 662
Your idea of a reverse loop works just fine.
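values = [
    (10, 0.8),
    (12, 0.7),
    (14, 0.6),
    (0, -0.0001),
    (10, 0.8),
    (12, 0.7),
    (14, 0.6),
    (0, -0.0001),
    (0, -0.0001),
]
print(values)
# Walk backwards from the last element and delete zero-voltage entries,
# stopping at the first entry whose voltage is not 0.
for i in range(len(values) - 1, -1, -1):
    if values[i][0] == 0:
        del values[i]
    else:
        break
print(values)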
Output:
[(10, 0.8), (12, 0.7), (14, 0.6), (0, -0.0001), (10, 0.8), (12, 0.7), (14, 0.6), (0, -0.0001), (0, -0.0001)]
[(10, 0.8), (12, 0.7), (14, 0.6), (0, -0.0001), (10, 0.8), (12, 0.7), (14, 0.6)]
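The same reverse loop could probably be applied to a dataframe directly. The following is only a sketch under that assumption: it counts the trailing zero-voltage rows and slices them off once, rather than deleting row by row. The names n_trailing and trimmed are invented for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "voltage": [10, 12, 14, 0, 10, 12, 14, 0, 0],
    "Current": [0.8, 0.7, 0.6, -0.0001, 0.8, 0.7, 0.6, -0.0001, -0.0001],
})

# Walk backwards from the last row and count the trailing zero-voltage rows.
n_trailing = 0
for v in reversed(df["voltage"].tolist()):
    if v != 0:
        break
    n_trailing += 1

# Keep everything except the counted trailing rows (positional slice).
trimmed = df.iloc[: len(df) - n_trailing]
print(trimmed)
```

Slicing once at the end avoids rebuilding the dataframe on every iteration, which is what repeated row deletion would do.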
Upvotes: 2
Reputation: 195438
Try:
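import numpy as np
# Zeros become NaN; bfill fills all but the trailing ones, which notna() then drops.
df = df[df["voltage"].replace(0, np.nan).bfill().notna()]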
print(df)
Prints:
voltage Current
0 10 0.8000
1 12 0.7000
2 14 0.6000
3 0 -0.0001
4 10 0.8000
5 12 0.7000
6 14 0.6000
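To see why the one-liner leaves the zero in the middle alone, it may help to split it into steps. This is a sketch on the question's sample data; the intermediate names zeros_as_nan, filled, mask and trimmed are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "voltage": [10, 12, 14, 0, 10, 12, 14, 0, 0],
    "Current": [0.8, 0.7, 0.6, -0.0001, 0.8, 0.7, 0.6, -0.0001, -0.0001],
})

# Step 1: mark every zero voltage as missing.
zeros_as_nan = df["voltage"].replace(0, np.nan)

# Step 2: back-fill, so each NaN takes the next valid value below it.
# A zero in the middle gets filled; trailing zeros have nothing below
# them and stay NaN.
filled = zeros_as_nan.bfill()

# Step 3: keep only the rows that ended up with a value.
mask = filled.notna()
trimmed = df[mask]
print(trimmed)
```

The original voltage column is untouched, because the mask is built from a temporary copy and only used for row selection.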
Upvotes: 2
Reputation: 50034
You can use .loc to reverse the dataframe and then grab the .idxmax() where voltage > 0. Then use .loc once again to keep everything up to that idxmax():
import pandas as pd
df = pd.DataFrame({"voltage":[10,12,14,0,10,12,14,0,0], "Current":[1,2,3,4,5,6,7,8,9]})
df.loc[:(df.loc[::-1]['voltage'] > 0).idxmax()]
+-------+---------+---------+
| index | voltage | Current |
+-------+---------+---------+
| 0 | 10 | 1 |
| 1 | 12 | 2 |
| 2 | 14 | 3 |
| 3 | 0 | 4 |
| 4 | 10 | 5 |
| 5 | 12 | 6 |
| 6 | 14 | 7 |
+-------+---------+---------+
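One possible way to package the idxmax() approach is a small helper. This is a sketch, not a drop-in answer: the function name drop_trailing_zeros is hypothetical, it compares with != 0 instead of > 0 so that negative voltages count as nonzero, and it assumes the index labels are unique. It also handles the case where every voltage is 0, where idxmax() alone would return the last label and keep all rows.

```python
import pandas as pd

def drop_trailing_zeros(df, column="voltage"):
    """Return df without the trailing rows where `column` is 0.

    Hypothetical helper; assumes unique index labels.
    """
    nonzero = df[column] != 0
    if not nonzero.any():
        # Every row is zero, so all of them count as trailing.
        return df.iloc[0:0]
    # In the reversed mask, the first True is the last nonzero row.
    last_label = nonzero[::-1].idxmax()
    # .loc slicing is label-based and includes last_label itself.
    return df.loc[:last_label]

df = pd.DataFrame({"voltage": [10, 12, 14, 0, 10, 12, 14, 0, 0],
                   "Current": [1, 2, 3, 4, 5, 6, 7, 8, 9]})
print(drop_trailing_zeros(df))
```

Because .loc slices by label, the helper keeps working on a dataframe whose index is not 0..n-1, as long as the labels are unique.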
Upvotes: 2