Reputation: 119
I am trying to delete all rows in a Pandas data frame that don't have a zero in either of two columns. My data frame is indexed from 0 to 620. This is my code:
for index in range(0, 621):
    if((zeroes[index,1] != 0) and (zeroes[index,3] != 0)):
        del(zeroes[index,])
I keep getting a key error: KeyError: (0, 1)
My instructor suggested I change the range to test to see if I have bad lines in my data frame. I did. I checked the tail of my dataframe and then changed the range to (616, 621). Then I got the key error: (616, 1).
Does anyone know what is wrong with my code or why I am getting a key error?
This code also produces a key error of (0,1):
index = 0
while (index < 621):
    if((zeroes[index,1] != 0) and (zeroes[index,3] != 0)):
        del(zeroes[index,])
    index = index + 1
Upvotes: 1
Views: 1863
Reputation: 164773
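Don't use a manual for loop here. Your error probably occurs because df[x, y] effectively calls df.__getitem__((x, y)), which pandas treats as a lookup of a single column labelled with the tuple (x, y), not as a row/column position. No such column exists, hence KeyError: (0, 1).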
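To see where the key error comes from, here is a minimal sketch on a made-up frame. The column names and values are hypothetical, chosen only for illustration; the point is the difference between df[x, y] and positional access through .iloc.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; names and values are invented.
zeroes = pd.DataFrame({"a": [1, 2], "b": [0, 5], "c": [7, 8], "d": [3, 0]})

# zeroes[0, 1] asks for one column whose label is the tuple (0, 1).
# The frame has no such column, so the lookup raises KeyError.
try:
    zeroes[0, 1]
    raised = False
except KeyError:
    raised = True

# Positional access to a single cell goes through .iloc instead.
cell = zeroes.iloc[0, 1]  # row 0, column at position 1 ("b")
```

Even with .iloc in place, deleting rows one at a time inside a loop is slow and error-prone, which is why a mask-based filter is usually preferred.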
Instead, use vectorised operations and Boolean indexing. For example, to keep only the rows where column 1 or column 3 equals 0 (that is, to drop the rows where neither is zero):
df = df[df.iloc[:, [1, 3]].eq(0).any(axis=1)]
This works because eq(0) creates a dataframe of Boolean values indicating equality to zero, and any(axis=1) reduces each row to True if it contains any True value. Indexing df with the resulting Boolean series keeps only those rows.
Older code often uses the positional shorthand any(1); recent pandas releases expect the axis to be passed by keyword, so prefer any(axis=1), or any(axis='columns') for even more clarity. See the docs for pd.DataFrame.any for more details.
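As a rough illustration of the filter on a small frame (the data and column names below are invented; only the positions 1 and 3 match the question):

```python
import pandas as pd

# Hypothetical data: four columns, with positions 1 and 3 being the ones tested.
df = pd.DataFrame({
    "a": [10, 11, 12, 13],
    "b": [0, 5, 2, 0],
    "c": [1, 1, 1, 1],
    "d": [4, 0, 9, 0],
})

# True for each row where column 1 or column 3 holds a zero.
mask = df.iloc[:, [1, 3]].eq(0).any(axis=1)

# Keep only the flagged rows; the row with b=2 and d=9 is dropped.
filtered = df[mask]
```

The surviving rows keep their original index labels. If a fresh 0..n index is wanted afterwards, filtered.reset_index(drop=True) should provide one.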
Upvotes: 1