Dropping rows in python pandas

Question

I have the following DataFrame:

      2010-01-03  2010-01-04  2010-01-05  2010-01-06  2010-01-07
1560    0.002624    0.004992   -0.011085   -0.007508   -0.007508
14      0.000000   -0.000978   -0.016960   -0.016960   -0.009106
2920    0.000000    0.018150    0.018150    0.002648    0.025379
1502    0.000000    0.018150    0.011648    0.005963    0.005963
78      0.000000    0.018150    0.014873    0.014873    0.007564

I have a list of indices corresponding to rows that I want to drop from my DataFrame. For simplicity, assume my list is idx_to_drop = [1560, 1502], which corresponds to the 1st and 4th rows in the DataFrame above.

I tried to run df2 = df.drop(df.index[idx_to_drop]), but df.index[...] expects row positions rather than the index labels I would pass to .ix. I have many more rows and columns, and looking up row positions with the where() function takes a while.

How can I drop the rows whose index labels are in my list?

Brian Pendleton · Accepted Answer

I would tackle this by breaking the problem into two pieces. Mask what you are looking for, then sub-select the inverse.

Short answer:

df[~df.index.isin([1560, 1502])]

Explanation with runnable example, using isin:

import pandas as pd
df = pd.DataFrame({'index': [1, 2, 3, 1500, 1501], 
                   'vals': [1, 2, 3, 4, 5]}).set_index('index')

bad_rows = [1500, 1501]
mask = df.index.isin(bad_rows)
print(mask)
[False False False  True  True]

df[mask]

       vals
index      
1500      4
1501      5

print(~mask)
[ True  True  True False False]

df[~mask]

       vals
index      
1         1
2         2
3         3

You can see that we've identified the two bad rows, and then we choose all the rows that aren't the bad ones. Our mask is for the bad rows, and all other rows are anything that is not in the mask (~mask).
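For comparison, here is a minimal sketch that applies the same mask to the labels from the question and sets it next to a label-based df.drop call. The frame below is an abbreviated stand-in for the question's data (two of its columns only), and the names kept_mask, kept_drop and kept_lenient, as well as the label 9999, are made up for illustration. Note that df.drop is a different technique from the mask approach above, not part of the original answer.

```python
import pandas as pd

# Abbreviated stand-in for the DataFrame in the question (two columns only).
df = pd.DataFrame(
    {
        "2010-01-03": [0.002624, 0.000000, 0.000000, 0.000000, 0.000000],
        "2010-01-04": [0.004992, -0.000978, 0.018150, 0.018150, 0.018150],
    },
    index=[1560, 14, 2920, 1502, 78],
)
idx_to_drop = [1560, 1502]

# Mask approach from the answer: keep rows whose label is NOT in the list.
kept_mask = df[~df.index.isin(idx_to_drop)]

# Alternative: DataFrame.drop takes index labels directly (not positions).
kept_drop = df.drop(idx_to_drop)

# drop raises KeyError for labels that are absent unless errors="ignore";
# 9999 is a hypothetical label that does not exist in the frame.
kept_lenient = df.drop(idx_to_drop + [9999], errors="ignore")

print(kept_mask.index.tolist())
```

One practical difference: isin silently ignores labels that are not in the index, whereas drop raises a KeyError for them by default, so the mask form is the more forgiving of the two when the list may contain stale labels.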
