8-Bit Borges

Reputation: 10043

Pandas - delete rows with all zeros and delete columns using index of deleted rows

My df is a square sparse matrix, with the shape: (3862, 3862)

      0   1    ... 3862
0     0.0 0.0      0.0
1     0.0 0.0      0.0
...
3862  0.0 0.0  ... 0.0

And I need to get rid of rows with only 0.0, like so:

df.loc[~(df==0.0).all(axis=1)]

After which I end up with a matrix of shape (3819, 3862).


But I need to keep my matrix square.

So how do I keep track of the index labels of the deleted rows, and delete the columns with those same labels, in order to end up with shape (3819, 3819)?

Upvotes: 1

Views: 50

Answers (3)

Davide Laghi

Reputation: 126

import pandas as pd

# Simple df example
df = pd.DataFrame(
        [[0, 1, 2],
         [0, 0, 0],
         [3, 0, 0]])

# Condition for rows to be kept
condition = ~(df==0.0).all(axis=1)

# Get the index of rows that satisfy condition
idx2keep = df.loc[condition].index

# Retain the columns and rows with good index
df.loc[idx2keep, idx2keep]

OUTPUT:

    0   2
0   0   2
2   3   0
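To check that the result stays square, the same steps can be wrapped in a small helper. This is only a sketch: the name drop_zero_rows_and_cols is made up for illustration, and it assumes the row labels and column labels are identical, as in the question.

```python
import pandas as pd

def drop_zero_rows_and_cols(df):
    # Hypothetical helper: labels of rows that contain at least one non-zero value
    keep = df.index[~(df == 0.0).all(axis=1)]
    # Select those labels on both axes, so the frame stays square
    return df.loc[keep, keep]

df = pd.DataFrame(
        [[0, 1, 2],
         [0, 0, 0],
         [3, 0, 0]])

out = drop_zero_rows_and_cols(df)
print(out.shape)  # (2, 2)
```

Note that only all-zero rows are detected. A column that is all zeros but whose matching row is not will be kept.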

Upvotes: 1

BENY

Reputation: 323366

You can try collecting the labels of the all-zero rows and dropping them from both axes:

idx = df.index[(df==0.0).all(axis=1)]
out = df.drop(idx,axis=1).drop(idx)
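A small worked example of the two lines above, as a sketch. The string labels "a", "b", "c" are an assumption added here to show that drop works on labels rather than positions; the row and column labels must match for this to make sense.

```python
import pandas as pd

# Square frame whose rows and columns share the same (made-up) labels
labels = ["a", "b", "c"]
df = pd.DataFrame(
        [[0, 1, 2],
         [0, 0, 0],
         [3, 0, 0]],
        index=labels, columns=labels)

# Labels of the rows that are entirely zero
idx = df.index[(df == 0.0).all(axis=1)]
print(list(idx))  # ['b']

# Drop those labels from the columns first, then from the rows
out = df.drop(idx, axis=1).drop(idx)
print(out.shape)  # (2, 2)
```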

Upvotes: 2

piterbarg

Reputation: 8219

Use the transpose attribute, .T, so the same row labels can be dropped from the columns:

import pandas as pd

df = pd.DataFrame(columns = [0,1,2,3], data = [
        [0,0,0,0],
        [0,1,0,0],
        [0,0,0,1],
        [0,0,1,0],
        ])

empty_rows = (df==0.0).all(axis=1)
df.drop(df.index[empty_rows]).T.drop(df.index[empty_rows]).T

Output (both row 0 and column 0 got dropped):


    1   2   3
1   1   0   0
2   0   0   1
3   0   1   0

Alternatively, you can use iloc if your row and column index values line up with their positions:

df.iloc[df.index[~empty_rows],df.index[~empty_rows]]

This produces the same output.
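If the labels do not line up with positions, a boolean mask passed to iloc avoids relying on the label values. A minimal sketch, assuming made-up row labels "w" to "z" that differ from the column labels:

```python
import pandas as pd

# Row labels deliberately differ from the column labels (illustrative only)
df = pd.DataFrame(columns=[0, 1, 2, 3], index=["w", "x", "y", "z"], data=[
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
        ])

empty_rows = (df == 0.0).all(axis=1)

# iloc accepts a boolean NumPy array (not a boolean Series) on each axis
keep = ~empty_rows.to_numpy()
out = df.iloc[keep, keep]
print(out.shape)  # (3, 3)
```

The first row and the first column are removed by position, whatever their labels are.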

Upvotes: 0
