Reputation: 10043
My df is a square sparse matrix, with the shape:
(3862, 3862)
        0    1  ...  3862
0     0.0  0.0  ...   0.0
1     0.0  0.0  ...   0.0
...
3862  0.0  0.0  ...   0.0
And I need to get rid of rows with only 0.0, like so:
df.loc[~(df==0.0).all(axis=1)]
After which I end up with a matrix with shape (3819, 3862).
But I need to keep my matrix square.
So how do I keep track of the indexes of the deleted rows, and delete the columns with those indexes as well, in order to end up with shape (3819, 3819)?
Upvotes: 1
Views: 50
Reputation: 126
import pandas as pd

# Simple df example
df = pd.DataFrame(
    [[0, 1, 2],
     [0, 0, 0],
     [3, 0, 0]])
# Condition for rows to be kept
condition = ~(df==0.0).all(axis=1)
# Get the index of rows that satisfy condition
idx2keep = df.loc[condition].index
# Retain the columns and rows with good index
df.loc[idx2keep, idx2keep]
OUTPUT:
   0  2
0  0  2
2  3  0
Upvotes: 1
Reputation: 323366
You can try
idx = df.index[(df==0.0).all(axis=1)]
out = df.drop(idx,axis=1).drop(idx)
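As a quick check of the two-step drop above, here is a minimal sketch on a small frame. The 3x3 data is made up for illustration, and it assumes the column labels match the row labels, as they do in a square matrix with a default index.

```python
import pandas as pd

# Hypothetical 3x3 frame for illustration; row 1 is all zeros.
df = pd.DataFrame([[0, 1, 2],
                   [0, 0, 0],
                   [3, 0, 0]])

# Labels of the rows that contain only zeros
idx = df.index[(df == 0.0).all(axis=1)]

# Drop those labels from the columns first, then from the rows,
# so the result stays square
out = df.drop(idx, axis=1).drop(idx)

print(out.shape)
```

The remaining row and column labels are the original ones (here 0 and 2), so they can still be mapped back to the full matrix.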
Upvotes: 2
Reputation: 8219
Use the transpose operator, .T:
import pandas as pd

df = pd.DataFrame(columns=[0, 1, 2, 3], data=[
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
empty_rows = (df==0.0).all(axis=1)
df.drop(df.index[empty_rows]).T.drop(df.index[empty_rows]).T
Output (both row 0 and column 0 got dropped):
   1  2  3
1  1  0  0
2  0  0  1
3  0  1  0
Alternatively, you can use iloc if your row and column index values line up with their positions:
df.iloc[df.index[~empty_rows],df.index[~empty_rows]]
This produces the same output.
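If the labels do not line up with positions, one possible variant is to pass a plain boolean NumPy mask to iloc on both axes. This is a sketch under the assumption that the matrix is square and rows and columns are in the same order; the string labels 'a' to 'd' are hypothetical.

```python
import pandas as pd

# Hypothetical square frame with string labels, so the labels
# cannot double as integer positions.
labels = list("abcd")
df = pd.DataFrame(
    [[0, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    index=labels, columns=labels)

# Boolean NumPy array: True where the row is all zeros
empty_rows = (df == 0).all(axis=1).to_numpy()
keep = ~empty_rows

# Apply the same positional mask to rows and columns
out = df.iloc[keep, keep]

print(out.shape)
```

Because the mask is positional, it does not depend on what the index values are, only on rows and columns sharing the same order.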
Upvotes: 0