Rayyan Khan

Reputation: 117

How to select non-zero columns in a matrix in Python

Suppose I have data in the following format:

C0 C1 C2 C3 C4 C5 C6 C7 C8
0  0  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0  0
0  0  2  3  4  5  6  0  0
0  1  4  5  6  7  8  0  0
0  0  0  0  0  0  0  0  0

I want to select the non-zero columns (here C1, C2, C3, C4, C5, C6) in Python. Is there a command that gives me this result directly?

Upvotes: 3

Views: 2515

Answers (4)

With numpy:

import numpy as np 

a = np.array([[0,0,0,0,0,0,0,0,0],
              [0,0,0,0,0,0,0,0,0],
              [0,0,2,3,4,5,6,0,0],
              [0,1,4,5,6,7,8,0,0],
              [0,0,0,0,0,0,0,0,0]])

r = np.nonzero(np.any(a != 0, axis=0))[0]

>>> print(r)
[1 2 3 4 5 6]

If you need those as column names (C1, C2, C3, C4, C5, C6), use pandas:

import pandas as pd

columns = ['C0', 'C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8']
s = pd.DataFrame(data=a, columns=columns).any()  # True where a column has a non-zero value

s = s[s]  # keep only the True entries

>>> s
C1    True
C2    True
C3    True
C4    True
C5    True
C6    True
dtype: bool
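
If the goal is the non-zero columns themselves rather than just their labels, the same kind of mask can index the DataFrame directly. A minimal sketch, assuming the data from the question; the names df, mask and nonzero are only illustrative:

```python
import numpy as np
import pandas as pd

# Data from the question, labelled C0..C8
a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 2, 3, 4, 5, 6, 0, 0],
              [0, 1, 4, 5, 6, 7, 8, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0]])
columns = [f"C{i}" for i in range(a.shape[1])]
df = pd.DataFrame(data=a, columns=columns)

# True for every column that has at least one non-zero entry
mask = (df != 0).any(axis=0)

# Keep only those columns (all rows are kept)
nonzero = df.loc[:, mask]
print(list(nonzero.columns))  # ['C1', 'C2', 'C3', 'C4', 'C5', 'C6']
```

Comparing with 0 before calling any() makes the intent explicit and behaves the same for this integer data.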

Upvotes: 1

user3483203

Reputation: 51155

You can use any along with NumPy boolean indexing to select the columns that contain at least one non-zero value.

Setup

import numpy as np

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 2, 3, 4, 5, 6, 0, 0],
              [0, 1, 4, 5, 6, 7, 8, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=np.int64)

a[:, a.any(0)]

array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 2, 3, 4, 5, 6],
       [1, 4, 5, 6, 7, 8],
       [0, 0, 0, 0, 0, 0]], dtype=int64)
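
The same boolean mask can also pick out the matching labels, which is what the question asks for. A small sketch, assuming the labels are kept in a separate NumPy array; names, mask, selected and selected_names are hypothetical names introduced here:

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 2, 3, 4, 5, 6, 0, 0],
              [0, 1, 4, 5, 6, 7, 8, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Hypothetical labels C0..C8, one per column of a
names = np.array([f"C{i}" for i in range(a.shape[1])])

# One boolean per column: True if the column has any non-zero value
mask = a.any(axis=0)

selected = a[:, mask]         # the non-zero columns
selected_names = names[mask]  # their labels, in the same order
print(selected_names.tolist())  # ['C1', 'C2', 'C3', 'C4', 'C5', 'C6']
```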

Upvotes: 7

Green Cloak Guy

Reputation: 24691

Assume that your matrix is implemented as a list of lists, where the first index is the column and the second index is the row, so that:

matrix[3][2] == 3

Then you can use a list comprehension to get a list of only the columns in matrix that are not all zeroes:

nonzero_columns = [column for column in matrix if any(column)]
# any() returns True here if any element of column is non-zero
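
If the list of lists is stored row by row instead (matrix[row][column], which is how the question prints its data), the columns can be recovered with zip before applying the same test. A sketch under that assumption; rows, columns and nonzero_indices are illustrative names:

```python
# Row-major data from the question: rows[row][column]
rows = [[0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 2, 3, 4, 5, 6, 0, 0],
        [0, 1, 4, 5, 6, 7, 8, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0]]

# zip(*rows) transposes the data: each tuple is one column
columns = list(zip(*rows))

# Positions of the columns that contain at least one non-zero value
nonzero_indices = [i for i, column in enumerate(columns) if any(column)]

# The columns themselves, as lists
nonzero_columns = [list(columns[i]) for i in nonzero_indices]
print(nonzero_indices)  # [1, 2, 3, 4, 5, 6]
```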

Upvotes: 0

Maheshwar Kuchana

Reputation: 92

If you use a library like pandas, this is much simpler.

Take the mean of each column; the columns whose mean is greater than 0 are the ones you need.

Here is a piece of code that does that:

import pandas as pd

df = pd.read_csv("File Path")
a = df.mean(axis=0)  # column-wise mean, one value per column
for i in range(len(a)):  # loop over column positions
    if a.iloc[i] > 0:
        print(i)  # i is the position of a column with a positive mean
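
One caveat with the mean test: it presumably relies on the data being non-negative, since a column such as [-1, 1] averages to 0 even though it is not all zeros. A small sketch with made-up data (the frame and its column names x, y, z are hypothetical) comparing the mean test with a direct non-zero check:

```python
import pandas as pd

# Made-up data: x is all zeros, y has mixed signs, z is positive
df = pd.DataFrame({"x": [0, 0], "y": [-1, 1], "z": [2, 3]})

# Mean test: misses y because its values cancel out
by_mean = [name for name, m in df.mean(axis=0).items() if m > 0]

# Direct test: a column qualifies if any entry differs from 0
by_any = [name for name, hit in (df != 0).any(axis=0).items() if hit]

print(by_mean)  # ['z']
print(by_any)   # ['y', 'z']
```

For data like the question's, where every value is zero or positive, both checks select the same columns.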

Upvotes: 1
