Reputation: 1044
I have a dynamic multidimensional array whose number of columns can differ from run to run. The user is asked which columns to extract from a file with N columns, and a multidimensional array 'ARRAY_VALUES' is created from that selection.
import numpy as num
DIRECTORY = '/Users/user/Desktop/'
DATA_DIC_FILE = "%sOUTPUT_DIC/OUTPUT_DICTIONARIES.txt" %(DIRECTORY)
Choice = str( raw_input( 'Which columns do you want to use (separated by a comma):\t' ) ).split(',')
# Input something like this: 1,2,3,4
String_choice = []
PCA_INDEX = []
Columns = len(Choice)
PCA_INDEX = {}
# PCA_INDEX is a dictionary that the key is a string whose value is a float number.
PCA_INDEX['any_string'] = float_number # The dictionary has about 50 entries.
ARRAY_VALUES = [ [] for x in xrange( Columns) ]
""" Creating the N-dimensional array that will contain the data from the file """
""" This list has the form ARRAY_VALUES = [ [], [], [], ... ] for n-repetitions. """
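ARRAY_VALUES2 = [ [] for x in xrange( Columns ) ] # A separate list; 'ARRAY_VALUES2 = ARRAY_VALUES' would only alias the same object.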
lines = open( DATA_DIC_FILE ).readlines() #Read lines from the file
for i in range( 0, len(ARRAY_VALUES) ):
    ARRAY_VALUES[i] = num.loadtxt( fname = DATA_DIC_FILE, comments= '#', delimiter=',', usecols = [ int( PCA_INDEX[i] ) ], unpack = True )
""" This saves the lists from the file to the matrix 'ARRAY_VALUES' """
Now I have the multidimensional array in the form ARRAY_VALUES = [[], [], ...] for n columns.
I want to eliminate the corresponding rows from every column if any of the values is 'inf'. I tried the following code, but I don't know how to make it dynamic in the number of columns:
for j in range(0, len(ARRAY_VALUES)):
    for i in range(0, len(ARRAY_VALUES[0])):
        if num.isinf( ARRAY_VALUES[j][i] ) or num.isinf( ARRAY_VALUES[]): # This is where the problem is.
        # if num.isinf( ARRAY_VALUES[0][i] ) or num.isinf( ARRAY_VALUES[1][i] ) or ... or num.isinf( ARRAY_VALUES[last_column][i] ):
            continue
        else:
            ARRAY_VALUES2[j].append( ARRAY_VALUES[j][i] ) #Save the values into ARRAY_VALUES2.
Can anyone help me out and tell me how to do this part:
# if num.isinf( ARRAY_VALUES[0][i] ) or num.isinf( ARRAY_VALUES[1][i] ) or ... or num.isinf( ARRAY_VALUES[last_column][i] ):
for a multi-dimensional array with n-columns, so that the output is like the following:
ARRAY_VALUES = [ [8, 2, 3 , inf, 5],
[1, 9, inf, 4 , 5],
[7, 2, inf, inf, 6] ]
ARRAY_VALUES2 = [ [8, 2, 5],
[1, 9, 5],
[7, 2, 6] ]
--Thanks!
Upvotes: 0
Views: 140
Reputation: 113978
>>> import numpy as np
>>> a = np.array([[8, 2, 3 , np.inf, 5],[1, 9, np.inf, 4 , 5],[7, 2, np.inf, np.inf, 6]])
>>> col_mask = [i for i in range(a.shape[1]) if not any(a[:,i] == np.inf)]
>>> print(a[:,col_mask])
[[ 8.  2.  5.]
 [ 1.  9.  5.]
 [ 7.  2.  6.]]
First, use a numpy.array if you aren't already.
Then iterate over each column and check for any np.inf values to build a mask of allowable columns.
Lastly, use numpy's column indexing to access only the columns of interest.
As DSM points out, you can create the mask with numpy alone and avoid the list comprehension:
col_mask = np.isfinite(a).all(axis=0)
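To tie this back to the question, here is a minimal sketch that applies the numpy-only mask to a list of equal-length columns like ARRAY_VALUES. The helper name drop_nonfinite_columns is made up for illustration, and the sketch assumes every selected column has the same length. Note that np.isfinite also rejects -inf and NaN, which is slightly stricter than comparing against np.inf.

```python
import numpy as np


def drop_nonfinite_columns(columns):
    """Stack equal-length 1-D sequences as rows of a 2-D array and drop every
    position where at least one of them is inf, -inf, or NaN.

    Hypothetical helper for illustration; not part of numpy.
    """
    a = np.asarray(columns, dtype=float)    # shape: (n_selected, n_records)
    col_mask = np.isfinite(a).all(axis=0)   # True where every entry is finite
    return a[:, col_mask]                   # boolean indexing keeps those positions


# Same data as in the question: one inner list per selected file column.
array_values = [
    [8, 2, 3, np.inf, 5],
    [1, 9, np.inf, 4, 5],
    [7, 2, np.inf, np.inf, 6],
]

array_values2 = drop_nonfinite_columns(array_values)
print(array_values2)
```

Because the mask is computed across axis 0, it works for any number of selected columns without spelling out one isinf test per column.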
Upvotes: 3