Ohad

Reputation: 69

Compare each column to a different value efficiently

I have a numpy array of 4000*6 (6 columns), and a 1*6 numpy array of minimum values (made from another numpy array of 3000*6). I want to find everything in the large array that is below those values, with each column compared to its corresponding value.

I've tried the simple way, based on a one column solution I already had:

largearray=[float('nan') if x<min_values else x for x in largearray]

but sadly it didn't work :(.

I can write a for loop over each column and each value, but I was wondering if there is a faster, more elegant solution.

Thanks

EDIT: I'll try to rephrase: I have 6 values and 6 columns. I want to find the values in each column that are lower than the corresponding one of the 6 values. By array I mean a 2D array. Sorry if it wasn't clear.

Sorry, I'm still thinking in Matlab a bit.

This is my loop solution. It works on a DataFrame (df), not a numpy array. Still, is there a faster way?

a=0
for y in dfnames:
    df[y]=[float('nan') if x<minvalues[a] else x for x in df[y]]
    a=a+1

df is the large array (a DataFrame), dfnames are the names of the columns I'm interested in, and minvalues are the minimum values for each column. I'm assuming that dfnames and minvalues are in the same order. That's a bad assumption, but it works for now.

I'd appreciate any help making it better.

Upvotes: 0

Views: 135

Answers (2)

Some

Reputation: 514

I don't use numpy, so this may not be the commonly used solution, but this works:

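import numpy

largearray = numpy.array([[1, 2, 3], [3, 4, 5]])
min_values = numpy.array([3, 4, 5])
# Keep a row only if every entry is below its column's minimum; otherwise replace the row with nan.
largearray1 = [(float('nan') if not numpy.all(numpy.less(x, min_values)) else x) for x in largearray]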

The result should be a list: [array([1, 2, 3]), nan]

Upvotes: 0

Ben

Reputation: 9713

I think you just need

result = largearray.copy()
result[result < min_values] = np.nan

That is, result is a copy of largearray, but any element less than the corresponding column's entry in min_values is set to nan.
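As a small illustration of the comparison described above, here is a sketch with made-up data. The array contents and the 3*3 shape are assumptions chosen for illustration, not the asker's data. It also assumes a float array, since nan cannot be stored in an integer array.

```python
import numpy as np

# Hypothetical sample data: 3 rows, 3 columns (the question has 4000 x 6).
largearray = np.array([[1.0, 5.0, 9.0],
                       [4.0, 2.0, 7.0],
                       [3.0, 6.0, 1.0]])
# One threshold per column.
min_values = np.array([2.0, 3.0, 5.0])

result = largearray.copy()
# Broadcasting stretches min_values across every row, so each element
# is compared against the threshold of its own column.
mask = result < min_values
result[mask] = np.nan
```

Here mask is a boolean array with the same shape as largearray, so it can also be inspected or counted (for example with mask.sum(axis=0)) before anything is overwritten.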

If you want to blank entire rows only when all entries in the row are less than the corresponding column of min_values, then you want:

result = largearray.copy()
result[np.all(result < min_values, axis=1)] = np.nan
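The row-wise variant can be tried on a tiny example as well. Again, the data below is invented for illustration and a float array is assumed.

```python
import numpy as np

# Hypothetical sample data, not the asker's.
largearray = np.array([[1.0, 2.0, 3.0],
                       [4.0, 2.0, 7.0],
                       [3.0, 4.0, 5.0]])
min_values = np.array([3.0, 4.0, 5.0])

result = largearray.copy()
# True only for rows where every entry is below its column's threshold.
rows_below = np.all(result < min_values, axis=1)
# Indexing with a 1D boolean array selects whole rows.
result[rows_below] = np.nan
```

Only the first row is blanked here. The second row keeps its 2.0 even though that entry is below its threshold of 4.0, because the row's other entries are not all below theirs.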

Upvotes: 1
