Reputation: 438
My question is fairly straightforward, and there is probably a simple solution that I could not find. I first concatenate some arrays, and then I want to find the combination of the first and second columns (data_x1, data_x2) that gives the maximum value of y. There is one constraint, however: every x must lie between -20 and 20. If an x is greater than 20 or less than -20, I want to ignore that value.
I am also using this process inside a function, so I am looking for an approach that works for any number n of x arrays. To summarize: I want to find the optimal y for the constrained data_x1 and data_x2, that is, the maximum value in data_y among the rows whose data_x1 and data_x2 both satisfy the condition above (> -20 and < 20). In the dataset I am providing, the row that contains the maximum data_y violates the conditions I am imposing. For example, when I try:
y_max = data_y.max()
ID = data_y.argmax()
x1_max = data_x1[ID]
x2_max = data_x2[ID]
I get an x2_max that is outside the limit I want to impose.
Here is the dataset:
data_x1 = np.array([ 7.50581267e-01, 4.85312598e+00, -1.37035821e+00, -1.27199171e-03,
-1.61347902e+00, -2.47705419e+00, 1.54149227e-01, 2.96462913e+00,
6.39336584e+00, 2.22526551e+00, -3.13825557e+00, -4.53521105e+00,
3.66632759e+00, 6.95980810e-01, -2.08555389e+00, -3.42268057e+00,
-2.67733126e+00, 3.44611056e+00, -3.21242281e-01, -4.45557410e+00,
2.36357280e+00, 6.76143624e-01, -1.12756068e+00, 1.56898158e+00,
-2.73721604e+00, 2.63754963e+00, -4.52874687e+00, -2.96449234e+00,
-4.38481329e+00, -1.50384134e+00, -2.52651726e+00, -1.34210192e+00,
-2.39860669e-01, 1.40859346e+00, 1.85432054e-01, 5.01414945e-01,
4.55880766e+00, -1.05363585e+00, -4.62917198e+00, 2.59998127e+00,
5.25344447e+00, 3.07701918e-01, 2.26443850e+00, -2.22101423e+00,
3.02861897e-01, 1.65691179e+00, 8.81562566e-01, -1.87325712e+00,
4.63772521e+00, 2.64284088e-01, 2.53643045e+00, 9.63172795e-01,
2.36685850e+00, 2.54559573e+00, -9.02629613e-01, 2.24687227e+00,
6.22720302e+00, 5.74281188e+00, 2.03796010e+00, 4.80760151e+00])
data_x2 = np.array([-30.09938636, -28.83362992, -22.57425202, -23.14358566,
-33.59852454, -27.51674098, -30.7885103 , -25.90249062,
-22.08337401, -29.07237476, -23.04023689, -30.30583811,
-21.00309374, -29.99686696, -28.90991919, -26.62903318,
-31.72168863, -22.87107873, -30.729956 , -25.6780506 ,
-31.38729541, -27.19055645, -27.55148381, -28.68462801,
-26.05224771, -30.87040206, -22.95430799, -26.91256322,
-35.8942374 , -21.50322056, -26.16176442, -22.85920962,
-28.05071496, -34.30775127, -28.7790589 , -31.19811517,
-27.63535267, -28.96808588, -26.89286845, -32.81312953,
-27.35855807, -28.89865079, -25.61937868, -32.59681293,
-28.79511822, -22.54470727, -31.06309398, -25.30574423,
-23.52838694, -27.55017459, -24.55437336, -24.39558638,
-22.81063876, -28.62340189, -27.85680254, -25.10753673,
-29.75683744, -27.37575317, -29.61561727, -34.50702866])
data_y = np.array([2511661.54014723, 2506471.03096404, 2496512.87703406,
2500666.09145807, 2492786.42701569, 2513191.79101637,
2509515.1829362 , 2509970.89367091, 2481463.90896938,
2512505.17266542, 2496999.56860772, 2503950.65803291,
2481665.31885133, 2511985.61283778, 2512968.70827174,
2510599.791468 , 2502795.50006905, 2495342.7106848 ,
2509708.93248061, 2505715.61726413, 2504986.68522465,
2514933.54167635, 2514835.36052355, 2513916.01349115,
2510784.07070835, 2506718.40944214, 2493199.57962053,
2511925.51820147, 2466117.27254433, 2488828.88557003,
2511417.16267116, 2498364.67720219, 2515221.17931068,
2487471.40157182, 2514636.01655828, 2507757.43933369,
2508292.40113149, 2514000.76143246, 2507722.80700035,
2496671.63747914, 2505965.77313117, 2514453.85665244,
2510375.19913626, 2498705.33749204, 2514595.64115671,
2496054.0775116 , 2508144.96504256, 2509901.46588431,
2496183.49020786, 2515239.10310988, 2506016.58240813,
2507055.51518852, 2496891.65309883, 2512606.04865712,
2515010.58385846, 2508707.73815183, 2499240.78218084,
2504177.72406016, 2511686.21461949, 2477825.15797829])
I hope I managed to be succinct and precise despite the length of the explanation. I would really appreciate your help on this one!
Upvotes: 3
Views: 1443
Reputation: 409
Your data_x2 contains no values between -20 and 20.
If you can use pandas for this, you can do the following (the example uses -30 < x < 30):
import pandas as pd
df = pd.DataFrame({'x1': data_x1, 'x2': data_x2, 'y': data_y})
# inclusive='neither' requires pandas >= 1.3; older versions used inclusive=False
df = df[df['x1'].between(-30, 30, inclusive='neither') & df['x2'].between(-30, 30, inclusive='neither')]
df.sort_values(by='y', ascending=False).iloc[0]
Output:
x1 2.642841e-01
x2 -2.755017e+01
y 2.515239e+06
Name: 49, dtype: float64
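Here's a function for calculating this (again using pandas):
def func(x1, x2, y, lower_bound, upper_bound):
    df = pd.DataFrame({'x1': x1, 'x2': x2, 'y': y})
    # Keep only rows where both x1 and x2 lie strictly inside the bounds
    # (inclusive='neither' requires pandas >= 1.3; older versions used inclusive=False)
    df = df[df['x1'].between(lower_bound, upper_bound, inclusive='neither') & df['x2'].between(lower_bound, upper_bound, inclusive='neither')]
    df = df.sort_values(by='y', ascending=False)
    if len(df):
        return df['x1'].iloc[0], df['x2'].iloc[0]
    return None  # no row satisfies the constraint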
func(data_x1, data_x2, data_y, -20, 20)
Output:
None
func(data_x1, data_x2, data_y, -30, 30)
Output:
(0.264284088, -27.55017459)
EDIT:
Using a pandas DataFrame is nice because it treats your data as a matrix that you can slice based on values in multiple columns. The numpy solution below works, but it requires replacing values outside your range with np.nan in order to keep your indexes the same.
Here's a pure numpy solution, with help from "Removing nan in array at position from another numpy array":
data_x1 = np.where(np.logical_and(data_x1 > -30, data_x1 < 30), data_x1, np.nan)
data_x2 = np.where(np.logical_and(data_x2 > -30, data_x2 < 30), data_x2, np.nan)
mask = ~np.isnan(data_x1) & ~np.isnan(data_x2)
data_y = np.where(mask, data_y, np.nan)
idx = np.nanargmax(data_y)
data_x1[idx], data_x2[idx]
Output:
(0.264284088, -27.55017459)
That said, I agree with Evgeny and would use pandas DataFrames, as they are easier to follow IMO.
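Since the question asks for something that works for any number of x arrays, here is a minimal NumPy sketch of the same idea using a boolean mask instead of NaN replacement. The function name `constrained_argmax`, its keyword arguments, and the toy data are illustrative assumptions, not part of the original post. Rows that fail the constraint are given a y of -inf so they can never win the argmax, and the inputs are left unmodified.

```python
import numpy as np

def constrained_argmax(y, *xs, lower=-20.0, upper=20.0):
    """Return (index, y_value, x_values) for the largest y whose x values all
    lie strictly inside (lower, upper), or None when no row qualifies.
    Hypothetical helper for illustration only."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack(xs)  # shape (n_rows, n_x), one column per x array
    mask = np.all((X > lower) & (X < upper), axis=1)
    if not mask.any():
        return None
    # Rows that fail the constraint get -inf so they cannot be selected.
    idx = int(np.argmax(np.where(mask, y, -np.inf)))
    return idx, float(y[idx]), tuple(float(v) for v in X[idx])

# Toy data (made up for the example, not taken from the question)
x1 = np.array([1.0, 5.0, -3.0, 25.0])
x2 = np.array([-25.0, 10.0, 0.0, 2.0])
y = np.array([100.0, 40.0, 70.0, 90.0])

result = constrained_argmax(y, x1, x2)
print(result)
```

With the question's data, calling it as `constrained_argmax(data_y, data_x1, data_x2)` would return None for the default bounds, because no data_x2 value lies between -20 and 20.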
Upvotes: 3
Reputation: 4551
So, firstly, I concatenate some arrays,
Three vectors?
and then I want to find the combination of the first and second column (data_x1, data_x2) that returns me the maximum value of y.
Just one row?
However, there is one constraint, I want to limit all the x between -20 and 20, if it is more than 20 or less than -20, I want to ignore this value.
See question above.
What prevents you from filtering the dataframe by condition on x1 and x2 and finding the y max position afterwards?
I'd suggest wrapping the numpy vectors in a dataframe so that it is easier to work on them together.
Argmax on a dataframe is described in "Find row where values for column is maximal in a pandas DataFrame".
You may need to eliminate the x values that fail the constraint before finding the y. If you need several y values, sort by y.
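The filter-then-argmax suggestion can be sketched roughly as follows. The column names, bounds, and toy values are assumptions chosen for illustration; the comparison operators are used instead of `between` so the snippet does not depend on the pandas version.

```python
import pandas as pd

# Toy data (made up for the example, not taken from the question)
df = pd.DataFrame({
    "x1": [1.0, 5.0, -3.0, 25.0],
    "x2": [-25.0, 10.0, 0.0, 2.0],
    "y": [100.0, 40.0, 70.0, 90.0],
})

x_cols = ["x1", "x2"]  # extend this list for more x columns

# Keep only rows where every x column lies strictly between -20 and 20.
in_range = ((df[x_cols] > -20) & (df[x_cols] < 20)).all(axis=1)
feasible = df[in_range]

# Single best row, or None when nothing satisfies the constraint.
best = feasible.loc[feasible["y"].idxmax()] if not feasible.empty else None

# If several of the largest y values are needed, take the top k instead.
top2 = feasible.nlargest(2, "y")
print(best)
print(top2)
```

Here `best.name` is the original row index, so it can still be used to look up the matching entries in the source arrays.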
Upvotes: 0