Reputation: 5907
I am using the R programming language. Suppose I have the following data:
set.seed(123)
a = rnorm(100000,10,10)
b = rnorm(100000,10,10)
my_data = data.frame(a,b)
plot(my_data$a, my_data$b)
When you look at this data:
head(my_data)
a b
1 4.395244 12.649934
2 7.698225 28.307475
3 25.587083 9.406217
4 10.705084 9.467906
5 11.292877 14.379042
6 27.150650 23.374490
My question: Is there a way to find out whether this dataset contains a "global maximum point", i.e. a single row whose a value is the largest of all a values and whose b value is the largest of all b values?
For example, like the red point above. I know that in most cases, it is unlikely to find such a point, seeing that the point with the largest a-coordinate will not necessarily have the largest b-coordinate and vice versa:
#row with max value of "a"
which(my_data == max(my_data$a), arr.ind=TRUE)
row col
[1,] 23102 1
#row with max value of "b"
which(my_data == max(my_data$b), arr.ind=TRUE)
row col
[1,] 2071 2
#display row with max value of "a"
> my_data[23102,]
a b
23102 53.22815 4.500006
#display row with max value of "b"
> my_data[2071,]
a b
2071 15.85992 52.0609
As we can see, the row with the max value of "a" does not contain the max value of "b".
Thanks!
Note: In the real world, it is often impossible to find a "global maximum point", because in large datasets different rows contain the maxima of different columns. In the context of optimization problems, several points can often meet a weaker criterion and are all considered suitable. These points are called "non-dominated" and are said to lie on the "Pareto frontier" (the green line).
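To make the idea of non-dominated points concrete, here is a minimal base R sketch for the two-column case. The function name pareto_front and the toy data frame are hypothetical, not part of the original post. It assumes both columns are to be maximized, that the columns are named a and b, and that there are no NA values. Exact duplicate points are collapsed to a single row.

```r
# Hypothetical helper: non-dominated rows of a two-column data frame (a, b),
# assuming larger is better in both columns and there are no NAs.
pareto_front <- function(df) {
  # Sort by a descending, breaking ties by b descending.
  ord <- order(df$a, df$b, decreasing = TRUE)
  sorted <- df[ord, , drop = FALSE]
  # Largest b among all earlier rows (which all have a >= the current a).
  prev_best_b <- c(-Inf, head(cummax(sorted$b), -1))
  # A row is non-dominated only if its b beats every earlier b.
  sorted[sorted$b > prev_best_b, , drop = FALSE]
}

# Toy data: (2, 1) is dominated by (2, 2), and (0, 0) by everything else.
toy <- data.frame(a = c(1, 2, 3, 2, 0), b = c(3, 2, 1, 1, 0))
front <- pareto_front(toy)
front
```

If a "global maximum point" exists, it dominates every other row, so the frontier returned by this sketch would consist of that single row.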
Upvotes: 1
Views: 361
Reputation: 39154
Here is one way. Since the result has 0 rows, there is no point at which both a and b are at their maximum.
library(dplyr)
# keep only the rows in which every column equals its own maximum
my_data %>%
  filter(if_all(everything(), ~ .x == max(.x)))
# [1] a b
# <0 rows> (or 0-length row.names)
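For comparison, the same check can be sketched in base R without dplyr. The helper name global_max_rows and the two small data frames are hypothetical and only serve as illustration. The sketch assumes numeric columns with no NA values.

```r
# Hypothetical helper: rows that hold the maximum of every column at once.
global_max_rows <- function(df) {
  # One logical vector per column: TRUE where the value equals that column's max.
  is_max <- lapply(df, function(col) col == max(col))
  # A row qualifies only if it is TRUE in every column.
  keep <- Reduce(`&`, is_max)
  df[keep, , drop = FALSE]
}

# The maxima sit in different rows, so no row qualifies.
no_winner <- data.frame(a = c(1, 3, 2), b = c(5, 1, 4))
# Row 2 holds the maximum of both columns.
one_winner <- data.frame(a = c(1, 3, 2), b = c(2, 9, 4))

nrow(global_max_rows(no_winner))
global_max_rows(one_winner)
```

Calling global_max_rows(my_data) on the data from the question should then give an empty data frame, which matches the 0-row result above.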
Upvotes: 1