stats_noob

Reputation: 5907

R: Find the Maximum Point in a Dataset

I am using the R programming language. Suppose I have the following data:

set.seed(123)

a = rnorm(100000,10,10)
b = rnorm(100000,10,10)

my_data = data.frame(a,b)

plot(my_data$a, my_data$b)

enter image description here

When you look at this data:

 head(my_data)
          a         b
1  4.395244 12.649934
2  7.698225 28.307475
3 25.587083  9.406217
4 10.705084  9.467906
5 11.292877 14.379042
6 27.150650 23.374490

My Question: Is there a way to find out whether this dataset contains a "global maximum point", i.e. a single point whose a-coordinate and b-coordinate are both the largest in the dataset?

enter image description here

For example, like the red point above. I know that in most cases, it is unlikely to find such a point, seeing that the point with the largest a-coordinate will not necessarily have the largest b-coordinate and vice versa:

#row with max value of "a"
which(my_data == max(my_data$a), arr.ind=TRUE)

       row col
[1,] 23102   1

#row with max value of "b"
 which(my_data == max(my_data$b), arr.ind=TRUE)
      row col
[1,] 2071   2

#display row with max value of "a"
> my_data[23102,]
             a        b
23102 53.22815 4.500006

#display row with max value of "b"
> my_data[2071,]
            a       b
2071 15.85992 52.0609

As we can see, the row with the max value of "a" does not contain the max value of "b".

Thanks!

Note: In real-world data, it is often impossible to find a "global maximum point", because in big data the points "overlap" (different rows contain the maximums of different columns). In the context of optimization problems, several points can often meet this criterion and are all considered suitable. These points are called "non-dominated" and are said to lie on the "Pareto frontier" (the green line):

enter image description here
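As a rough illustration of the "non-dominated" idea for two columns, the sketch below sorts the points by a (descending) and keeps a point only if its b value beats every b seen so far. The function name `pareto_front` is hypothetical, the column names `a` and `b` are assumed to match the question's data, and exact duplicate points are collapsed to a single row. This is one possible approach under those assumptions, not the only way to compute the frontier.

```r
# Hypothetical helper: rows of df that are not dominated when both
# a and b are to be maximized. Assumes numeric columns named a and b.
pareto_front <- function(df) {
  # Largest a first; ties in a are broken by larger b first
  ord <- order(-df$a, -df$b)
  b_sorted <- df$b[ord]
  # Best b among all points that come earlier in the ordering
  # (-Inf for the first point, which has nothing before it)
  prev_best <- c(-Inf, head(cummax(b_sorted), -1))
  # A point survives only if its b is strictly larger than that
  keep <- b_sorted > prev_best
  df[ord[keep], , drop = FALSE]
}

toy <- data.frame(a = c(1, 4, 3, 2, 5), b = c(5, 2, 4, 1, 1))
front <- pareto_front(toy)
front  # rows 5, 2, 3 and 1; row 4 is dominated by row 3
```

If the returned frontier has exactly one row, that row is a "global maximum point" in the sense of the question; more than one row means no single point is largest in both coordinates.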

Upvotes: 1

Views: 361

Answers (1)

www

Reputation: 39154

Here is one way. Since the result has 0 rows, there is no point at which both a and b are at their maximum.

library(dplyr)

my_data %>%
  filter(if_all(everything(), ~ .x == max(.x)))
# [1] a b
# <0 rows> (or 0-length row.names)
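The same check can also be sketched in base R without dplyr. The helper name `find_global_max` is hypothetical. It assumes all columns are numeric and returns the indices of rows that hold the maximum of every column, or an empty vector if there is none.

```r
# Hypothetical helper: indices of rows that contain the column-wise
# maximum in every column (integer(0) when no such row exists).
find_global_max <- function(df) {
  # One logical vector per column: TRUE where the column hits its maximum
  hits <- lapply(df, function(col) col == max(col))
  # A row qualifies only if it is TRUE in every column
  which(Reduce(`&`, hits))
}

with_max <- data.frame(a = c(1, 5, 3), b = c(2, 9, 4))
find_global_max(with_max)     # row 2 holds both maxima

without_max <- data.frame(a = c(1, 5, 3), b = c(9, 2, 4))
find_global_max(without_max)  # integer(0): the maxima sit in different rows
```

Applied to `my_data` from the question, an empty result would correspond to the 0-row output shown above.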

Upvotes: 1
