Reputation: 39595
I am having problems using the max() function to extract the maximum value from a group of variables. The data.frame is shown below, and all variables are numeric:
setosa versicolor virginica
0 0.96969697 0.03030303
0 0.05128205 0.94871795
0 0.96969697 0.03030303
1 0.00000000 0.00000000
1 0.00000000 0.00000000
0 0.05128205 0.94871795
0 0.05128205 0.94871795
0 0.05128205 0.94871795
When I apply the max() function to this data frame and try to save the result in a new variable, this happens:
DF$max=max(DF$setosa,DF$versicolor,DF$virginica)
setosa versicolor virginica max
0 0.96969697 0.03030303 1
0 0.05128205 0.94871795 1
0 0.96969697 0.03030303 1
1 0.00000000 0.00000000 1
1 0.00000000 0.00000000 1
0 0.05128205 0.94871795 1
0 0.05128205 0.94871795 1
0 0.05128205 0.94871795 1
It seems the max() function rounds the maximum value. I can't find my mistake. Can you help me see what is wrong? Thanks.
Upvotes: 3
Views: 2658
Reputation: 18437
You can use pmax for that:
set.seed(123)
dat <- data.frame(matrix(rnorm(15), ncol = 3))
cbind(dat,
max = pmax(dat$X1, dat$X2, dat$X3)
)
## X1 X2 X3 max
## 1 0.42646 0.688640 -0.69471 0.68864
## 2 -0.29507 0.553918 -0.20792 0.55392
## 3 0.89513 -0.061912 -1.26540 0.89513
## 4 0.87813 -0.305963 2.16896 2.16896
## 5 0.82158 -0.380471 1.20796 1.20796
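If you have many columns, naming each one gets tedious. As a minimal sketch (not part of the original answer), do.call can pass every column of the data frame to pmax at once. This assumes all columns are numeric, and row_max is just an illustrative name.

```r
set.seed(123)
dat <- data.frame(matrix(rnorm(15), ncol = 3))

# Pass every column of dat to pmax as a separate argument.
# Assumes all columns are numeric.
row_max <- do.call(pmax, dat)

# Same result as naming the columns explicitly
all.equal(row_max, pmax(dat$X1, dat$X2, dat$X3))
```

If some values may be missing, pmax also accepts na.rm = TRUE.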
Upvotes: 3
Reputation: 2455
Your statement returns the maximum over all elements. Try apply instead:
R > dat$max <- apply(dat, 1, max)
R > dat
setosa versicolor virginica max
1 0 0.96969697 0.03030303 0.969697
2 0 0.05128205 0.94871795 0.948718
3 0 0.96969697 0.03030303 0.969697
4 1 0.00000000 0.00000000 1.000000
5 1 0.00000000 0.00000000 1.000000
6 0 0.05128205 0.94871795 0.948718
7 0 0.05128205 0.94871795 0.948718
8 0 0.05128205 0.94871795 0.948718
Upvotes: 3
Reputation: 59970
max returns a single value that is the maximum of all the arguments passed to it. The maximum across all three columns in your data is 1, which is what max returns:
max(df$setosa,df$versicolor,df$virginica)
[1] 1
You then assign it to a new column in your data.frame. Because R recycles values on assignment, the single value returned by max is repeated until it fills the vector being assigned to, which here has one element per row of your data frame.
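The recycling described above can be seen on a tiny example. This is only a sketch: the data frame and the name overall are made up for the demo and are not your data.

```r
# Small illustrative data frame (values are made up for the demo)
df <- data.frame(a = c(0, 1, 0), b = c(0.9, 0, 0.05))

# max() collapses all of its arguments to one number
overall <- max(df$a, df$b)
length(overall)   # 1

# Assigning a length-1 value to a column recycles it to every row
df$max <- overall
df$max            # 1 1 1
```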
If you want the max of each column, do
apply( df , 2 , max )
setosa versicolor virginica
1.000000 0.969697 0.948718
This applies the max function to each column and returns the result. If you want to know which row contains the max value for each column, use which.max like so:
apply( df , 2 , which.max )
setosa versicolor virginica
4 1 2
And if you want the max across the values in each row, set the MARGIN argument of apply to 1 (here MARGIN is set by positional matching rather than being named explicitly):
df$max <- apply( df , 1 , max )
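As a rough sketch of the row-wise version on the first four rows of the question's data: apply converts the data frame to a matrix first, so it can be safer to select the numeric columns explicitly. The species vector and the top column are my own illustrative names, not something from the question.

```r
df <- data.frame(
  setosa     = c(0, 0, 0, 1),
  versicolor = c(0.96969697, 0.05128205, 0.96969697, 0),
  virginica  = c(0.03030303, 0.94871795, 0.03030303, 0)
)

# Columns to compare (illustrative helper, not from the question)
species <- c("setosa", "versicolor", "virginica")

# MARGIN = 1 applies max to each row
df$max <- apply(df[species], 1, max)

# Name of the column holding each row's maximum
df$top <- species[apply(df[species], 1, which.max)]
df
```

Selecting the columns by name also means a second run does not pick up the max column that the first run added.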
Upvotes: 1