Reputation: 1362
Suppose I have a simple data frame
test_df <- data.frame(c(0,0,1,0,0,1,1,1,1,1),c(1,0,0,0,0,0,0,0,0,0))
I want to find which number (0 or 1) occurs most often for each row. In my example that is 1 for the first vector (6 occurrences) and 0 for the second one (9 occurrences).
I started with:
> sapply(test_df,table)
  c.0..0..1..0..0..1..1..1..1..1. c.1..0..0..0..0..0..0..0..0..0.
0                               4                               9
1                               6                               1
So far this looks fine. Then:
> sapply((sapply(test_df,table)),max)
[1] 4 6 9 1
I got lost here. Did I lose the associations (1 -> 6, 0 -> 9)? What I want returned is a vector with the "winner" for each vector: 1, 0, ...
1 for the first vector (6 occurrences)
0 for the second vector (9 occurrences)
...
Upvotes: 3
Views: 1117
Reputation: 26248
This can be done in one apply statement. However, it's unclear whether you want the most frequent value for each row or for each column, so here are both (using @akrun's cleaner data set). Each returns a vector showing the 'winner' (either 1 or 0) for each row/column.
## Data
test_df <- data.frame(v1 = c(0,0,1,0,0,1,1,1,1,1),
                      v2 = c(1,0,0,0,0,0,0,0,0,0),
                      v3 = c(1,0,0,0,0,0,0,0,0,1))
# v1 v2 v3
# 1 0 1 1
# 2 0 0 0
# 3 1 0 0
# 4 0 0 0
# 5 0 0 0
# 6 1 0 0
# 7 1 0 0
# 8 1 0 0
# 9 1 0 0
# 10 1 0 1
## Solution - For each row
## as.integer() turns the TRUE/FALSE comparison into 1/0; a tie counts as 0
apply(test_df, 1, function(x) as.integer(sum(x == 1) > sum(x == 0)))
## Result
# [1] 1 0 0 0 0 0 0 0 0 1
## Solution - For each column
## as.integer() turns the TRUE/FALSE comparison into 1/0; a tie counts as 0
apply(test_df, 2, function(x) as.integer(sum(x == 1) > sum(x == 0)))
## Result
# v1 v2 v3
# 1 0 0
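If the data could hold values other than 0 and 1, the comparison above no longer applies. A minimal sketch of a more general version, assuming the goal is simply the most frequent value, is to tabulate each vector and take the name of the largest count. The helper name most_common is hypothetical, and ties go to the smallest value because which.max returns the first maximum.

```r
test_df <- data.frame(v1 = c(0,0,1,0,0,1,1,1,1,1),
                      v2 = c(1,0,0,0,0,0,0,0,0,0),
                      v3 = c(1,0,0,0,0,0,0,0,0,1))

## Hypothetical helper: most frequent value of a numeric vector
most_common <- function(x) {
  counts <- table(x)                             # occurrences of each distinct value
  as.numeric(names(counts)[which.max(counts)])   # value with the first largest count
}

## Winner for each column
col_winners <- sapply(test_df, most_common)

## Winner for each row
row_winners <- apply(test_df, 1, most_common)
```

With the data set above this should agree with the apply results shown earlier, since only 0 and 1 occur and no row or column is tied.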
Upvotes: 2
Reputation: 886938
We can use apply with MARGIN=1 to extract the max value from each row of the sapply output.
frqCol <- sapply(test_df, table)
apply(frqCol, 1, max)
# 0 1
# 9 6
or use rowMaxs from matrixStats
library(matrixStats)
rowMaxs(frqCol)
#[1] 9 6
If we need the 'max' value per column
apply(frqCol, 2, max)
and
colMaxs(frqCol)
With the new example
test_df <- data.frame(v1 = c(0,0,1,0,0,1,1,1,1,1),
                      v2 = c(1,0,0,0,0,0,0,0,0,0),
                      v3 = c(1,0,0,0,0,0,0,0,0,1))
frqCol <- sapply(test_df, table)
apply(frqCol, 2, max)
#v1 v2 v3
#6 9 8
colMaxs(frqCol)
#[1] 6 9 8
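The calls above return the largest count per column, not the value that produced it. A possible sketch for recovering the 'winner' is to look up the row name of frqCol at the position of each column's maximum. This assumes sapply returned a matrix, which only happens when every column contains the same set of values; otherwise it returns a list and this lookup would not work as written. The name winner is only illustrative.

```r
test_df <- data.frame(v1 = c(0,0,1,0,0,1,1,1,1,1),
                      v2 = c(1,0,0,0,0,0,0,0,0,0),
                      v3 = c(1,0,0,0,0,0,0,0,0,1))

frqCol <- sapply(test_df, table)          # 2 x 3 matrix; row names are "0" and "1"

## Row index of the largest count in each column
idx <- apply(frqCol, 2, which.max)

## Map the index back to the value it stands for
winner <- as.numeric(rownames(frqCol)[idx])
names(winner) <- colnames(frqCol)
winner
```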
Upvotes: 2