Reputation: 3768
So I have an Excel file with the format:
StudID Score
2234 96
1056 20
9886 70
6542 65
4315 15
2234 40
6542 97
9886 56
4315 32
6542 54
and I'm trying to obtain the frequency of each StudID, so that I would get:
StudID Frequency
2234 2
1056 1
9886 2
4315 2
6542 3
Additionally, based on the above, I would like to get the StudID with the largest frequency, which in this case would be StudID 6542. This is what I tried:
stud <- read.csv("student.csv")
freq <- table(stud$StudID)
colnames(freq) <- c("StudID", "Frequency")
freq[which.max(freq)]
but I'm getting an error message saying:
Error in `colnames<-`(`*tmp*`, value = c("StudID", "Frequency")) :
  attempt to set 'colnames' on an object with less than two dimensions
Upvotes: 0
Views: 54
Reputation: 51592
The error is telling you that you are trying to assign colnames to an object with less than 2 dimensions. Indeed, if we inspect the structure of the result of table(), we can see that it is a 1-dimensional object, i.e.
str(table(df$V1))
'table' int [1:5(1d)] 1 2 2 3 2 # (1d = 1 dimension)
- attr(*, "dimnames")=List of 1
..$ : chr [1:5] "1056" "2234" "4315" "6542" ...
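The same point can be checked directly with dim() and names(). This is only a minimal sketch: the vector ids is a hypothetical stand-in for df$V1, copied from the DATA section of this answer.

```r
# 'ids' is a hypothetical stand-in for df$V1 (values copied from DATA)
ids <- c(2234L, 1056L, 9886L, 4315L, 2234L,
         6542L, 9886L, 4315L, 6542L, 6542L)
tab <- table(ids)

# A one-way table has a single dimension, so it has no columns to rename
length(dim(tab))

# The IDs are stored as the names of the counts, not as a column
names(tab)
```

Since the IDs live in names(tab), colnames<- has nothing to act on, which is why the assignment in the question fails.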
What you want to do is convert to a data frame first and then assign names, i.e.
dd <- setNames(as.data.frame(table(df$V1)), c('StudID', 'Freq'))
# StudID Freq
#1 1056 1
#2 2234 2
#3 4315 2
#4 6542 3
#5 9886 2
To extract the maximum, you can just do,
dd$StudID[which.max(dd$Freq)]
#[1] 6542
#Levels: 1056 2234 4315 6542 9886
DATA:
dput(df)
structure(list(V1 = c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L,
9886L, 4315L, 6542L, 6542L), V2 = c(96L, 20L, 70L, 15L, 40L,
97L, 56L, 32L, 54L, 13L)), class = "data.frame", row.names = c(NA,
-10L))
EDIT: To have it not return Levels as per your comment, we can simply convert to character, i.e.
dd$StudID <- as.character(dd$StudID)
dd$StudID[which.max(dd$Freq)]
#[1] "6542"
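As a possible alternative, the naming and the character conversion can be done in one step. This is a sketch rather than the only way to do it: it relies on the responseName and stringsAsFactors arguments of as.data.frame() for tables, rebuilds df from the DATA above so that it runs on its own, and uses the illustrative name dd2. The column name Frequency is taken from the question.

```r
# Same data as in DATA above, rebuilt so the example is self-contained
df <- data.frame(
  V1 = c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L, 9886L, 4315L, 6542L, 6542L),
  V2 = c(96L, 20L, 70L, 15L, 40L, 97L, 56L, 32L, 54L, 13L)
)

# Naming the argument to table() names the first column;
# responseName names the count column;
# stringsAsFactors = FALSE keeps StudID as character, so no Levels are printed
dd2 <- as.data.frame(table(StudID = df$V1),
                     responseName = "Frequency",
                     stringsAsFactors = FALSE)

dd2$StudID[which.max(dd2$Frequency)]
```

Note that StudID is then a character column; wrap it in as.integer() if you need numeric IDs again.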
Upvotes: 2
Reputation: 388982
In base R, we could use aggregate and then follow your which.max logic
freq <- aggregate(Score~StudID, df, length)
freq[which.max(freq$Score), ]
# StudID Score
#4 6542 3
Or if you want only the ID
freq$StudID[which.max(freq$Score)]
#[1] 6542
Or with table
names(which.max(table(df$StudID)))
#[1] "6542"
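One caveat with all of the which.max approaches: which.max returns only the first maximum, so if two IDs are tied for the highest frequency, only one of them is reported. A minimal sketch of how to get every tied ID, using a hypothetical vector ids in which an extra 9886 has been added to create a tie:

```r
# Hypothetical data: the question's IDs plus one extra 9886 to force a tie
ids <- c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L,
         9886L, 4315L, 6542L, 6542L, 9886L)
tab <- table(ids)

# Only the first of the tied IDs
first_max <- names(which.max(tab))

# Every ID whose count equals the maximum
all_max <- names(tab)[tab == max(tab)]
all_max
```

Here first_max holds a single ID while all_max holds both tied IDs, as character strings.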
Upvotes: 2