Maxxx

Reputation: 3768

Counting frequency

I have an Excel file with the following format:

StudID      Score

2234          96
1056          20
9886          70
6542          65
4315          15
2234          40
6542          97
9886          56
4315          32
6542          54

and I'm trying to obtain the frequency of each StudID, so that I would get:

StudID        Frequency

2234              2
1056              1
9886              2
4315              2
6542              3

Additionally, based on the above, I would like to get the StudID with the largest frequency, which in this case would be StudID 6542.

stud <- read.csv("student.csv")
freq <- table(stud$StudID)
colnames(freq) <- c("StudID", "Frequency")
freq[which.max(freq)]

but I'm getting an error message saying:

Error in `colnames<-`(`*tmp*`, value = c("StudID", "Frequency")) : attempt to set 'colnames' on an object with less than two dimensions

Upvotes: 0

Views: 54

Answers (2)

Sotos

Reputation: 51592

The error is telling you that you are trying to assign colnames to an object with fewer than 2 dimensions. Indeed, if we inspect the structure of the object returned by table(), we can see that it is 1-dimensional, i.e.

str(table(df$V1))
 'table' int [1:5(1d)] 1 2 2 3 2 # (1d = 1 dimension)
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:5] "1056" "2234" "4315" "6542" ...
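
A quick way to confirm this is sketched below. The vector `ids` is made up for illustration (it stands in for stud$StudID and is not read from the asker's file):

```r
# Hypothetical IDs standing in for stud$StudID
ids <- c(2234, 1056, 9886, 6542, 4315, 2234, 6542, 9886, 4315, 6542)
freq <- table(ids)

# A one-way table has a single dimension, so there are no columns to name
length(dim(freq))

# The IDs are stored as the names of the counts
names(freq)

# The single dimension can still be labelled through names(dimnames(.))
names(dimnames(freq)) <- "StudID"
freq
```

So colnames<- has nothing to act on here; the IDs are names of the counts, not a separate column.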

What you want to do is convert to a data frame first and then assign names, i.e.

dd <- setNames(as.data.frame(table(df$V1)), c('StudID', 'Freq'))

#  StudID Freq
#1   1056    1
#2   2234    2
#3   4315    2
#4   6542    3
#5   9886    2
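
If the count column should be called Frequency, as in the question, one possible variant is to name the table dimension and pass responseName to as.data.frame(). This is only a sketch: `df` is rebuilt from the IDs in the DATA section, and `dd2` is a name chosen here for illustration.

```r
# IDs copied from the DATA section of this answer
df <- data.frame(V1 = c(2234L, 1056L, 9886L, 4315L, 2234L,
                        6542L, 9886L, 4315L, 6542L, 6542L))

# Naming the argument to table() names the dimension, which becomes the
# first column; responseName sets the name of the count column
dd2 <- as.data.frame(table(StudID = df$V1), responseName = "Frequency")
names(dd2)
```

This gives the same data frame as the setNames() version, only with the column names set in one step.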

To extract the StudID with the maximum frequency, you can just do,

dd$StudID[which.max(dd$Freq)]
#[1] 6542
#Levels: 1056 2234 4315 6542 9886

DATA:

dput(df)
structure(list(V1 = c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L, 
9886L, 4315L, 6542L, 6542L), V2 = c(96L, 20L, 70L, 15L, 40L, 
97L, 56L, 32L, 54L, 13L)), class = "data.frame", row.names = c(NA, 
-10L))

EDIT: As per your comment, to stop it from printing the Levels line, we can simply convert the column to character, i.e.

dd$StudID <- as.character(dd$StudID)
dd$StudID[which.max(dd$Freq)]
#[1] 6542
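
If a number is wanted rather than a string, a possible variant is to go through character first, because as.integer() applied directly to a factor returns the internal level code instead of the label. A sketch, rebuilding `dd` from the IDs in the DATA section (`top` is a name chosen here for illustration):

```r
# IDs copied from the DATA section of this answer
df <- data.frame(V1 = c(2234L, 1056L, 9886L, 4315L, 2234L,
                        6542L, 9886L, 4315L, 6542L, 6542L))
dd <- setNames(as.data.frame(table(df$V1)), c("StudID", "Freq"))

# StudID is still a factor at this point
top <- dd$StudID[which.max(dd$Freq)]

# Converting via character keeps the label
as.integer(as.character(top))

# Converting the factor directly yields the position of the level instead
as.integer(top)
```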

Upvotes: 2

Ronak Shah

Reputation: 388982

In base R, we could use aggregate() and then follow your which.max() logic (here df is assumed to have columns named StudID and Score, as in your file):

freq <- aggregate(Score~StudID, df, length)
freq[which.max(freq$Score), ]

#  StudID Score
#4   6542     3

Or if you want only the ID

freq$StudID[which.max(freq$Score)]
#[1] 6542

Or with table

names(which.max(table(df$StudID)))
#[1] "6542"
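
One caveat with any of these: which.max() returns only the first maximum, so if two IDs share the highest count, only one of them is reported. A minimal sketch with made-up IDs where two values tie (`ids` and `tab` are names chosen here, not from the question):

```r
# Hypothetical data in which 2234 and 6542 both occur twice
ids <- c(2234, 1056, 2234, 6542, 6542)
tab <- table(ids)

# which.max() stops at the first maximum
names(which.max(tab))

# Comparing against max() keeps every tied ID
names(tab)[tab == max(tab)]
```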

Upvotes: 2
