Nauman Shahid

Reputation: 377

How do I sort an as.data.frame table?

To get the frequency of each value within a column, I used the following code:

s = table(students$Sport)
t = as.data.frame(s)
names(t)[1] = 'Sport'
t

Although this works, it gives me a massive list that is not sorted, such as this:

1            Football             20310
2            Rugby                80302
3            Tennis               5123
4            Swimming             73132
…            …                    … 
68           Basketball           90391

How would I go about sorting this table so that the most frequent sport is at the top? Also, is there a way to display only the top 5, rather than all 68 different sports?

Alternatively, is there a better way to approach this?

Any help would be appreciated!

Upvotes: 0

Views: 216

Answers (2)

Sumanth Rao

Reputation: 378

You can use the count function from the plyr package to get the frequency of each sport. It is more elegant than building a table and converting it to a data frame.

library(plyr)
d <- count(students, "Sport")  # students must already be a data frame; the result has columns Sport and freq

The order function sorts the output, and the minus sign makes it sort in descending order. [1:5] keeps the top 5 rows. You can remove it if you want all entries.

d[order(-d$freq)[1:5],]
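
If you would rather not load a package, the same result can be had in base R. This is only a sketch: the sample data below is made up to stand in for the asker's students data frame, and the column names Sport and Freq are my choice.

```r
# Hypothetical sample data standing in for the asker's `students` data frame
students <- data.frame(
  Sport = c(rep("Football", 3), rep("Rugby", 5), rep("Tennis", 1),
            rep("Swimming", 4), rep("Basketball", 6), rep("Golf", 2))
)

# Count each sport, then sort the counts from most to least frequent
counts <- sort(table(students$Sport), decreasing = TRUE)

# Convert to a data frame and keep only the first five rows
top5 <- head(as.data.frame(counts, stringsAsFactors = FALSE), 5)
names(top5) <- c("Sport", "Freq")
top5
```

Using head() rather than [1:5] also avoids the NA rows that [1:5] produces when there are fewer than five distinct values.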

Upvotes: 1

Fernando Silva

Reputation: 182

You can use dplyr and do it all in a single line. An example is below:

library(dplyr)
students = data.frame(sport = c(rep("Football", 200), 
                            rep("Rugby", 130), 
                            rep("Tennis", 100), 
                            rep("Swimming", 40),
                            rep("Basketball", 10),
                            rep("Baseball", 300),
                            rep("Gymnastics", 70)
                            )
                  )
students %>% group_by(sport) %>% summarise( n = length(sport)) %>% arrange(desc(n)) %>% top_n(5, n)

# A tibble: 5 x 2
  sport          n
  <fct>      <int>
1 Baseball     300
2 Football     200
3 Rugby        130
4 Tennis       100
5 Gymnastics    70
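
A shorter variant of the same pipeline, as a sketch: dplyr's own count function can do the grouping, counting, and sorting in one step when called with sort = TRUE. The sample data is again made up, and top5 is just a name chosen for illustration.

```r
library(dplyr)

# Hypothetical sample data; the column name `sport` follows the example above
students <- data.frame(
  sport = c(rep("Football", 200), rep("Rugby", 130), rep("Tennis", 100),
            rep("Swimming", 40), rep("Basketball", 10), rep("Baseball", 300))
)

top5 <- students %>%
  count(sport, sort = TRUE) %>%  # one row per sport, counts in column `n`, largest first
  head(5)                        # keep the five most frequent sports
top5
```

Note that head(5) always returns at most five rows, whereas top_n(5, n) can return more when there are ties in n.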

Upvotes: 1
