Reputation: 11
I am a beginner in R and would be very thankful for a response as I am stuck on this code (this is my attempt at solving the problem but it does not work):
personal_spotify_df <- fromJSON("data/StreamingHistory0.json")
personal_spotify_df = personal_spotify_df %>%
mutate(minutesPlayed = msPlayed/1000/60)
personal_spotify_df_ranked <- personal_spotify_df %>%
group_by(artistName) %>%
filter(top_n(15, max(nrows())))
I have a dataframe (see below for a screenshot of how it's structured) which is my Spotify listening history. I want to group this dataframe by artist and then arrange the result to show the top 15 artists with the most songs listened to. I am stuck on how to get from grouping by artistName to actually filtering the dataframe down to the 15 most frequent artists.
Upvotes: 0
Views: 87
Reputation: 388807
In base R, you can use table(), sort() and head() to get the top 15 artists with their counts:
table(personal_spotify_df$artistName) |>
sort(decreasing = TRUE) |>
head(15) |>
stack()
The native pipe operator (|>) requires R 4.1 or later. If you have an older version, use the nested form instead:
stack(head(sort(table(personal_spotify_df$artistName), decreasing = TRUE), 15))
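To see what this kind of chain returns, here is a minimal sketch on made-up data. The artist names, the toy_df and top2 objects, and the cutoff of 2 instead of 15 are all invented for illustration, and the result is assembled with data.frame() rather than stack() just to give the columns explicit names.

```r
# Toy stand-in for the real listening history (invented values)
toy_df <- data.frame(
  artistName = c("A", "B", "A", "C", "B", "A"),
  msPlayed   = c(1000, 2000, 1500, 3000, 2500, 1200)
)

# Count plays per artist, sort in decreasing order, keep the top 2
top2 <- head(sort(table(toy_df$artistName), decreasing = TRUE), 2)

# Turn the table into a plain data frame with one row per artist
top2_df <- data.frame(
  artistName = names(top2),
  plays      = as.integer(top2)
)
print(top2_df)
```

Replacing 2 with 15 and toy_df with personal_spotify_df should give the same shape of result for the real data.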
Upvotes: 0
Reputation: 886938
We may use slice_max, with n specified as 15 and the ordering column created with add_count:
library(dplyr)
personal_spotify_df %>%
add_count(artistName, name = "Count") %>%
slice_max(n = 15, order_by = Count) %>%
select(-Count)
If we want to get only the top 15 distinct 'artistName',
personal_spotify_df %>%
count(artistName, name = "Count") %>%
slice_max(n = 15, order_by = Count)
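One thing to be aware of: slice_max keeps ties by default (with_ties = TRUE), so it can return more than n rows when several artists share the cutoff count. A small sketch on invented data, assuming dplyr is installed (toy_df, counts and the cutoff of 2 are made up for illustration):

```r
library(dplyr)

# Invented data: A appears 3 times, B and C twice each, D once
toy_df <- tibble(artistName = c("A", "A", "A", "B", "B", "C", "C", "D"))

counts <- toy_df %>%
  count(artistName, name = "Count")

# Default behaviour: B and C tie at the cutoff, so both are kept
with_ties <- counts %>%
  slice_max(order_by = Count, n = 2)

# with_ties = FALSE returns exactly n rows
no_ties <- counts %>%
  slice_max(order_by = Count, n = 2, with_ties = FALSE)

nrow(with_ties)
nrow(no_ties)
```

Whether to keep or drop ties depends on whether exactly 15 rows are needed or whether artists with equal counts should all be shown.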
Or an option with filter after arranging the rows by the count:
personal_spotify_df %>%
add_count(artistName) %>%
arrange(desc(n)) %>%
filter(artistName %in% head(unique(artistName), 15))
Upvotes: 2