Reputation: 1625
I have a data.frame with a head that looks like this:
> head(movies_by_yr)
Source: local data frame [6 x 4]
Groups: YR_Released [6]
               Movie_Title YR_Released Rating Num_Reviews
                    <fctr>      <fctr>  <dbl>       <int>
1 The Shawshank Redemption        1994    9.2     1773755
2            The Godfather        1972    9.2     1211083
3   The Godfather: Part II        1974    9.0      832342
4          The Dark Knight        2008    8.9     1755341
5             12 Angry Men        1957    8.9      477276
6         Schindler's List        1993    8.9      909358
Note that when created, I specified stringsAsFactors=FALSE, so I believe the columns that got converted to factors were converted when I grouped the data frame in preparation for the next step:
movies_by_yr <- group_by(problem1_data, YR_Released)
Now we come to the problem. The goal is to group by YR_Released so we can get counts of records by year. I thought the next step would be something like this, but it throws an error and I am not sure what I am doing wrong:
summarise(movies_by_yr, total = nrow(YR_Released))
I chose nrow because once you have a grouping, the number of rows within that grouping should be the count. Can someone point me to what I am doing wrong?
The error thrown is:
Error in summarise_impl(.data, dots) : Not a vector
But I know this data.frame was created from a series of vectors, and whatever the difference is between the sample code from class and my attempt, I am just not seeing it. Hoping someone can answer this ...
Upvotes: 1
Views: 1079
Reputation: 23014
Let's use data that everyone has, like the built-in mtcars data.frame, to make this more useful for future readers.
If you look at the documentation (?nrow), you'll see that function is meant to be called on a data.frame or matrix. You are calling it on a column, YR_Released. There is a vector-specific variant of nrow, called (confusingly) NROW; if you try that instead, it may work.
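To make the difference concrete, here is a small base-R sketch using the mtcars column cyl as a stand-in for YR_Released. The variable names (cyl_vec, vec_nrow, vec_NROW) are made up for illustration.

```r
# A single column pulled out of a data.frame is a plain vector with no "dim" attribute.
cyl_vec <- mtcars$cyl

# nrow() reads dim(x)[1], so on a plain vector it returns NULL.
vec_nrow <- nrow(cyl_vec)

# NROW() treats the vector as a one-column matrix and returns its length.
vec_NROW <- NROW(cyl_vec)

# On the whole data.frame both functions agree.
df_nrow <- nrow(mtcars)
df_NROW <- NROW(mtcars)

print(is.null(vec_nrow))
print(c(vec_NROW, df_nrow, df_NROW))
```

The NULL coming back from nrow() is most likely what summarise() is rejecting with "Not a vector", although I have not traced that through the dplyr internals.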
But even if it does, the intended dplyr way to count rows is with n(), like this:
library(dplyr)
mycars <- mtcars
mycars <- group_by(mycars, cyl)
summarise(mycars, total = n())
#> # A tibble: 3 x 2
#> cyl total
#> <dbl> <int>
#> 1 4 11
#> 2 6 7
#> 3 8 14
And because it's such a common use case, the wrapper function count() will save you some code:
mtcars %>%
  count(cyl)
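As a quick check that the two approaches agree, the sketch below runs both on mtcars and compares the counts. It assumes a dplyr version recent enough for count() to accept a name argument (0.8.0 or later, as far as I know). The object names by_cyl and counted are only for illustration.

```r
library(dplyr)

# Explicit version: group, then count the rows in each group with n().
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(total = n())

# Shortcut: count() groups and tallies in one step.
# `name` sets the count column's name (the default is "n").
counted <- mtcars %>%
  count(cyl, name = "total")

# Compare the per-group counts from both approaches.
same_counts <- identical(by_cyl$total, counted$total)
print(same_counts)
```

For the data in the question, the equivalent would be count(problem1_data, YR_Released), with no separate group_by() step needed.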
Upvotes: 1
Reputation: 1
Try this (I think it's what you want)
table(movies_by_yr$YR_Released)
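A minimal sketch of the same idea on the built-in mtcars data, in case you want the result as a data.frame rather than a table object. The names cyl_counts and counts_df, and the column labels "cyl" and "total", are my own choices for illustration.

```r
# table() tallies how often each distinct value occurs; no dplyr required.
cyl_counts <- table(mtcars$cyl)

# Convert the table to a data.frame and give the two columns readable names.
# The default column names from as.data.frame() on a 1-d table are Var1 and Freq.
counts_df <- setNames(as.data.frame(cyl_counts), c("cyl", "total"))

print(counts_df)
```

Note that the value column comes back as a factor here, so convert it if you need numeric years.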
Upvotes: 0