Konfu Chicken

Reputation: 127

Using dplyr to get counts

I want to get the count, mean, and standard deviation of certain variables after grouping them. The mean and standard deviation work, but adding the count gives me an error. This is the code I have:

NYC_Trees %>%
    group_by(Condition) %>%
    dplyr::summarise(mean = round(mean(Compensatory.Value), 2),
                     sd   = round(sd(Compensatory.Value), 2), 
                     count(NYC_Trees,Condition, wt = Compensatory.Value))

I get the error: cannot handle.

I want the output such as:

Condition    Native     N     Mean    Std

What am I doing wrong?

Upvotes: 0

Views: 391

Answers (1)

shadow

Reputation: 22333

count(NYC_Trees, Condition, wt = Compensatory.Value) should be the same as NYC_Trees %>% group_by(Condition) %>% summarise(n = sum(Compensatory.Value)). It returns a whole data frame (one row per Condition) rather than a single value per group, and therefore the summarise function cannot handle it.

So you could just put n = sum(Compensatory.Value) inside the summarise call, which gives the total of Compensatory.Value per group:

NYC_Trees %>%
    group_by(Condition) %>%
    dplyr::summarise(mean = round(mean(Compensatory.Value), 2),
                     sd   = round(sd(Compensatory.Value), 2), 
                     n = sum(Compensatory.Value))

Is that what you are trying to do? If you just want the number of rows in each group, you can use n = n() instead:

NYC_Trees %>%
    group_by(Condition) %>%
    dplyr::summarise(mean = round(mean(Compensatory.Value), 2),
                     sd   = round(sd(Compensatory.Value), 2), 
                     n = n())
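
To make the difference between the two versions concrete, here is a minimal sketch on a toy data frame. The `trees` data and its values are made up for illustration and only stand in for NYC_Trees; the column names are taken from the question. It computes both the row count and the weighted total side by side, and also runs `count()` with `wt` on its own to show that it matches the per-group sum.

```r
library(dplyr)

# Hypothetical stand-in for NYC_Trees; the values are invented for illustration.
trees <- data.frame(
  Condition = c("Good", "Good", "Poor", "Poor", "Poor"),
  Compensatory.Value = c(100, 200, 50, 70, 90)
)

result <- trees %>%
  group_by(Condition) %>%
  summarise(
    n     = n(),                                  # number of rows in each group
    total = sum(Compensatory.Value),              # what count(..., wt = ) computes
    mean  = round(mean(Compensatory.Value), 2),
    sd    = round(sd(Compensatory.Value), 2)
  )

# count() is used on its own, outside summarise(); its n column is the weighted sum.
weighted <- count(trees, Condition, wt = Compensatory.Value)

print(result)
print(weighted)
```

Here `n` is the group size, while `total` and `weighted$n` hold the same per-group sum, so which one you want depends on whether "count" means the number of trees or the summed value.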

Upvotes: 1
