Alexis

Reputation: 2294

Additional columns for a summarised DataFrame

I want to add columns to a summarised data frame that hold the totals for each level of a factor (here, sex).

bookplace <- data.frame(type = c("reading", "reading", "reading", "reading", "lending", "lending"), 
                        sex = c("male", "female", "male", "female", "male", "female"), 
                        usage = c(103, 102, 23, 14, 16, 8), 
                        date = c("1/1/18","1/1/18","1/1/18","1/1/18","1/1/18","1/1/18"),
                        stringsAsFactors = FALSE)

The result should be (considering male and female as the added columns):

year  type    users  male  female
2018  lending    24    16       8
2018  reading   242   126     116

I tried using mutate to add the columns and then summarise, with the following code:

bookplace %>% 
  mutate(males=count(sex=="male"),
         females=count(sex=="female")) %>%
  group_by(year=format(date,"%Y"), type) %>% 
  summarize(users=sum(usage))

But I get the following error message:

Error in UseMethod("groups") : no applicable method for 'groups' applied to an object of class "logical"

Please, any help will be greatly appreciated.

Upvotes: 0

Answers (2)

Omar Abd El-Naser

Reputation: 704

Here's an answer using dplyr:

library(dplyr)

bookplace <- data.frame(c("reading", "reading", "reading", 
                          "reading", "lending", "lending"), 
                        c("male", "female", "male", "female", "male", "female"), 
                        c(103, 102, 23, 14, 16, 8), 
                        c("1/1/18","1/1/18","1/1/18","1/1/18","1/1/18","1/1/18"))
colnames(bookplace) <- c("type","Gender","Usage","Year")
# Two-digit years need %y; %Y would read "18" as the year 0018
bookplace$Year <- as.Date(bookplace$Year, format = "%d/%m/%y")
# Group by the extracted year and type, then sum usage overall and per gender
check <- bookplace %>%
  group_by(Year = format(Year, "%Y"), type) %>%
  summarise(Users = sum(Usage),
            male = sum(Usage[Gender == "male"]),
            female = sum(Usage[Gender == "female"]))

I got the idea from this question: Summarize with conditions in dplyr
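
The per-gender sums can also be cross-checked in base R. The following is only a sketch, assuming the column names from the question (type, sex, usage) rather than the renamed ones above; the names by_sex and users are made up for illustration.

```r
# Data as given in the question
bookplace <- data.frame(
  type  = c("reading", "reading", "reading", "reading", "lending", "lending"),
  sex   = c("male", "female", "male", "female", "male", "female"),
  usage = c(103, 102, 23, 14, 16, 8),
  date  = c("1/1/18", "1/1/18", "1/1/18", "1/1/18", "1/1/18", "1/1/18"),
  stringsAsFactors = FALSE
)

# Sum usage for every combination of type (rows) and sex (columns)
by_sex <- xtabs(usage ~ type + sex, data = bookplace)

# Row totals correspond to the "users" column in the desired result
users <- rowSums(by_sex)

# Combine the totals and the per-sex sums into one matrix
print(cbind(users, by_sex))
```

This does not handle the year column; it only confirms the totals that the conditional sums should produce.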

Upvotes: 2

neilfws

Reputation: 33782

A tidyverse solution, assuming that the dates are in %m/%d/%y format. If not, change the format string accordingly.

library(dplyr)
library(tidyr)

bookplace %>% 
  mutate(year = format(as.Date(date, "%m/%d/%y"), "%Y")) %>% 
  group_by(year, sex, type) %>% 
  summarise(Total = sum(usage)) %>% 
  ungroup() %>% 
  spread(sex, Total) %>% 
  mutate(users = female + male)

Result:

# A tibble: 2 x 5
  year  type    female  male users
  <chr> <chr>    <dbl> <dbl> <dbl>
1 2018  lending      8    16    24
2 2018  reading    116   126   242
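
Since tidyr 1.0.0, spread() is superseded by pivot_wider(). A possible equivalent is sketched below, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the .groups argument), and the same %m/%d/%y date format as above; the name result is just for illustration.

```r
library(dplyr)
library(tidyr)

# Data as given in the question
bookplace <- data.frame(
  type  = c("reading", "reading", "reading", "reading", "lending", "lending"),
  sex   = c("male", "female", "male", "female", "male", "female"),
  usage = c(103, 102, 23, 14, 16, 8),
  date  = c("1/1/18", "1/1/18", "1/1/18", "1/1/18", "1/1/18", "1/1/18"),
  stringsAsFactors = FALSE
)

result <- bookplace %>%
  # Extract the four-digit year from the date column (assumed %m/%d/%y)
  mutate(year = format(as.Date(date, "%m/%d/%y"), "%Y")) %>%
  group_by(year, type, sex) %>%
  # Total usage per year, type and sex; drop the grouping afterwards
  summarise(Total = sum(usage), .groups = "drop") %>%
  # One column per sex, filled with the totals
  pivot_wider(names_from = sex, values_from = Total) %>%
  mutate(users = female + male)

print(result)
```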

Upvotes: 1
