Reputation: 84
So I have this dataframe:
# A tibble: 268 x 7
   Age   Facebook_likes Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
   <chr>          <dbl>           <dbl>         <dbl>        <dbl>        <dbl>
 1 18-24              1               1             0            0            0
 2 <18                0               0             0            0            0
 3 18-24              1               1             1            0            0
 4 18-24              0               0             0            0            0
 5 18-24              0               0             0            0            0
 6 25-34              0               1             0            0            0
 7 18-24              1               1             0            0            0
 8 18-24              0               1             0            0            0
 9 25-34              0               0             0            0            1
10 18-24              1               0             0            0            0
# ... with 258 more rows, and 1 more variable:
The Age variable has only four distinct values (<18, 18-24, 25-34, >35). I want to transform this dataframe so that it has just those four rows, with each likes column holding the sum for that age group. For example, the first cell (first row, first column) would hold the sum of Facebook likes for those who are <18:
  Age   Facebook_likes Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
  <chr>          <dbl>           <dbl>         <dbl>        <dbl>        <dbl>
1 <18   sum(Facebook_likes for those <18)
2 18-24
3 25-34
4 >35
Upvotes: 1
Views: 35
Reputation: 8880
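data.table
library(data.table)
# Pick the columns to sum by name pattern ...
cols_likes <- grep("_likes$", names(df), value = TRUE)
# ... or flag every numeric column instead
cols_likes <- sapply(df, is.numeric)
# Convert df to a data.table by reference, then sum each selected column within each Age group
setDT(df)[, lapply(.SD, sum, na.rm = TRUE), by = Age, .SDcols = cols_likes]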
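To see what this returns, here is a minimal sketch on a made-up data frame. The values below are invented for illustration and are not the asker's data; only the column naming follows the question.

```r
library(data.table)

# Toy data, invented for illustration only
df <- data.frame(
  Age = c("18-24", "<18", "18-24", "25-34"),
  Facebook_likes = c(1, 0, 1, 0),
  Instagram_likes = c(1, 0, 1, 1)
)

# Columns whose names end in "_likes"
cols_likes <- grep("_likes$", names(df), value = TRUE)

# One row per Age group; each likes column is summed within the group
res <- setDT(df)[, lapply(.SD, sum, na.rm = TRUE), by = Age, .SDcols = cols_likes]
print(res)
```

With by = Age the groups appear in order of first appearance in the data, not in age order.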
Upvotes: 1
Reputation: 887128
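We can use summarise with across from the tidyverse after grouping by 'Age':
library(dplyr)
df1 %>%
  group_by(Age) %>%
  # Sum every numeric column within each Age group.
  # Lambda form: passing na.rm through across(...) is deprecated since dplyr 1.1.0.
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))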
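If the rows should come out in the order shown in the question (<18 first, >35 last), one option is to convert Age to a factor with explicit levels before grouping. This is only a sketch: the toy values are invented, and the level labels (in particular ">35") are assumed from the question's desired output.

```r
library(dplyr)

# Toy data, invented for illustration only
df1 <- tibble(
  Age = c("18-24", "<18", "18-24", "25-34", ">35"),
  Facebook_likes = c(1, 0, 1, 0, 1),
  Instagram_likes = c(1, 0, 1, 1, 0)
)

# Assumed labels, in the order the question wants them displayed
age_levels <- c("<18", "18-24", "25-34", ">35")

res <- df1 %>%
  mutate(Age = factor(Age, levels = age_levels)) %>%  # fix the group order
  group_by(Age) %>%
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))

print(res)
```

Because Age is a factor here, summarise returns the groups in level order rather than in the sort order of the character strings.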
Upvotes: 2