wageeh

Reputation: 84

Grouping and summing similar rows in R

So I have this dataframe:

# A tibble: 268 x 7
 Age   Facebook_likes Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
 <chr>          <dbl>           <dbl>         <dbl>        <dbl>        <dbl>
1 18-24              1               1             0            0            0
2 <18                0               0             0            0            0
3 18-24              1               1             1            0            0
4 18-24              0               0             0            0            0
5 18-24              0               0             0            0            0
6 25-34              0               1             0            0            0
7 18-24              1               1             0            0            0
8 18-24              0               1             0            0            0
9 25-34              0               0             0            0            1
10 18-24              1               0             0            0            0
# ... with 258 more rows, and 1 more variable:

The Age variable has only 4 distinct values (<18, 18-24, 25-34, >35). I want to transform this dataframe so that it has only those 4 rows, with each variable being the sum for that age group. For example, the first cell (first row, first column) would hold the sum of Facebook likes for those who are <18:

   Age   Facebook_likes                     Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
   <chr>          <dbl>                              <dbl>         <dbl>        <dbl>        <dbl>
 1 <18    sum(Facebook_likes for those <18)
 2 18-24
 3 25-34
 4 >35

Upvotes: 1

Views: 35

Answers (2)

Yuriy Saraykin

Reputation: 8880

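Using data.table:

library(data.table)

# Select the columns to sum by name ...
cols_likes <- grep("_likes$", names(df), value = TRUE)

or

# ... or by type (a logical vector marking the numeric columns)
cols_likes <- sapply(df, is.numeric)

# setDT() converts df to a data.table by reference;
# then sum each selected column within each Age group
setDT(df)[, lapply(.SD, sum, na.rm = TRUE), by = Age, .SDcols = cols_likes]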
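To see what this returns, here is a minimal sketch on made-up data. The five rows and two like columns below are hypothetical stand-ins for the 268-row tibble in the question, and `res` is just an illustrative name. It uses `as.data.table()` instead of `setDT()` so that the toy `df` is copied rather than converted in place.

```r
library(data.table)

# Hypothetical toy data, not the asker's real data
df <- data.frame(
  Age             = c("18-24", "<18", "18-24", "25-34", "<18"),
  Facebook_likes  = c(1, 0, 1, 0, 1),
  Instagram_likes = c(1, 0, 1, 1, NA),
  stringsAsFactors = FALSE
)

# Columns whose names end in "_likes"
cols_likes <- grep("_likes$", names(df), value = TRUE)

# One row per Age value, each like column summed within the group;
# na.rm = TRUE drops the missing Instagram value before summing
res <- as.data.table(df)[, lapply(.SD, sum, na.rm = TRUE),
                         by = Age, .SDcols = cols_likes]
print(res)
```

The result has one row per distinct Age value present in the data, so with all four age groups present you would get the 4-row table asked for.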

Upvotes: 1

akrun

Reputation: 887128

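We can use summarise with across from the tidyverse after grouping by 'Age' (df1 is the data frame from the question):

library(dplyr)
df1 %>%
  group_by(Age) %>%
  # sum every numeric column within each Age group, ignoring NAs
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))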
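A small self-contained run of the same idea, assuming dplyr 1.0.0 or later. The toy `df1` below is hypothetical, not the real data, and `res` is an illustrative name. The sketch wraps `sum` in a formula-style anonymous function, since recent dplyr versions deprecate passing extra arguments such as `na.rm` through `across()` directly.

```r
library(dplyr)

# Hypothetical toy data, not the asker's real data
df1 <- tibble(
  Age             = c("18-24", "<18", "18-24", "25-34", "<18"),
  Facebook_likes  = c(1, 0, 1, 0, 1),
  Instagram_likes = c(1, 0, 1, 1, NA)
)

res <- df1 %>%
  group_by(Age) %>%
  # where(is.numeric) picks every numeric column, so Age (character) is skipped
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))

print(res)
```

Because the columns are picked by type, any numeric column that is not a likes count would be summed as well; `across(ends_with("_likes"), ...)` would restrict the sum to the like columns by name.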

Upvotes: 2
