Reputation: 125
I have a dataframe which looks like the following:
Type June 23 June 24 July 23 July 24 August 23 August 24
A 8 2 20 20 30 25
B 8 2 20 20 30 25
Now I want to have the sum over the columns per month (so disregarding the week number). So, the result should look like this
Type June July August
A 10 40 55
B 10 40 55
I try do this this using dplyr, but I do not know how to match only the first part of the column names. Does anyone have an idea how to do this?
Upvotes: 1
Views: 193
Reputation: 9277
Another possible solution implying the tidyverse
universe could be
library(stringr)
library(tidyr)
library(dplyr)
dat %>%
pivot_longer(cols = -Type) %>%
mutate(Month = str_extract(name, "\\D+")) %>%
group_by(Type, Month) %>%
summarise(Sum = sum(value)) %>%
pivot_wider(id_cols = Type, names_from = Month, values_from = Sum)
Output
# A tibble: 2 x 4
# Groups: Type [2]
# Type August July June
# <chr> <int> <int> <int>
# 1 A 55 40 10
# 2 B 55 40 10
Upvotes: 2
Reputation: 389275
In base R, you can use split.default
to split data based on similar part of column names.
In your example, if we remove numbers from column names we get the month name, pass it to split.default
and sum each part using rowSums
.
cbind(df[1],sapply(split.default(df[-1], sub('\\d+', '',names(df[-1]))), rowSums))
# Type August July June
#1 A 55 40 10
#2 B 55 40 10
data
df <- structure(list(Type = c("A", "B"), June23 = c(8L, 8L), June24 = c(2L,
2L), July23 = c(20L, 20L), July24 = c(20L, 20L), August23 = c(30L,
30L), August24 = c(25L, 25L)), class = "data.frame", row.names = c(NA, -2L))
Upvotes: 2