Reputation: 939
Considering this tibble
library(tibble)
mytab <- tribble(
~siren_ent, ~nbeta_dep01, ~nbeta_dep02, ~effeta_dep01, ~effeta_dep02, ~categ,
"A", 3, 0, 50, 0, "X",
"B", 0, 2, 0, 30, "X",
"C", 1, 1, 10, 15, "Y"
)
I would like to pivot it to get this result...
> result
# A tibble: 4 x 4
DEP categ nbeta effeta
<chr> <chr> <dbl> <dbl>
1 01 X 3 50
2 01 X 1 10
3 02 Y 2 30
4 02 Y 1 15
If you have a solution with tidyverse, I'm interested.
Many thanks in advance !
Upvotes: 1
Views: 55
Reputation: 887531
Using melt
from data.table
library(data.table)
melt(setDT(mytab)[, siren_ent := NULL],
id.var = 'categ', measure = patterns("^nbeta", "^effeta"),
value.name = c('nbeta', 'effeta'), variable.name = 'dep')[nbeta != 0]
# categ dep nbeta effeta
#1: X 1 3 50
#2: Y 1 1 10
#3: X 2 2 30
#4: Y 2 1 15
Upvotes: 1
Reputation: 4658
I assume the second of your effeta_dep01
columns should, in fact, be called effeta_dep02
. If so, then this gives something close to the desired output:
library(tidyr)
library(dplyr)
mytab %>%
pivot_longer(nbeta_dep01:effeta_dep02,
names_sep = "_dep",
names_to = c(".value", "dep")) %>%
filter(nbeta > 0) %>%
select(-siren_ent)
which gives
# A tibble: 4 x 4
categ dep nbeta effeta
<chr> <chr> <dbl> <dbl>
1 X 01 3 50
2 X 02 2 30
3 Y 01 1 10
4 Y 02 1 15
I assume your expected output is incorrect, but if not, I'd be happy to know what transform you expect.
Upvotes: 3
Reputation: 39613
Try this also:
library(tidyverse)
#Code
new <- mytab %>% pivot_longer(-c(siren_ent,categ)) %>%
separate(name,sep='_',c('var','index')) %>%
select(-index) %>%
group_by(siren_ent,categ,var) %>%
mutate(id=row_number()) %>%
pivot_wider(names_from = var,values_from=value) %>%
ungroup() %>% select(-id) %>%
filter(nbeta>0)
Output:
# A tibble: 4 x 4
siren_ent categ nbeta effeta
<chr> <chr> <dbl> <dbl>
1 A X 3 50
2 B X 2 30
3 C Y 1 10
4 C Y 1 15
Upvotes: 1
Reputation: 1724
There is a duplicate column name in your example data and also several zeroes that do not appear in the final result. By combining two pivots and a group_by we can reach the desired result:
library(tidyr)
library(dplyr)
library(readr)
mytab %>% select(-siren_ent) %>%
pivot_longer(-categ) %>%
separate(name, '_', into = c('name','dep')) %>%
mutate(dep = parse_number(dep)) %>%
group_by(categ, dep, name) %>%
summarise(value = max(value)) %>%
pivot_wider(names_from = 'name', values_from = 'value')
# A tibble: 4 x 4
# Groups: categ, dep [4]
categ dep effeta nbeta
<chr> <dbl> <dbl> <dbl>
1 X 1 50 3
2 X 2 30 2
3 Y 1 10 1
4 Y 2 15 1
Upvotes: 1