Reputation: 43
assuming I have a data set that looks something like this:
df <- data.frame("age" = c(55, 21, 11),
"sex" = c("m", "m", "f"),
"a2" = c(0,1,0),
"a3" = c(0,0,0),
"a4" = c(0,1,0),
"b2" = c(0,0,1),
"b3" = c(0,0,0),
"b4" = c(1,0,1),
"c2" = c(0,1,0)
)
age sex a2 a3 a4 b2 b3 b4 c2
55 m 0 0 0 0 0 1 0
21 m 1 0 1 0 0 0 1
11 f 0 0 0 1 0 1 0
How would I be able to transform it into long format and change the values of columns a1-c1 so that every column has the value of its name? e. g. in a2 0=1 and 1=2?
I tried it with pivot_longer like this:
df %>% pivot_longer(
cols = stars_with("a"),
names_to = "A"
)
My desired output looks like this:
age sex a2 a3 a4 b2 b3 b4 c2
55 m 1 1 1 1 1 4 1
21 m 2 1 4 1 3 1 2
11 f 1 1 1 2 1 4 1
Thanks!
Upvotes: 4
Views: 341
Reputation: 388982
Base R option using Map
:
inds <- -(1:2)
cols <- as.integer(sub('[a-z](\\d+)', '\\1', names(df)[inds]))
df[inds] <- Map(function(x, y) ifelse(x == 1, y, 1), df[inds], cols)
df
# age sex a2 a3 a4 b2 b3 b4 c2
#1 55 m 1 1 1 1 1 4 1
#2 21 m 2 1 4 1 1 1 2
#3 11 f 1 1 1 2 1 4 1
Upvotes: 2
Reputation: 26218
pivot_longer
is not appropriate strategy here. Follow mutate
with across.
across
will pass desired columns into mutate statementcur_column()
will pass current colnames into syntaxlibrary(dplyr)
library (stringr)
df %>% mutate(across(-c("age", "sex"), ~ ifelse(. == 0, 1, as.numeric(str_extract(cur_column(), "\\d$")))))
age sex a2 a3 a4 b2 b3 b4 c2
1 55 m 1 1 1 1 1 4 1
2 21 m 2 1 4 1 1 1 2
3 11 f 1 1 1 2 1 4 1
Upvotes: 4
Reputation: 8506
In base R you could do:
# extract numbers from column names
vals <- as.numeric(gsub("[^0-9]+", "", colnames(df)[-c(1:2)]))
# replace 0 with 1 and 1 with extracted number in each column
df[, -c(1:2)] <- sapply(seq_along(vals), function(x) ifelse(df[, x+2]==0, 1, vals[x]))
Upvotes: 2
Reputation: 1080
This is a good question because you cannot access column names within mutate_at
, which is how you would normally mutate across a list of variables. Here is a solution using map_dfc
from purrr
:
library(purrr)
library(dplyr)
library(tidyr)
library(readr)
We define a function (although not necessary but helpful to avoid foreign syntax if you are not accustomed to purrr
.
my_fun <- function(x) transmute(df, !!x := ifelse(!!sym(x) == 0, 1, parse_number(x)))
transmute
to avoid duplicate columns
!!x :=
is a way to assign a column name based on a string from a variable. In our case, a column name.
!!sym()
converts our variable name from a string to a symbol. Without this, for column 'a2' this would read "a2" == 0
thus replacing with the column number. Refer to this answer for an in-depth explanation.
parse_number
will find the number within the column name
ifelse
states if the column value is equal to 0, then the value equals 1, otherwise the value is equal to the number in the column name
var_names <- c("a2","a3","a4","b2","b3","b4","c2")
map_dfc(var_names, my_fun)
Upvotes: 4
Reputation: 194
One possibility (though may not be the most efficient)
library(tidyverse)
as_tibble(df) %>% transmute(
a2 = case_when(a2 == 1 ~ 2, a2 == 0 ~ 1),
a3 = case_when(a3 == 1 ~ 3, a3 == 0 ~ 1),
a4 = case_when(a4 == 1 ~ 4, a4 == 0 ~ 1),
b2 = case_when(b2 == 1 ~ 2, b2 == 0 ~ 1),
b3 = case_when(b3 == 1 ~ 3, b3 == 0 ~ 1),
b4 = case_when(b4 == 1 ~ 4, b4 == 0 ~ 1),
c2 = case_when(c2 == 1 ~ 2, c2 == 0 ~ 1)
)
If you wish to keep the other columns use mutate()
instead of transmute()
.
Upvotes: 0