Reputation: 7517
I'm trying to make 4 variables (id, uerate, lnw, exper
) in my data.frame (d
) into long format.
I also want to add two binary (0/1
) columns called DL
and DE
; if the long-formatted value
represents lnw
the DL==1
and DE==0
and vice versa.
I was wondering how to correctly specify tidyr::pivot_longer()
to obtain my EXPECTED OUTPUT below?
d <- read.csv("https://stats.idre.ucla.edu/wp-content/uploads/2016/02/wages_pp-1.txt")
select(d, id, uerate, lnw, exper) %>% pivot_longer(everything()) # tried without success
#### EXPECTED OUTPUT:
## id uerate variable value DE DL
## 1 31 3.21 lnw 1.491 0 1
## 2 31 3.21 lnw 1.433 0 1
## 3 31 3.21 lnw 1.469 0 1
## 4 31 3.29 lnw 1.749 0 1
## 5 31 2.90 lnw 1.931 0 1
## 6 31 2.50 lnw 1.709 0 1
## 7 31 2.60 lnw 2.086 0 1
## 8 31 4.79 lnw 2.129 0 1
## 6403 31 3.21 exper 0.015 1 0
## 6404 31 3.21 exper 0.715 1 0
## 6405 31 3.21 exper 1.734 1 0
## 6406 31 3.29 exper 2.773 1 0
## 6407 31 2.90 exper 3.927 1 0
## 6408 31 2.50 exper 4.946 1 0
## 6409 31 2.60 exper 5.965 1 0
## 6410 31 4.79 exper 6.984 1 0
Upvotes: 1
Views: 53
Reputation: 887991
We select
the columns of interest, and instead of everything()
in the columns to select with cols
, it should select columns other than 'id', 'uerate' (-c(id, uerate)
), then mutate
to create new columns 'DE' and 'DL'
library(dplyr)
library(tidyr)
d %>%
select(id, uerate, lnw, exper) %>%
pivot_longer(cols = -c(id, uerate), names_to = 'variable') %>%
mutate(DE = +(variable == 'exper'), DL = +(!DE)) %>%
arrange(id, variable)
-output
# A tibble: 12,804 x 6
# id uerate variable value DE DL
# <int> <dbl> <chr> <dbl> <int> <int>
# 1 31 3.22 exper 0.015 1 0
# 2 31 3.22 exper 0.715 1 0
# 3 31 3.22 exper 1.73 1 0
# 4 31 3.30 exper 2.77 1 0
# 5 31 2.90 exper 3.93 1 0
# 6 31 2.50 exper 4.95 1 0
# 7 31 2.60 exper 5.96 1 0
# 8 31 4.80 exper 6.98 1 0
# 9 31 3.22 lnw 1.49 0 1
#10 31 3.22 lnw 1.43 0 1
# … with 12,794 more rows
Upvotes: 1