Scott Jackson

Reputation: 431

Appending columns/variables from a data frame into a new variable

I've been searching for how to do this but can't find an example that fits my question. I'm pretty new to R but very familiar with SAS, so I'd like to know how to do the equivalent of this SAS code in R.

I have one dataset (cohort) with two variables (last_pre_cv_prob and first_post_cv_prob). I want to make a new dataset with two variables: the first (cv_prob) holds the values of the two original variables stacked one beneath the other, and the second (time) indicates which variable each value came from. In SAS I would simply do this:

data post_cv;
    set cohort(keep=last_pre_cv_prob rename=(last_pre_cv_prob=cv_prob) in=a)
    cohort(keep=first_post_cv_prob rename=(first_post_cv_prob=cv_prob) in=b);
    if b then time='post';
    if a then time='pre';
run;

How would I do this in R?

Thanks!

edit:

post_cv2 %>% gather(column, prob, last_pre_cv_prob, first_post_cv_prob)

Error in eval(expr, envir, enclos) : object 'last_pre_cv_prob' not found

Then I tried:

post_cv2 %>% 
  gather(column, prob, liver_cv$last_pre_cv_prob, liver_cv$first_post_cv_prob)

Error: All select() inputs must resolve to integer column positions.
The following do not:
*  liver_cv$last_pre_cv_prob
*  liver_cv$first_post_cv_prob

edit:

Second issue resolved: I had to wrap the column names in backticks.

post_cv2 <- post_cv %>% 
  gather(time, cv_prob, `liver_cv$last_pre_cv_prob`, 
         `liver_cv$first_post_cv_prob`)
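
The post doesn't show how post_cv was built, so the following is only a guess at why the backticks were needed. If a tibble is created from unnamed `df$col` arguments, the whole expression becomes the column name, and that name is not a syntactic R name. A minimal sketch, with a made-up stand-in for liver_cv:

```r
library(tibble)

# Hypothetical stand-in for liver_cv; the values are invented for illustration.
liver_cv <- data.frame(last_pre_cv_prob = c(0.2, 0.4),
                       first_post_cv_prob = c(0.6, 0.8))

# Unnamed arguments: tibble() uses the full expression as the column name,
# so the columns can only be referred to with backticks.
unnamed <- tibble(liver_cv$last_pre_cv_prob, liver_cv$first_post_cv_prob)
names(unnamed)

# Named arguments give plain names that need no quoting.
named <- tibble(pre = liver_cv$last_pre_cv_prob,
                post = liver_cv$first_post_cv_prob)
names(named)
```

Naming the columns up front avoids the backticks entirely.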

edit:

Solved!

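library(tidyverse)

# Name the columns explicitly so they need no backticks later.
# tibble() replaces the deprecated data_frame().
post_cv <- tibble(pre = liver_cv$last_pre_cv_prob, post = liver_cv$first_post_cv_prob)

# Stack pre and post into cv_prob; time records the source column.
post_cv2 <- post_cv %>% 
  gather(time, cv_prob, pre, post)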
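
For readers on a newer tidyr (1.0.0 or later, which is an assumption about the installed version), gather() is documented as superseded by pivot_longer(). A rough equivalent of the reshaping step might look like this; the input values are made up:

```r
library(tidyr)

# Made-up data standing in for the pre/post columns.
post_cv <- tibble::tibble(pre = c(0.2, 0.4), post = c(0.6, 0.8))

# One output row per (input row, column) pair.
post_cv2 <- pivot_longer(post_cv, cols = c(pre, post),
                         names_to = "time", values_to = "cv_prob")

# pivot_longer() interleaves pre and post row by row; reorder if the
# stacked layout (all pre, then all post) is wanted.
stacked <- post_cv2[order(match(post_cv2$time, c("pre", "post"))), ]
```

Unlike gather(), pivot_longer() does not stack whole columns, which is why the reordering step is there.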

Upvotes: 0

Views: 46

Answers (1)

yeedle

Reputation: 5008

You can simply gather the two columns and then extract the time label ("pre" or "post") from the original column names:


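library(tidyverse)  # str_extract() comes from stringr; load it separately if your tidyverse does not attach it

# Example data; tibble() replaces the deprecated data_frame()
cohort <- tibble(last_pre_cv_prob = runif(5),
                 first_post_cv_prob = runif(5))

# Stack both columns into cv_prob, then reduce the column name to "pre" or "post"
cohort_2 <- cohort %>% 
  gather(time, cv_prob, last_pre_cv_prob, first_post_cv_prob) %>%
  mutate(time = str_extract(time, "post|pre"))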

cohort_2
#> # A tibble: 10 × 2
#>     time    cv_prob
#>    <chr>      <dbl>
#> 1    pre 0.64527372
#> 2    pre 0.55086818
#> 3    pre 0.05882369
#> 4    pre 0.19626147
#> 5    pre 0.05933594
#> 6   post 0.25564350
#> 7   post 0.01908338
#> 8   post 0.84901506
#> 9   post 0.07761842
#> 10  post 0.29019190
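
If you'd rather mirror the SAS `set` statement directly, one option is to stack two small data frames with base R's rbind(). This is just a sketch with made-up values, not the only way to do it:

```r
# Made-up data standing in for the cohort dataset.
cohort <- data.frame(last_pre_cv_prob = c(0.1, 0.2, 0.3),
                     first_post_cv_prob = c(0.4, 0.5, 0.6))

# Build one data frame per source column, tag it, and stack them,
# much like listing the dataset twice in a SAS set statement.
post_cv <- rbind(
  data.frame(cv_prob = cohort$last_pre_cv_prob, time = "pre",
             stringsAsFactors = FALSE),
  data.frame(cv_prob = cohort$first_post_cv_prob, time = "post",
             stringsAsFactors = FALSE)
)
```

This needs no packages, at the cost of repeating the data.frame() call for each source column.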

Upvotes: 2
