making a dataset of multiple x values in one y value

Question

I have a correlation dataset that looks like this:

V1    V2    R2    
 1    2    0.4    
 1    3    0.5    
 3    5    0.3

I want to convert it to a two-column data frame in which one y value (column R2) has multiple x values (column V), for scatter plotting. It would look like this:

How can I do this in R?

alistaire · Accepted Answer

In the tidyverse, you can make a list column of the required vectors with purrr::map2 to iterate seq over each pair of start and end points, and then expand with tidyr::unnest:

df <- data.frame(V1 = c(1L, 1L, 3L), 
                 V2 = c(2L, 3L, 5L), 
                 R2 = c(0.4, 0.5, 0.3))

library(tidyverse)

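# tidyr >= 1.0 warns unless the list column to unnest is named;
# newer versions may also print V before R2
df %>% transmute(V = map2(V1, V2, seq), R2) %>% unnest(V)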
#>    R2 V
#> 1 0.4 1
#> 2 0.4 2
#> 3 0.5 1
#> 4 0.5 2
#> 5 0.5 3
#> 6 0.3 3
#> 7 0.3 4
#> 8 0.3 5

In base R, there isn't a simple equivalent of unnest, so it's easier to use Map (the multivariate lapply, roughly equivalent to purrr::map2 above) to build a list of data frames, each complete with its R2 value (recycled by data.frame), which can then be combined into a single data frame with do.call(rbind, ...):

do.call(rbind, 
        Map(function(v1, v2, r2){data.frame(V = v1:v2, R2 = r2)}, 
            df$V1, df$V2, df$R2))
#>   V  R2
#> 1 1 0.4
#> 2 2 0.4
#> 3 1 0.5
#> 4 2 0.5
#> 5 3 0.5
#> 6 3 0.3
#> 7 4 0.3
#> 8 5 0.3

Check out the intermediate products of each to get a feel for how they work.
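
For instance, the intermediate steps can be pulled apart roughly like this. This is only a sketch in base R; the names ranges, pieces and result are made up here for illustration and are not part of the answer above. Map(seq, ...) should produce the same vectors that map2(V1, V2, seq) stores in the list column.

```r
df <- data.frame(V1 = c(1L, 1L, 3L),
                 V2 = c(2L, 3L, 5L),
                 R2 = c(0.4, 0.5, 0.3))

# Step 1: one integer sequence per row, from V1 to V2
# ("ranges" is an illustrative name)
ranges <- Map(seq, df$V1, df$V2)
str(ranges)

# Step 2: one small data frame per row; data.frame() recycles the
# single r2 value to the length of v1:v2
pieces <- Map(function(v1, v2, r2) data.frame(V = v1:v2, R2 = r2),
              df$V1, df$V2, df$R2)
pieces[[2]]

# Step 3: stack the per-row data frames into one
result <- do.call(rbind, pieces)
result
```

Printing ranges and pieces before the final rbind shows that each input row becomes its own vector (or data frame) first and is only stacked at the very end.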
