Mleinss
Mleinss

Reputation: 37

How to create a tibble with repeated and unrepeated measures of several variables in r?

I would like to create a tibble with a mix of repeated and unrepeated measures of variables along with the dates when they were measured;

To create the tibble, I could write vectors with every combination of variable-date and measurement, but I am wondering if there is a way to do this more efficiently, since 3 variables are repeatedly taken for 16 times. I've written this chunk of code with variables, dates and measurements to start with, but I'm stuck there. Any suggestions?

library (tidyverse)
variables <- c(var1, var2, var3, var4, var5, var6, var7, var8, var9)
mydates <- c(2013-04-15,
             2013-04-16,
             2013-04-17,
             2013-04-22,
             2013-04-25,
             2013-04-29,
             2013-05-02,
             2013-05-06,
             2013-05-09,
             2013-05-13,
             2013-05-16,
             2013-05-20,
             2013-05-23,
             2013-05-27,
             2013-05-30,
             2013-06-03,
             2013-07-04,  
             2013-07-08)
measurements <- c(3.2, 4.6, 1.1, 3.0, 3.6, 1.6, 1.4, 1.4, 4.8, 3.5, 4.0, 
2.7, 1.4, 2.9, 2.4, 3.6, 3.7, 4.3, 3.6, 3.5, 4.7, 1.8, 3.5, 2.4, 2.1, 1.2,
2.3, 3.9, 1.6, 2.8, 5.0, 2.4, 2.2, 2.9, 1.8, 1.7, 4.4, 3.9, 4.4, 2.6, 1.7, 
4.2, 3.4, 4.4, 4.7, 5.0, 3.0, 3.7, 2.1, 2.9, 4.5, 1.5, 2.2, 2.9)

tibble (variables, mydates, measurements)

I would like a tibble that looks like this, with my first 3 variables each repeated 16 times, my 16 first dates each repeated 3 times and the measurements:

variables   mydates     measurements
var1        2013-04-15  3.2
var2        2013-04-15  4.6
var3        2013-04-15  1.1
var1        2013-04-16  3.0
var2        2013-04-16  3.6
var3        2013-04-16  1.6
var1        2013-04-17  1.4
var2        2013-04-17  1.4
var3        2013-04-17  4.8
...         ...         ...  # measurements for var1, var2, var3 were repeatedly taken during the 16 first dates in the vector mydates.
var4        2013-07-04  2.1
var5        2013-07-04  2.9
var6        2013-07-04  4.5
var7        2013-07-04  1.5
var8        2013-07-08  2.2
var9        2013-07-08  2.9

Upvotes: 0

Views: 155

Answers (1)

patL
patL

Reputation: 2299

Here is an (bit 'dirty') alternative using base::expand.grid and lubridate. I transformed your mydates vector into a date class object.

Once you get all combinations between variables and mydates, you can bind measurements and transform into a tibble using as_data_frame.

library (tidyverse)
library(lubridate) 


variables <- c("var1", "var2", "var3", "var4", "var5", "var6", "var7", "var8", "var9")

mydates <- c("2013-04-15",
             "2013-04-16",
             "2013-04-17",
             '2013-04-22',
             '2013-04-25',
             '2013-04-29',
             "2013-05-02",
             "2013-05-06",
             "2013-05-09",
             "2013-05-13",
             "2013-05-16",
             "2013-05-20",
             "2013-05-23",
             "2013-05-27",
             "2013-05-30",
             "2013-06-03",
             "2013-07-04",  
             "2013-07-08") %>% 
  as_date()

measurements <- c(3.2, 4.6, 1.1, 3.0, 3.6, 1.6, 1.4, 1.4, 4.8, 3.5, 4.0, 
                  2.7, 1.4, 2.9, 2.4, 3.6, 3.7, 4.3, 3.6, 3.5, 4.7, 1.8, 3.5, 2.4, 2.1, 1.2,
                  2.3, 3.9, 1.6, 2.8, 5.0, 2.4, 2.2, 2.9, 1.8, 1.7, 4.4, 3.9, 4.4, 2.6, 1.7, 
                  4.2, 3.4, 4.4, 4.7, 5.0, 3.0, 3.7, 2.1, 2.9, 4.5, 1.5, 2.2, 2.9)



mydata <- expand.grid(vars = variables[1:3], 
                      dates = mydates[month(mydates) < 7]) %>% 
  rbind(expand.grid(vars = variables[4:7], 
                    dates = mydates[month(mydates) == 7 & day(mydates) == 4])) %>% 
  rbind(expand.grid(vars = variables[8:9], 
                    dates = mydates[month(mydates) == 7 & day(mydates) == 8])) %>% 
  mutate(measures = measurements) %>% 
  as_data_frame()

And the output would be:

mydata

## A tibble: 54 x 3
#   vars  dates      measures
#   <fct> <date>        <dbl>
# 1 var1  2013-04-15      3.2
# 2 var2  2013-04-15      4.6
# 3 var3  2013-04-15      1.1
# 4 var1  2013-04-16      3  
# 5 var2  2013-04-16      3.6
# 6 var3  2013-04-16      1.6
# 7 var1  2013-04-17      1.4
# 8 var2  2013-04-17      1.4
# 9 var3  2013-04-17      4.8
#10 var1  2013-04-22      3.5
## ... with 44 more rows

Upvotes: 1

Related Questions