Mons2us
Mons2us

Reputation: 332

How can I use merge so that I have data for all times?

I'm trying to change a data into which all entities have value for all possible times(months). Here's what I'm trying;

Class  Value  month
  A      10     1
  A      12     3
  A      9      12
  B      11     1
  B      10     8

From the data above, I want to get the following data;

Class  Value  month
  A      10     1
  A      NA     2
  A      12     3
  A      NA     4
        ....
  A      9      12
  B      11     1
  B      NA     2
        ....
  B      10     8
  B      NA     9
        ....
  B      NA     12

So I want to have all possible cells with through month from 1 to 12; How can I do this? I'm right now trying it with merge function, but appreciate any other ways to approach.

Upvotes: 4

Views: 60

Answers (3)

Randall Helms
Randall Helms

Reputation: 859

To add to @akrun's answer, if you want to replace the NA values with 0, you can do the following:

library(dplyr)
library(tidyr)

df1 %>%
complete(Class, month = min(month):max(month)) %>%
mutate(Value = ifelse(is.na(Value),0,Value))

Upvotes: 1

989
989

Reputation: 12935

In base R using merge (where df is your data):

res <- data.frame(Class=rep(levels(df$Class), each=12), value=NA, month=1:12)
merge(df, res, by = c("Class", "month"), all.y = TRUE)[,c(1,3,2)]

   # Class Value month
# 1      A    10     1
# 2      A    NA     2
# 3      A    12     3
# 4      A    NA     4
# 5      A    NA     5
# 6      A    NA     6
# 7      A    NA     7
# 8      A    NA     8
# 9      A    NA     9
# 10     A    NA    10
# 11     A    NA    11
# 12     A     9    12
# 13     B    11     1
# 14     B    NA     2
# 15     B    NA     3
# 16     B    NA     4
# 17     B    NA     5
# 18     B    NA     6
# 19     B    NA     7
# 20     B    10     8
# 21     B    NA     9
# 22     B    NA    10
# 23     B    NA    11
# 24     B    NA    12

df <- structure(list(Class = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), Value = c(10L, 12L, 9L, 11L, 10L), month = c(1L, 
3L, 12L, 1L, 8L)), .Names = c("Class", "Value", "month"), class = "data.frame", row.names = c(NA, 
-5L))

Upvotes: 3

akrun
akrun

Reputation: 887571

We can use tidyverse

library(tidyverse)
df1 %>%
   complete(Class, month = min(month):max(month)) %>% 
   select_(.dots = names(df1)) %>%  #if we need to be in the same column order
   as.data.frame() #if needed to convert to 'data.frame'

Upvotes: 4

Related Questions