Reputation: 1265
I have the following data frame in R
df1 <- data.frame(
"ID" = c("A", "B", "A", "B"),
"Value" = c(1, 2, 5, 5),
"freq" = c(1, 3, 5, 3)
)
I wish to obtain the following data frame
Value freq ID
1 1 A
2 NA A
3 NA A
4 NA A
5 1 A
1 NA B
2 2 B
3 NA B
4 NA B
5 5 B
I have tried the following code
library(tidyverse)
df_new <- bind_cols(df1 %>%
select(Value, freq, ID) %>%
complete(., expand(.,
Value = min(df1$Value):max(df1$Value))),)
I am getting the following output
Value freq ID
<dbl> <dbl> <fct>
1 1 A
2 3 B
3 NA NA
4 NA NA
5 5 A
5 3 B
I request someone to help me.
Upvotes: 2
Views: 165
Reputation: 13135
Using tidyr::full_seq
we can find the full version of Value
but nesting(full_seq(Value,1)
will return an error:
Error:
by
can't contain join columnfull_seq(Value, 1)
which is missing from RHS
so we need to add a name, hence nesting(Value=full_seq(Value,1)
library(tidyr)
df1 %>% complete(ID, nesting(Value=full_seq(Value,1)))
# A tibble: 10 x 3
ID Value freq
<fct> <dbl> <dbl>
1 A 1. 1.
2 A 2. NA
3 A 3. NA
4 A 4. NA
5 A 5. 5.
6 B 1. NA
7 B 2. 3.
8 B 3. NA
9 B 4. NA
10 B 5. 3.
Upvotes: 4
Reputation: 18657
Would the following approach work for you?
with(data = df1,
expr = {
data.frame(Value = rep(wrapr::seqi(min(Value), max(Value)), length(unique(ID))),
ID = unique(ID))
}) %>%
left_join(y = df1,
by = c("ID" = "ID", "Value" = "Value")) %>%
arrange(ID, Value)
Value ID freq
1 1 A 1
2 2 A NA
3 3 A NA
4 4 A NA
5 5 A 5
6 1 B NA
7 2 B 3
8 3 B NA
9 4 B NA
10 5 B 3
ID
group takes values from 1 to 5. If this is the case, my approach would be to generate that reading unique combinations of both from the original data frame.freq
that may / may not be available for a given par ID-Value
. I would join that variable via left_join
(as you seem to like tidyverse
)
freq
variable with values 1,3,5 but then in the example you list 1,2,5? In my example, I took original freq
and left join it. You can modify it further using normal dplyr
pipeline, if this is something you intended to do.Upvotes: 1
Reputation: 33753
Using data.table
:
library(data.table)
setDT(df1)
setkey(df1, ID, Value)
df1[CJ(ID = c("A", "B"), Value = 1:5)]
ID Value freq
1: A 1 1
2: A 2 NA
3: A 3 NA
4: A 4 NA
5: A 5 5
6: B 1 NA
7: B 2 3
8: B 3 NA
9: B 4 NA
10: B 5 3
Upvotes: 2