Reputation: 521
I want to kind of transform a dataframe from long into wide format what one little extra task. I want to also merge it into another dataframe.
So this is my original df and I can transform it into a wide format.
V1 V2 V3
1 A IIA 1
2 A S1MU 5
3 A AMU 3
4 C IIU 6
5 C IMA 7
6 A S2FU 8
7 B S1MU 9
8 A IIU 13
9 B S1II 7
10 C S1II 9
11 D IMA 4
12 A S1II 3
reshape(testframe, idvar = "V1", timevar = "V2", direction = "wide")
V1 V3.IIA V3.S1MU V3.AMU V3.IIU V3.IMA V3.S2FU V3.S1II
1 A 1 5 3 13 NA 8 3
4 C NA NA NA 6 7 NA 9
7 B NA 9 NA NA NA NA 7
11 D NA NA NA NA 4 NA NA
And now I want to add that transformed dataframe into this dataframe:
Code IIA S1MU AMU IIU IMA S2FU S1II IFA BIA
1 A NA NA NA NA NA NA NA NA NA
2 B NA NA NA NA NA NA NA NA NA
3 C NA NA NA NA NA NA NA NA NA
4 D NA NA NA NA NA NA NA NA NA
It has more columns than needed for this example, in reality I have much more and the last 2 columns will also be filled. So I have to match the right code and the right column name.
Any ideas how I could do that?
testframe:
structure(list(V1 = c("A", "A", "A", "C", "C", "A", "B", "A",
"B", "C", "D", "A"), V2 = c("IIA", "S1MU", "AMU", "IIU", "IMA",
"S2FU", "S1MU", "IIU", "S1II", "S1II", "IMA", "S1II"), V3 = c(1,
5, 3, 6, 7, 8, 9, 13, 7, 9, 4, 3)), row.names = c(NA, -12L), class = "data.frame")
newframe:
structure(list(Code = c("A", "B", "C", "D"), IIA = c(NA, NA,
NA, NA), S1MU = c(NA, NA, NA, NA), AMU = c(NA, NA, NA, NA), IIU = c(NA,
NA, NA, NA), IMA = c(NA, NA, NA, NA), S2FU = c(NA, NA, NA, NA
), S1II = c(NA, NA, NA, NA), IFA = c(NA, NA, NA, NA), BIA = c(NA,
NA, NA, NA)), row.names = c(NA, -4L), class = "data.frame")
Upvotes: 1
Views: 55
Reputation: 887691
We can use dcast
from either reshape2
or data.table
. If we need the other missing columns from 'newframe', just assign those column names not found in the output but only in 'newframe' to NA
and then change the column order to as in 'newframe'
library(data.table)
out <- dcast(setDT(df1), V1 ~ V2, value.var = 'V3')
setnames(out, 'V1', 'Code')
out[, setdiff(names(newframe), names(out)) := NA_real_]
setcolorder(out, names(newframe))
-output
> out
Code IIA S1MU AMU IIU IMA S2FU S1II IFA BIA
1: A 1 5 3 13 NA 8 3 NA NA
2: B NA 9 NA NA NA NA 7 NA NA
3: C NA NA NA 6 7 NA 9 NA NA
4: D NA NA NA NA 4 NA NA NA NA
Or if we want to use base R
based on indexing, create a row names column and use rownames/column names for subsetting
out <- xtabs(V3 ~ V1 + V2, df1)
out[out == 0] <- NA
tmp <- as.matrix(`row.names<-`(newframe[-1], newframe$Code))
tmp[row.names(out), colnames(out)] <- out
-output
> tmp
IIA S1MU AMU IIU IMA S2FU S1II IFA BIA
A 1 5 3 13 NA 8 3 NA NA
B NA 9 NA NA NA NA 7 NA NA
C NA NA NA 6 7 NA 9 NA NA
D NA NA NA NA 4 NA NA NA NA
Upvotes: 2
Reputation: 79318
You could use tidyverse
:
library(tidyverse)
testframe %>%
rename(Code = V1) %>%
pivot_wider(Code, V2, values_from = V3) %>%
coalesce(newframe)
# A tibble: 4 x 10
Code IIA S1MU AMU IIU IMA S2FU S1II IFA BIA
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl>
1 A 1 5 3 13 NA 8 3 NA NA
2 C NA NA NA 6 7 NA 9 NA NA
3 B NA 9 NA NA NA NA 7 NA NA
4 D NA NA NA NA 4 NA NA NA NA
Upvotes: 3