Reputation: 71
I am trying to wired merge. I have a dataset df
and correspondent ID list ID
ID <- data.frame(Alphabet = c("A", "A","A","B", "B", "C"),
Value = c(101,102, 103,201,202,301))
df <- data.frame(Name = c("A", "A","B", "C"))
I want to merge/ assign IDs to df
And get a df looks like
Name ID1 ID2 ID3
A 101 102 103
A 101 102 103
B 201 202
C 301
Upvotes: 0
Views: 88
Reputation: 42544
For the sake of completeness, here is also a solution using dcast()
from the data.table
package to reshape from long to wide format and a right join:
library(data.table)
# coerce to data.table
setDT(D)[
# reshape from long to wide, thereby creating column names
, dcast(.SD, Alphabet ~ rowid(Alphabet, prefix = "ID"))][
# rename column
, setnames(.SD, "Alphabet", "Name")][
# right join with df to repeat rows
setDT(df), on = "Name"]
Name ID1 ID2 ID3 1: A 101 102 103 2: A 101 102 103 3: B 201 202 NA 4: C 301 NA NA
In case NA
must not be shown, the output needs to be converted to type character:
setDT(D)[, dcast(.SD, Alphabet ~ rowid(Alphabet, prefix = "ID"), as.character, fill = "")][
, setnames(.SD, "Alphabet", "Name")][
setDT(df), on = "Name"]
Name ID1 ID2 ID3 1: A 101 102 103 2: A 101 102 103 3: B 201 202 4: C 301
Upvotes: 0
Reputation: 323276
Try this ? Notice the missing value using NA
is better than blank ~
If do want ''
rather than NA
just using outdf[is.na(outdf)]=''
library(dplyr)
ID=ID%>%group_by(Alphabet)%>%mutate(ID=row_number())
library(reshape2)
DF=as.data.frame(acast(ID, Alphabet~ID, value.var="Value"))
DF$Name=row.names(DF)
merge(df,DF,by='Name')
Name 1 2 3
1 A 101 102 103
2 A 101 102 103
3 B 201 202 NA
4 C 301 NA NA
or using tidyr
(Recommend~ Cause you are working with data.frame
)
library(dplyr)
library(tidyr)
ID=ID%>%group_by(Alphabet)%>%mutate(id=row_number())
DF=spread(ID, id,Value)
merge(df,DF,by.x='Name',by.y='Alphabet')
Name 1 2 3
1 A 101 102 103
2 A 101 102 103
3 B 201 202 NA
4 C 301 NA NA
Upvotes: 2
Reputation: 1975
I would address this problem by preparing a list that includes the rows of your final data frame and then 'rbinding' them together. The only trick is to count the max length of your rows and add NAs accordingly. This should work.
ID <- data.frame(Alphabet = c("A", "A","A","B", "B", "C"),
Value = c(101,102, 103,201,202,301))
df <- data.frame(Name = c("A", "A","B", "C"))
tmp <- lapply(df$Name, (function(id){
ID[ID$Alphabet == id, ]$Value
}))
max.el <- max(sapply(tmp, length))
out.df <- do.call(rbind, lapply(tmp, (function(el){
len.na <- max.el - length(el)
c(el, rep(NA, len.na))
})))
print(out.df, na.print = "")
This is the result
[,1] [,2] [,3]
[1,] 101 102 103
[2,] 101 102 103
[3,] 201 202
[4,] 301
If showing NAs is not a problem, then
colnames(out.df) <- paste("ID", c(1:max.el), sep = "")
out.df <- cbind(df, out.df)
out.df
Name ID1 ID2 ID3
1 A 101 102 103
2 A 101 102 103
3 B 201 202 NA
4 C 301 NA NA
Upvotes: 1