Reputation: 13
Below is some dummy data.
Suppose I have a data frame:
df = data.frame(source = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10",
"X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X110"),
Destination = c("X3","X5","X17", "X20", "X20","X1", "X2", "X3", "X7", "X10",
"X13","X15","X7", "X1", "X20","X17", "X2", "X3", "X7", "X10"),
weight = seq(1,1.95,by=0.05))
I then have some odds ratios for Destinations X1:X3 and their respective standard deviations, and I want to draw 10 random samples from a normal distribution for each odds ratio and its corresponding standard deviation:
OR_dat <- c(1.55,1.39,1.77)
sds <- c(0.2925175, 0.4775346, 0.1603566)
n <- 10
normv <- function(n, mean, sd) {
  out <- rnorm(n * length(mean), mean = mean, sd = sd)
  return(matrix(out, ncol = n, byrow = FALSE))
}
RR_neighbour_1 <- data.frame(t(normv(n, OR_dat , sds )))
colnames(RR_neighbour_1) <- c("X1", "X2", "X3")
What I am really looking for is to merge the matrix into the data.frame by matching each value in the "Destination" column against the column names of RR_neighbour_1, and then creating additional rows to hold the sampled distribution. The output should then end up like the following:
Upvotes: 1
Views: 1018
Reputation: 455
One possibility: if you're happy to use the dplyr package, it includes SQL-style join functions. You probably want the left_join function from that package, which lets you specify the matching columns using the by parameter. That's an easy way to join two table-like structures.
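A minimal sketch of that join, assuming dplyr is installed. left_join needs the sampled values in long format first (one row per draw and destination), so the sketch reshapes them with base R's stack(), which is my choice here and not part of the suggestion above. The name RR_long and the seed are placeholders for illustration. Because several rows of df share a Destination and each Destination has 10 draws, recent dplyr versions may warn about a many-to-many relationship; that is the intended behaviour in this case.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Dummy data from the question
df <- data.frame(
  source = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10",
             "X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X110"),
  Destination = c("X3", "X5", "X17", "X20", "X20", "X1", "X2", "X3", "X7", "X10",
                  "X13", "X15", "X7", "X1", "X20", "X17", "X2", "X3", "X7", "X10"),
  weight = seq(1, 1.95, by = 0.05),
  stringsAsFactors = FALSE
)

OR_dat <- c(1.55, 1.39, 1.77)
sds <- c(0.2925175, 0.4775346, 0.1603566)
n <- 10

# Sampling helper from the question: one row per odds ratio, one column per draw
normv <- function(n, mean, sd) {
  out <- rnorm(n * length(mean), mean = mean, sd = sd)
  matrix(out, ncol = n, byrow = FALSE)
}

RR_neighbour_1 <- data.frame(t(normv(n, OR_dat, sds)))
colnames(RR_neighbour_1) <- c("X1", "X2", "X3")

# Long format: stack() returns the columns "values" and "ind"
RR_long <- stack(RR_neighbour_1)
names(RR_long) <- c("RR_neighbour_1", "Destination")
RR_long$Destination <- as.character(RR_long$Destination)

# Keep every row of df; rows whose Destination has draws are repeated once per draw
result <- left_join(df, RR_long, by = "Destination")
head(result)
```

Rows of df whose Destination has no sampled values (for example X5 or X20) are kept once, with NA in the RR_neighbour_1 column.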
Upvotes: 0
Reputation: 1664
What you actually want to do is merge two data.frames by Destination. So you first need to get your second data.frame (RR_neighbour_1) into long format (the same format as the first one, where the different destinations are rows, not columns). Then you can simply merge the data.frames with the merge function. The merge itself repeats each row of df once for every sampled value of its destination; setting the all argument to TRUE additionally keeps the rows whose destination has no sampled values, filling them with NA.
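# Reshape the sampled values to long format: one row per draw and destination
RR_neighbour_1 <- reshape(RR_neighbour_1, direction = "long", varying = list(1:3),
                          timevar = "Destination",
                          times = colnames(RR_neighbour_1),
                          v.names = "RR_neighbour_1")
# Drop the id column added by reshape, then merge on the shared Destination column
merge(df, RR_neighbour_1[, -3], by = "Destination", all = TRUE)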
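As a quick sanity check, here is a self-contained sketch that runs the reshape and merge on the dummy data from the question and counts the resulting rows. The names RR_long and merged and the seed are placeholders of mine; the id column is dropped by name rather than by position, which is only a stylistic variation.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Dummy data from the question
df <- data.frame(
  source = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10",
             "X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X110"),
  Destination = c("X3", "X5", "X17", "X20", "X20", "X1", "X2", "X3", "X7", "X10",
                  "X13", "X15", "X7", "X1", "X20", "X17", "X2", "X3", "X7", "X10"),
  weight = seq(1, 1.95, by = 0.05),
  stringsAsFactors = FALSE
)

OR_dat <- c(1.55, 1.39, 1.77)
sds <- c(0.2925175, 0.4775346, 0.1603566)
n <- 10

# Sampling helper from the question
normv <- function(n, mean, sd) {
  out <- rnorm(n * length(mean), mean = mean, sd = sd)
  matrix(out, ncol = n, byrow = FALSE)
}

RR_neighbour_1 <- data.frame(t(normv(n, OR_dat, sds)))
colnames(RR_neighbour_1) <- c("X1", "X2", "X3")

# Wide to long: columns become Destination, RR_neighbour_1 and id
RR_long <- reshape(RR_neighbour_1, direction = "long", varying = list(1:3),
                   timevar = "Destination",
                   times = colnames(RR_neighbour_1),
                   v.names = "RR_neighbour_1")

# Merge on Destination, keeping unmatched rows of df as well
merged <- merge(df, RR_long[, c("Destination", "RR_neighbour_1")],
                by = "Destination", all = TRUE)

nrow(merged)                # 7 matching rows * 10 draws + 13 unmatched rows = 83
table(merged$Destination)   # X1, X2 and X3 are expanded; the others keep their count
```

In df, X1 and X2 each appear twice as a Destination and X3 three times, so those 7 rows expand to 70, while the remaining 13 rows stay as they are with NA in RR_neighbour_1.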
Upvotes: 1