Reputation: 28149
I have 2 data frames I'm trying to merge. One is a list of id numbers and % completion for each id number. The second is a larger data set I would like to attachd the % completion variable to.
Df1 looks like this:
> rowIdd <- data.frame(id=seq(1,1000000,1))
> rowIdd$adQuery1 <- factor(paste(proj3[1:1000000,"adid"], " | ", proj3[1:1000000,"queryid"], sep=""))
> head(rowIdd)
id adQuery1
1 1 9027213 | 5
2 2 9027213 | 5
3 3 9027213 | 1
4 4 9027213 | 5
5 5 9027213 | 5
6 6 9027213 | 5
> str(rowIdd)
'data.frame': 1000000 obs. of 2 variables:
$ id : num 1 2 3 4 5 6 7 8 9 10 ...
$ adQuery1: Factor w/ 717927 levels "1000467 | 17284",..: 704056 704056 703739 704056 704056 704056 704110 704056 704056 704056 ...
Df2 looks like:
> rowIdd2 <- data.frame(adQuery1 = names(sumclick))
> rowIdd2$pCTR <- sumclick / sumimpress
> str(rowIdd2)
'data.frame': 717927 obs. of 2 variables:
$ adQuery1: Factor w/ 717927 levels "1000467 | 17284",..: 1 2 3 4 5 6 7 8 9 10 ...
$ pCTR : num [1:717927(1d)] 0 0 0 0 0 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 1
.. ..$ : chr "1000467 | 17284" "1000467 | 34711" "1000471 | 173750" "1000479 | 1924662" ...
> head(rowIdd2)
adQuery1 pCTR
1 1000467 | 17284 0
2 1000467 | 34711 0
3 1000471 | 173750 0
4 1000479 | 1924662 0
5 1000479 | 869 0
6 1000515 | 12208696 0
When I try to merge the 2 using this code:
> rowIdd3 <- merge(rowIdd, rowIdd2, by="adQuery1", sort=F,all.x=TRUE)
> nrow(rowIdd3)
[1] 1000000
> head(rowIdd3)
adQuery1 id pCTR
1 9027213 | 5 1 0.04567665
2 9027213 | 5 2 0.04567665
3 9027213 | 5 669222 0.04567665
4 9027213 | 5 4 0.04567665
5 9027213 | 5 5 0.04567665
6 9027213 | 5 6 0.04567665
This is obviously wrong just looking at the 3rd row.
I need the final merged data frame to be in the EXACT same order as the first data frame as it needs to be cbind'd to another data frame.
Any suggestions?
Upvotes: 0
Views: 171