M.anna
M.anna

Reputation: 3

How to multiply columns from two different data.table by a matching condition?

I have two data.tables about the same population with unique IDs, one with no missing values, while the other one has several missing values. I would like to multiply ratio column of DT1 by number column of DT2.

**DT1**              
*ID__ratio*            
1__ 0.3              
2__ 0.2          
3__ 0.4          
4__ 0.1           
5__ 0.7           
6__ 0.3           
7__ 0.5           
8__ 0.9           
9__ 0.1           
10_ 0.4           

**DT2**  
*ID__ number*  
1__  NA   
2__ NA   
3__ 488  
4__ NA  
5__ NA  
6__ 600  
7__ 789   
8__ 503    
9__ NA  
10__NA    

I have tried by match and ifelse:

result <- DT1$ratio[match(DT1$ID, DT2$ID)] * DT2$number
result <- ifelse(DT1$ID==DT2$ID, DT1$ratio * DT2$number, NA)

It runs, but I got the following warnings:

longer object length is not a multiple of shorter object length

for both "match" and the "ifelse" solution.

Both had wrong results: there are results for IDs where are missing data for number in DT2, and there are NA where are data available for number in DT2.

If I try to merge DT1 and DT2 by the following code, it works:

merged <- merge(DT1, DT2, key = "ID")

Upvotes: 0

Views: 499

Answers (1)

Khaynes
Khaynes

Reputation: 1986

If you just want to return the multiplied vector, like so:

require(data.table)

DT1 <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ratio = c(0.3, 
0.2, 0.4, 0.1, 0.7, 0.3, 0.5, 0.9, 0.1, 0.4)), class = "data.frame", row.names = c(NA, 
-10L))

DT2 <- structure(list(ID = 1:10, number = c(NA, NA, 488L, NA, NA, 600L, 
789L, 503L, NA, NA)), class = "data.frame", row.names = c(NA, 
-10L))

setDT(DT1)
setDT(DT2)

DT1$ratio[match(DT2$ID, DT1$ID)] * DT2$number

Note the order of the match.

Upvotes: 2

Related Questions