NWdev
NWdev

Reputation: 493

Add columns from one R data table to another

Another R question. Have looked through the data.table vignettes and seen solutions like these:

But unfortunately while they're close, I'm somehow missing something in my understanding.

My initial data tables include one that includes results and another with standards. Several columns are common between the two tables. Here's a sample (more columns exist for both tables, but they are not common between the two).

Results
ID    Region   Locale    Medium    Name          Method
3324   Agate    Zone C    water     Cadmium        Z
2432   Gneiss   Zone B    air       Calcium        R
2433   Agate    Zone A    water     Molybdenum     Q
78882  Agate    Zone D    water     Iron           M

Standards
ID   Region   Locale   Medium     Name    CoeffA    CoeffB
3214  Agate   Zone A    water     Cadmium  -.243    1.43
3324  Agate   Zone C    water     Cadmium  -.243    1.43
2432  Gneiss  Zone B    water     Calcium  .432     0.44
78882 Agate   Zone D    water     Iron     1.475    0

There are many more results than standards and some results have no standards.

What I'd like to do is add the mathematical coefficient values of the standards table to the results table as new columns (C-a and C-b). Ultimately I'll use these to calculate comparative standard values.

Results
ID    Region   Locale    Medium    Name          Method      C-a         C-b
3324   Agate    Zone C    water     Cadmium      Z           -.243      1.43
2432   Gneiss   Zone B    air       Calcium      R           .432       0.44
2433   Agate    Zone A    water     Molybdenum   Q           NA         NA
78882  Agate    Zone D    water     Iron         M           1.475       0

I've tried the following without success:

I would have thought that one of these syntaxes would definitely have yielded coefficient columns with values other than NA.

Any suggestions on where I should look or what I'm missing?

Thanks in advance for your kind assistance.

Upvotes: 3

Views: 13763

Answers (2)

Ashvin Meena
Ashvin Meena

Reputation: 309

Try this, you can perform it without setkey as follows

require(data.table)
newResults <- merge(x = Results, y = Standards, by = "ID", all.x = TRUE)
setnames(newResults,"CoeffA","C-a")
setnames(newResults,"CoeffB","C-b")

newResults
ID     Region   Locale    Medium    Name         Method      C-a        C-b
2432   Gneiss   Zone B    air       Calcium      R           .432       0.44
2433   Agate    Zone A    water     Molybdenum   Q           NA         NA
3324   Agate    Zone C    water     Cadmium      Z           -.243      1.43
78882  Agate    Zone D    water     Iron         M           1.475      0

If you don't want NAs:

newResults[is.na(newResults)] <- 0   #replace NA with Zero
newResults[is.na(newResults)] <- "No value available" #replace NA with Text 

Upvotes: 4

Chris
Chris

Reputation: 6372

First off, setkey can't be used for multiple variables, you need to use setkeyv instead.

setkeyv(Results,c("ID","Region","Locale","Medium"))
setkeyv(Standards,c("ID","Region","Locale","Medium"))

Then:

JoinedDT <- merge(Results,Standards, all.x = TRUE)

This will give an NA in any results row that does not have a Standards Row. If there are multiple Standards rows for one Results row, you will get two rows in your resulting data table.

To set NA to 0:

JoinedDT[is.na(JoinedDT$CoeffA),CoeffA:= 0]
JoinedDT[is.na(JoinedDT$CoeffB),CoeffB:= 0]

Upvotes: 1

Related Questions