Reputation: 375
I have three tables. One which contains the results. The other two contain notes about unique ids that appear in the results. I will try to explain the format a little bit.
RESULTS
UniqueID1 UniqueID2 [Columns containing the results from some experiment.]
NOTES1
UniqueID [Notes about this particular uniqued ID]
NOTES2
same as above.
So, I want every row in the results to contain the notes from the NOTES table. I now accomplish this iteratively; this process now takes longer than the experiment itself. I have considered using subsetting or lapply perhaps, but can't seem to find something fitting.
Some things to consider.
Regarding results: The two UniqueID columns in results are disjoint. They are not sets; they contain repetition. However, each pair of unique IDs occurs exactly one time in the results. table.
Regarding NOTES: The unique id column does not contain repetition.
The details I have given above are part of why I haven't just lazily subsetted the NOTES table and called cbind on results and the subset of NOTES.
I believe I have devoted a decent bit of thought to this; please don't downvote just because the answer is trivial to you. I haven't included code, because I want to know how accomplish this for the general case in R without iteration. If you think that it will help, I could include some code.
Sample Tables:
Results
"UniqueID_1" "UniqueID_2" "P_val" "Corr_Coef"
"M200000001" "M200000113" "0.530536974203384" "-0.115081242888051"
"M200000001" "M200000113" "0.530536974203384" "-0.115081242888051"
"M200000001" "M200000114" "0.252990706016934" "0.399292132865147"
"M200000001" "M200000115" "0.368094246871692" "0.16456647390621"
"M200000001" "M200000116" "0.78457345841227" "-0.0502948318806314"
"M200000001" "M200000117" "0.874022710686625" "-0.0291832590390869"
"M200000001" "M200000121" "0.199903265024637" "0.232729988792225"
"M200000001" "M200000122" "0.892630363051574" "-0.024845579436259"
"M200000001" "M200000126" "0.27862153489629" "-0.208127935743872"
NOTES
M200000002 22352 Vim
M200000004 20810 Srm
M200000005 17878 Myf6
M200000006 27061 Bcap31
M200000007 54204 1-Sep
M200000008 71995 Erv3
M200000009 15384 Hnrpab
M200000010 14453 Gas2
M200000011 12054 Bcl7b
M200000012 26572 Cops3
M200000014 17896 Myl4
M200000016 52469 Ccdc56
M200000017 19720 Trim27
M200000019 11434 Acr
Upvotes: 1
Views: 515
Reputation: 6290
merge(results, notes, by="UniqueID_1", all.x=TRUE)
or, for the case of two notes:
merge(merge(results, notes1, by="UniqueID_1", all.x=TRUE),
notes1, by="UniqueID_2", all.x=TRUE)
You haven't told us the names of your NAMES columns, so if the ID column in each is actually UniqueID
then you'd need to do something like this:
merge(merge(results, notes1, by.x="UniqueID_1", by.y="UniqueID", all.x=TRUE),
notes1, by.x="UniqueID_2", by.y="UniqueID", all.x=TRUE)
The notes from notes1
and notes2
will be in different columns.
Upvotes: 3
Reputation: 37784
It still isn't clear to me why subset
and cbind
aren't working for you; it would seem something like this would work fine.
cbind(Results, Notes1[Results$UniqueID_1,], Notes2[Results$UniqueID_2,])
This would require the rownames of the Notes tables to be the ID; perhaps that's the issue?
rownames(Notes1) <- Notes1$UniqueID
rownames(Notes2) <- Notes2$UniqueID
match
is also useful in cases like this; it's what merge
is using behind the scenes.
Upvotes: 0