Dinesh

Reputation: 663

Extracting specific rows from the data frame in R

I have data like the following in a tab-delimited text file named original:

Name     Symbol       Value
abcd       A            56   
de45       C            67
ji98       H            90
k9ug       K            43
phzt       L            98
prex       P            21
kadf       T            32

I also have a list of selected symbols stored in another tab-delimited text file named duplicate:

Symbol     Description
 K            Intel
 P            Diary
 C            Cape
 S            Sheath
 A            Aim

I want to extract the rows from original whose Symbol also appears in duplicate. I want my output to look like the following:

Name     Symbol       Value
abcd       A            56   
de45       C            67
k9ug       K            43
prex       P            21

I tried the following code, but somehow I either get no results or only the row for A. Here is the code I used:

result <- original[original$Symbol %in% duplicate$Symbol,]

Could anyone please help me?

Upvotes: 0

Views: 4922

Answers (1)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193497

This can be done with a simple merge:

merge(original, duplicate, by.x="Symbol", by.y="symbol")
#   Symbol Name Value Description
# 1      A abcd    56         Aim
# 2      C de45    67        Cape
# 3      K k9ug    43       Intel
# 4      P prex    21       Diary

You can manually drop the Description column before or after merging if it is not relevant.
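
As a rough sketch of both options, here is the sample data from the question rebuilt by hand, assuming the lower-case symbol column used in the merge call above. The names res_before and res_after are made up for illustration.

```r
# Sample data from the question; the lower-case "symbol" column name in
# duplicate is an assumption of this answer, not something confirmed.
original <- data.frame(
  Name   = c("abcd", "de45", "ji98", "k9ug", "phzt", "prex", "kadf"),
  Symbol = c("A", "C", "H", "K", "L", "P", "T"),
  Value  = c(56, 67, 90, 43, 98, 21, 32),
  stringsAsFactors = FALSE
)
duplicate <- data.frame(
  symbol      = c("K", "P", "C", "S", "A"),
  Description = c("Intel", "Diary", "Cape", "Sheath", "Aim"),
  stringsAsFactors = FALSE
)

# Drop before merging: pass only the key column of duplicate,
# so Description never enters the result.
res_before <- merge(original, duplicate["symbol"],
                    by.x = "Symbol", by.y = "symbol")

# Drop after merging: merge everything, then remove Description.
res_after <- merge(original, duplicate, by.x = "Symbol", by.y = "symbol")
res_after$Description <- NULL
```

Both results should contain only the Symbol, Name and Value columns for the four matching rows.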

Also, I don't know whether the problem is in the question as posted or in your code, but your own approach works for me once the column name matches:

original[original$Symbol %in% duplicate$symbol, ]
#   Name Symbol Value
# 1 abcd      A    56
# 2 de45      C    67
# 4 k9ug      K    43
# 6 prex      P    21

Of course, you have to spell original correctly, which you did not!

Assumptions

  1. The correct capitalization of the word "symbol" in names(original) shows up with an upper-case S (Symbol).
  2. The correct capitalization of the word "symbol" in names(duplicate) shows up with a lower-case s (symbol).
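
To see why the capitalization matters, here is a small sketch under the same assumptions, with the sample data rebuilt by hand. Asking a data frame for a column that does not exist returns NULL without an error, and %in% against NULL is FALSE for every row, which would explain getting no results at all. The names wrong and right are made up for illustration.

```r
# Sample data; the lower-case "symbol" in duplicate is assumption 2 above.
original <- data.frame(
  Name   = c("abcd", "de45", "ji98", "k9ug", "phzt", "prex", "kadf"),
  Symbol = c("A", "C", "H", "K", "L", "P", "T"),
  Value  = c(56, 67, 90, 43, 98, 21, 32),
  stringsAsFactors = FALSE
)
duplicate <- data.frame(
  symbol      = c("K", "P", "C", "S", "A"),
  Description = c("Intel", "Diary", "Cape", "Sheath", "Aim"),
  stringsAsFactors = FALSE
)

# Check the exact column names first; they are case-sensitive.
names(original)
names(duplicate)

# A column name with the wrong case silently evaluates to NULL ...
is.null(duplicate$Symbol)   # TRUE under the assumption above

# ... so the filter matches nothing and returns a zero-row data frame.
wrong <- original[original$Symbol %in% duplicate$Symbol, ]
nrow(wrong)                 # 0

# With the matching name, the four expected rows come back.
right <- original[original$Symbol %in% duplicate$symbol, ]
nrow(right)                 # 4
```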

If both are capitalized, then you can use either of the following solutions:

merge(original, duplicate)
original[original$Symbol %in% duplicate$Symbol, ]
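
For completeness, here is a sketch of the whole round trip from tab-delimited files, assuming both files use a capitalized Symbol header. The temporary file paths stand in for your real files. The strip.white = TRUE argument is a precaution I am adding in case the fields carry stray spaces, as the duplicate listing in the question seems to. Whether your actual files do is a guess on my part.

```r
# Write two small tab-delimited files to temporary paths (placeholders
# for the real files). The leading spaces in the second file mimic the
# listing in the question.
orig_path <- tempfile(fileext = ".txt")
dup_path  <- tempfile(fileext = ".txt")
writeLines(c("Name\tSymbol\tValue",
             "abcd\tA\t56", "de45\tC\t67", "ji98\tH\t90", "k9ug\tK\t43",
             "phzt\tL\t98", "prex\tP\t21", "kadf\tT\t32"), orig_path)
writeLines(c("Symbol\tDescription",
             " K\tIntel", " P\tDiary", " C\tCape", " S\tSheath", " A\tAim"),
           dup_path)

# read.delim() reads tab-separated text; strip.white = TRUE trims
# whitespace around each field so " K" is read as "K".
original  <- read.delim(orig_path, strip.white = TRUE,
                        stringsAsFactors = FALSE)
duplicate <- read.delim(dup_path, strip.white = TRUE,
                        stringsAsFactors = FALSE)

# With a shared column name, merge() picks the key automatically.
merged <- merge(original, duplicate)

# The subsetting approach keeps only the columns of original.
result <- original[original$Symbol %in% duplicate$Symbol, ]
```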

Upvotes: 8
