PitterJe

Reputation: 216

Check whether any term from a list exists in each row

I have a dataframe like this:

df <- data.frame(id = c(1, 2, 3), stock_1 = c("Google", "Microsoft", "Yahoo"), stock_2 = c("Yahoo", "Gg", "NA"))

I know that the following terms belong to the same group:

mylist <- c("Google", "Gg")

Given this list, how can I check every row for the terms in mylist and record 1 if any of them is present and 0 otherwise? If a term appears more than once in the same row, the result should still be 1.

Expected output:

df <- data.frame(id = c(1, 2, 3), stock_1 = c("Google", "Microsoft", "Yahoo"), stock_2 = c("Yahoo", "Gg", "NA"), mylist = c(1, 1, 0))

Upvotes: 0

Views: 32

Answers (2)

Sotos

Reputation: 51592

Here is a vectorized approach: use do.call to paste the columns of each row into a single string, then use grepl to detect any of the words in mylist:

as.integer(grepl(paste(mylist, collapse = '|'), do.call(paste, df[-1])))
#[1] 1 1 0
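One caveat: grepl matches substrings, so a term such as "Gg" would also match inside a longer name that happens to contain it. If whole-word matches are wanted, a minimal sketch is to wrap the alternation in word boundaries. The names pattern and flag_words are only illustrative, and the sketch assumes the terms contain no regex metacharacters (those would need escaping first).

```r
# Rebuild the example data from the question
df <- data.frame(id = c(1, 2, 3),
                 stock_1 = c("Google", "Microsoft", "Yahoo"),
                 stock_2 = c("Yahoo", "Gg", "NA"))
mylist <- c("Google", "Gg")

# \\b anchors each alternative at word boundaries,
# so a term only matches as a whole word
pattern <- paste0("\\b(", paste(mylist, collapse = "|"), ")\\b")

# Paste the non-id columns row-wise, then test each pasted row
flag_words <- as.integer(grepl(pattern, do.call(paste, df[-1])))
flag_words
#[1] 1 1 0
```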

Upvotes: 0

bouncyball

Reputation: 10771

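We can use the apply function to iterate over the rows of df. For each row, x %in% mylist returns a logical vector, and max collapses it to 1 if any element is TRUE and 0 otherwise: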

apply(df, 1, function(x) max(x %in% mylist))
# [1] 1 1 0

We can store the result of this function in a new column:

df$mylist <- apply(df, 1, function(x) max(x %in% mylist))

#   id   stock_1 stock_2 mylist
# 1  1    Google   Yahoo      1
# 2  2 Microsoft      Gg      1
# 3  3     Yahoo      NA      0
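Note that apply first coerces the whole data frame to a character matrix, so the id column is compared against mylist as well. A possible alternative, sketched here under the assumption that every column except the first holds stock names, is to test each stock column with %in% and combine the results with a logical OR:

```r
# Rebuild the example data from the question
df <- data.frame(id = c(1, 2, 3),
                 stock_1 = c("Google", "Microsoft", "Yahoo"),
                 stock_2 = c("Yahoo", "Gg", "NA"))
mylist <- c("Google", "Gg")

# For each stock column, get a logical vector of exact matches against mylist
hits <- lapply(df[-1], `%in%`, mylist)

# OR the per-column vectors together element-wise, then convert TRUE/FALSE to 1/0
df$mylist <- as.integer(Reduce(`|`, hits))
df$mylist
# [1] 1 1 0
```

Because %in% compares whole values, this variant does exact matching rather than substring matching, and it skips the row-by-row matrix conversion.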

Upvotes: 2
