Brant Mullinix
Brant Mullinix

Reputation: 137

How to use a look up table to create a new row in a data frame?

I have two data frames. One contains the data I am trying to clean/modify(df_x) and one is a lookup table(df_y). There is a column df_x$TEXT that contains a string like "Some text - with more" and the lookup table df_y looks like this:

SORT            ABB
--------------  ----
Some Text       ST

I want to see if a value in df_y$SORT is in the df_x$TEXT for every row of df_x. If there is a match then take the df_y$ABB value at that matched row and add it to a new column in df_x like df_x$TEXT_ABB.

For the information above the algorithm would see that "Some Text" is in "Some text - with more" (ignoring case) so it would add the value "ST" to the column df_x$TEXT_ABB.

I know I can use match and or a combination of sapply and grep to search if it exists but I can not figure how to do this AND grab the abbreviation I would like to map it back to a new column in the original dataframe.

Upvotes: 1

Views: 73

Answers (1)

jogo
jogo

Reputation: 12569

you can try this:

df_x <- data.frame(TEXT=c("Some Text 001", "other text", "Some Text 002"))
df_y <- read.table(header=TRUE, text=
'SORT            ABB
"Some Text"       ST
"Other Text"      OT')

L <- sapply(df_y$SORT, grep, x=df_x$TEXT, ignore.case=TRUE)
df_x$abb <- NA
for (l in 1:length(L)) if (length(L[[l]])!=0) df_x$abb[L[[l]]] <- as.character(df_y$ABB[l])

Upvotes: 1

Related Questions