TestingInProd

Reputation: 379

Loop through the data frame applying some function on each value in R

I am new to R. I have a data frame (usr.query) with the structure shown below.

[Image: This is my data frame]

Now I want to take the text of each id and compare it to the text of all the other ids, and if there is a match, I want to append it to a new column, say a count of matches.

A0008 with A0043,A0065,A0082,B0018,B0026
A0043 with A0008,A0065,A0082,B0018,B0026

Function to apply

count_match = length(intersect(unlist(strsplit(query1," ")),unlist(strsplit(query2," "))))

Here query1 is the text of A0008 and query2 is the text of A0043, A0065, A0082, B0018, B0026.
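To make the expression concrete, here is a minimal sketch that wraps it in a function. The function name count_match and the two sample strings are made up for illustration; they are not taken from the actual data.

```r
# Hypothetical helper wrapping the expression above:
# counts the distinct words that two strings have in common
count_match <- function(query1, query2) {
  words1 <- unlist(strsplit(query1, " "))
  words2 <- unlist(strsplit(query2, " "))
  length(intersect(words1, words2))
}

# made-up example strings; the only shared word is "apple"
count_match("red apple pie", "green apple tart")
```

Because intersect() drops duplicates, a word that is repeated in either string is still counted once.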

I tried the suggested solution, and here is the result: [image]

Upvotes: 0

Views: 55

Answers (1)

alistaire

Reputation: 43334

No loops are necessary, which is usually the case in R because it is good at vectorized operations. Here you can get the necessary pairs of ids with combn, then build the match_count column by indexing the original data.frame with each column of the pairs and testing the two results for equality. Adding zero converts the values from logical to numeric (use as.integer, if you prefer). Note that indexing by id works in this example only because each id equals its row number.

# assemble sample data
df <- data.frame(id = 1:5, text = c('apple', 'mango', 'apple', 'apple', 'mango'))

# make combinations
df2 <- as.data.frame(t(combn(df$id, 2)))
# add names
names(df2) <- c('main_id', 'compared_to_id')
# test for match
df2$match_count <- (df[df2$main_id, 'text'] == df[df2$compared_to_id, 'text']) + 0

The result:

> df2
   main_id compared_to_id match_count
1        1              2           0
2        1              3           1
3        1              4           1
4        1              5           0
5        2              3           0
6        2              4           0
7        2              5           1
8        3              4           1
9        3              5           0
10       4              5           0
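If the ids are character codes and the goal is a count of shared words rather than an exact match, the same combn approach could be adapted roughly as follows. This is only a sketch: the column names id and text, the sample rows, and the helper count_match are assumptions, not the asker's real data.

```r
# Made-up sample data with character ids (assumed columns: id, text)
usr.query <- data.frame(
  id   = c("A0008", "A0043", "A0065"),
  text = c("red apple pie", "green apple tart", "red cherry pie"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of distinct words two strings share
count_match <- function(query1, query2) {
  length(intersect(unlist(strsplit(query1, " ")),
                   unlist(strsplit(query2, " "))))
}

# all unordered pairs of ids, one pair per row
pairs <- as.data.frame(t(combn(usr.query$id, 2)), stringsAsFactors = FALSE)
names(pairs) <- c("main_id", "compared_to_id")

# look rows up by id with match(), since ids are not row numbers here
text1 <- usr.query$text[match(pairs$main_id, usr.query$id)]
text2 <- usr.query$text[match(pairs$compared_to_id, usr.query$id)]

# apply the helper to each pair of texts
pairs$match_count <- mapply(count_match, text1, text2, USE.NAMES = FALSE)
pairs
```

combn lists each pair once; since the word overlap is symmetric, the count for A0043 with A0008 is the same as for A0008 with A0043.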

Upvotes: 2
