jasp singh

Reputation: 75

How to compare a pair of columns with the reversed pair of columns in a data frame in R

I have the data frame below. I want to compare each pair of column values with the reversed pair in other rows, that is, the entries of columns 1:2 in one row against the entries of columns 2:1 in another row. Where the two pairs match, I want their frequency counts to be added together for that pair.

z <- c(3,3,2)
y <- c(1,2,3)
x <- data.frame(y,z)
library(plyr)
fr <- count(x[,1:2])
fr
# The matched pair of 1:2 with 2:1
fr[3,1:2] == fr[2,2:1]

My desired output is a data frame that contains the summed frequency count for each matched pair.

  y z freq
1 1 3    1
2 2 3    2

Upvotes: 1

Views: 631

Answers (1)

akrun

Reputation: 886948

We can do this with base R. We transform the dataset by replacing the 'y' column with the row-wise minimum of 'y' and 'z' (using pmin) and the 'z' column with the row-wise maximum of 'y' and 'z' (using pmax), and create a new column 'freq' with 1 as its value. Then we use xtabs to get the sum of 'freq' by 'y' and 'z' (xtabs sums the left-hand side by default) and convert the result to a data.frame (as.data.frame).

as.data.frame(xtabs(freq ~ ., transform(x, y = pmin(y, z),
                                        z = pmax(y, z), freq = 1)))
#  y z Freq
#1 1 3    1
#2 2 3    2

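One caveat with the xtabs route: as.data.frame() on a contingency table returns a row for every combination of the 'y' and 'z' levels, including combinations that never occur. With the question's data every combination happens to occur, so nothing extra shows up. The sketch below uses made-up data (x2 is an illustrative name, not from the question) to show the zero rows and one way to drop them.

```r
# Illustrative data (not from the question): after sorting each row,
# the pairs are (1,3), (2,3) and (4,5).
x2 <- data.frame(y = c(1, 2, 4), z = c(3, 3, 5))

# Same idea as above: smaller value into 'y', larger into 'z', count of 1 per row.
sorted <- transform(x2, y = pmin(y, z), z = pmax(y, z), freq = 1)

# xtabs builds the full y-by-z table, so unused combinations get Freq 0.
tab <- as.data.frame(xtabs(freq ~ ., sorted))

# Keep only the pairs that actually occur.
nonzero <- subset(tab, Freq > 0)
print(nonzero)
```

If zero-count rows are unwanted, filtering on Freq > 0 as above, or using aggregate instead of xtabs, are two possible ways around it.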
Another option is to loop over the rows with apply using MARGIN=1, sort the elements of each row, and then use aggregate to get the sum grouped by 'y' and 'z'.

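# Sort the values within each row (note: this overwrites x in place)
x[] <- t(apply(x, 1, sort))
# Add a count of 1 per row and sum it by the sorted 'y' and 'z'
aggregate(Freq ~ ., transform(x, Freq = 1), sum)
#  y z Freq
#1 1 3    1
#2 2 3    2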
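For reuse, the pmin/pmax idea can be wrapped in a small function. The following is only a sketch: count_unordered_pairs is a hypothetical helper name, not something from the answer or from any package, and it assumes a two-column data frame with comparable values and no existing column called 'freq'.

```r
# Hypothetical helper: count rows of a two-column data frame,
# treating (a, b) and (b, a) as the same pair.
count_unordered_pairs <- function(df) {
  stopifnot(ncol(df) == 2)
  a <- df[[1]]
  b <- df[[2]]
  df[[1]] <- pmin(a, b)  # smaller value of each row
  df[[2]] <- pmax(a, b)  # larger value of each row
  df$freq <- 1L          # each row contributes a count of 1
  aggregate(freq ~ ., data = df, FUN = sum)
}

# The data from the question
x <- data.frame(y = c(1, 2, 3), z = c(3, 3, 2))
res <- count_unordered_pairs(x)
print(res)
#   y z freq
# 1 1 3    1
# 2 2 3    2
```

Unlike the apply-based version, this does not modify the original x, and because it uses aggregate it only returns pairs that actually occur.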

Upvotes: 2
