Kevin E Dow

Reputation: 11

Sum mismatches in column-to-column comparison

I am quite new to R programming, and am having some difficulty with ANOTHER step of my project. I am not even sure at this point if I am asking the question correctly. I have a dataframe of actual and predicted values:

actual  predicted.1 predicted.2 predicted.3 predicted.4
a   a   a   a   a
a   a   a   b   b
b   b   a   b   b
b   a   b   b   c
c   c   c   c   c
c   d   c   c   d
d   d   d   c   d
d   d   d   d   a

The issue that I am having is that I need to create a vector that counts the mismatches between the actual value and each of the four predicted values. For the data above, this should result in a single vector: c(2,1,2,4)

I am trying to use a boolean mask to sum over the TRUE values...but something is not working right. I need to do this sum for each of the four predicted values to actual value comparisons.

discordant_sums(df[,seq(1,ncol(df),2)]!=,df[,seq(2,ncol(df),2)])

Any suggestions would be greatly appreciated.

Upvotes: 1

Views: 44

Answers (2)

akrun

Reputation: 887158

We can replicate the first column once per predicted column, so that both sides of the comparison have the same length, and then count the mismatches in each column with colSums:

as.vector(colSums(df[,1][row(df[-1])] != df[-1]))
#[1] 2 1 2 4
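
To make the replication step easier to follow, the one-liner can be unpacked into intermediate objects. This is only a sketch: the names idx, actual_rep, mismatch, res and res_recycled are hypothetical, and the data frame is rebuilt with data.frame() so the block runs on its own. The last line relies on the assumption that R recycles the shorter vector column by column when it is compared with a data frame, which should give the same counts without the explicit replication.

```r
# Same data as in the "data" section of this answer
df <- data.frame(
  actual      = c("a", "a", "b", "b", "c", "c", "d", "d"),
  predicted.1 = c("a", "a", "b", "a", "c", "d", "d", "d"),
  predicted.2 = c("a", "a", "a", "b", "c", "c", "d", "d"),
  predicted.3 = c("a", "b", "b", "b", "c", "c", "c", "d"),
  predicted.4 = c("a", "b", "b", "c", "c", "d", "d", "a"),
  stringsAsFactors = FALSE
)

idx        <- row(df[-1])          # 8 x 4 matrix; each cell holds its own row number
actual_rep <- df[, 1][idx]         # `actual` repeated once per predicted column (length 32)
mismatch   <- actual_rep != df[-1] # TRUE wherever a prediction differs from `actual`
res        <- as.vector(colSums(mismatch))  # TRUE counts per column, names dropped

# Shorter variant that leans on recycling of the 8-element vector
res_recycled <- unname(colSums(df[-1] != df$actual))
```

Both res and res_recycled should hold one mismatch count per predicted column, in column order.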

data

df <- structure(list(actual = c("a", "a", "b", "b", "c", "c", "d", 
"d"), predicted.1 = c("a", "a", "b", "a", "c", "d", "d", "d"), 
    predicted.2 = c("a", "a", "a", "b", "c", "c", "d", "d"), 
    predicted.3 = c("a", "b", "b", "b", "c", "c", "c", "d"), 
    predicted.4 = c("a", "b", "b", "c", "c", "d", "d", "a")),
  .Names = c("actual", 
"predicted.1", "predicted.2", "predicted.3", "predicted.4"), 
  class = "data.frame", row.names = c(NA, 
-8L))

Upvotes: 1

MKR

Reputation: 20095

You can use apply to compare the values in the first column with the values in each of the other columns, summing the mismatches per column.

apply(df[-1], 2, function(x) sum(df[[1]] != x))

# predicted.1 predicted.2 predicted.3 predicted.4 
# 2           1           2           4 
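
A possible variation, sketched here under the assumption that every column is a plain character vector, is to loop over the columns with vapply instead of apply. apply first converts the data frame to a matrix, which is harmless for this data but can change column types when they are mixed. The name counts is hypothetical.

```r
# Same data as in the question, rebuilt so the block runs on its own
df <- data.frame(
  actual      = c("a", "a", "b", "b", "c", "c", "d", "d"),
  predicted.1 = c("a", "a", "b", "a", "c", "d", "d", "d"),
  predicted.2 = c("a", "a", "a", "b", "c", "c", "d", "d"),
  predicted.3 = c("a", "b", "b", "b", "c", "c", "c", "d"),
  predicted.4 = c("a", "b", "b", "c", "c", "d", "d", "a"),
  stringsAsFactors = FALSE
)

# Visit each predicted column and count the rows where it differs from `actual`.
# integer(1) tells vapply to expect exactly one integer per column.
counts <- vapply(df[-1], function(x) sum(x != df$actual), integer(1))

# Drop the column names if a bare vector is wanted
counts_plain <- unname(counts)
```

The result keeps the column names, so each count can be traced back to its prediction column.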

Data:

df <- read.table(text = 
"actual  predicted.1 predicted.2 predicted.3 predicted.4
a   a   a   a   a
a   a   a   b   b
b   b   a   b   b
b   a   b   b   c
c   c   c   c   c
c   d   c   c   d
d   d   d   c   d
d   d   d   d   a",
header = TRUE, stringsAsFactors = FALSE)

Upvotes: 1
