Arun

Reputation: 625

How to check whether a variable is numeric for a vector in R?

I have two questions.

  for (k in 1:iterations) {
      corr <- cor(df2_prod[,k], df2_qa[,k])

      ifelse(is.numeric(corr), next,
              ifelse((all(df2_prod[,k] == df2_qa[,k])) ), (corr <- 1), (corr <- 0))

      correlation[k,] <- rbind(names(df2_prod[k]), corr)

    }

My requirement is this: I want to calculate the correlation for each pair of variables in a loop, using corr <- cor(df2_prod[,k], df2_qa[,k]). If I receive a numeric correlation value, I want to keep it as it is.

Sometimes, when the two columns have the same values, I receive NA as the output for corr:

x   y
1   1
1   1
1   1
1   1
1   1

corr
     [,1]
[1,]   NA

I am trying to handle this so that, if NA is returned, I replace the value with 1 or 0.

My questions are:

  1. When I check the class of corr, I get "matrix". I want to check whether it is a number or not. Is there any way to do this other than is.numeric(corr)?

    > class(corr)
    [1] "matrix"
    
  2. I want to check whether two columns have the same values, with something like the code below. If it returns TRUE, I want to proceed. But the way I have put this into the loop is wrong. Could you please help me improve it: all(df2_prod[,k] == df2_qa[,k])

Is there any effective way to do this?

I sincerely apologize to readers for the poorly framed question and logic. If you can give me pointers that would improve the code, I would be really thankful.

Upvotes: 0

Views: 458

Answers (2)

Cath

Reputation: 24074

An example to explain how the cor function works:

set.seed(123)
df1 <- data.frame(v1=1:10, v2=rnorm(10), v3=rnorm(10), v4=rnorm(10))
df2 <- data.frame(w1=rnorm(10), w2=1:10, w3=rnorm(10))

Here, the first variable of df1 is equal to the second variable of df2. Applying cor directly to the first three variables of each data.frame gives:

cor(df1[, 1:3], df2[, 1:3])
#           w1         w2         w3
#v1 -0.4603659  1.0000000  0.1078796
#v2  0.6730196 -0.2602059 -0.3486367
#v3  0.2713188 -0.3749826 -0.2520174

As you can notice, the correlation coefficient between w2 and v1 is 1, not NA.

So, in your case, cor(df2_prod[, 1:k], df2_qa[, 1:k]) should provide you the desired output.
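
For the x/y example in the question, it may help to see where the NA comes from. The sketch below uses hypothetical vectors x and y that mirror that example; the names r_const and r_mat are illustrative only. It suggests that the NA appears because both columns are constant (their standard deviation is zero), and that is.numeric() cannot separate this case, since a missing correlation is still of numeric type.

```r
# Hypothetical vectors mirroring the constant x/y columns from the question
x <- rep(1, 5)
y <- rep(1, 5)

# The standard deviation of a constant vector is 0, so the correlation
# is undefined; cor() warns and returns NA (the warning is silenced here)
r_const <- suppressWarnings(cor(x, y))
is.na(r_const)       # TRUE
is.numeric(r_const)  # TRUE as well: the NA returned by cor() is numeric

# With one-column matrices as input, cor() returns a 1x1 matrix
# instead of a single number, which would explain a "matrix" class
r_mat <- suppressWarnings(cor(cbind(x), cbind(y)))
is.matrix(r_mat)     # TRUE
```

Under these assumptions, is.na() is the check that distinguishes a usable coefficient from a missing one, whether the result is a single number or a 1x1 matrix.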

Upvotes: 1

Tonio Liebrand

Reputation: 17689

1. You basically want to avoid NAs, right? So you could check the result with is.na().

a <- rep(1, 5)
b <- rep(1, 5)
if(is.na(cor(a, b))) cor.value <- 1

2. You could count how many elements of a are equal to the corresponding elements of b with sum(a==b), and check whether this count equals the number of elements in a (or b), that is, length(a):

if(sum(a==b) == length(a)) cor.value <- 1
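
The two checks above could be combined into one helper and applied column by column. This is only a minimal sketch: the data frames below are made-up stand-ins for df2_prod and df2_qa, safe_cor is a hypothetical helper name, and the fallback rule (1 if the columns are element-wise equal, 0 otherwise) is taken from the question rather than from any statistical convention. It also assumes the columns contain no missing values.

```r
# Made-up stand-ins for df2_prod and df2_qa from the question
df2_prod <- data.frame(a = c(1, 2, 3, 4, 5), b = rep(1, 5), c = rep(2, 5))
df2_qa   <- data.frame(a = c(2, 4, 6, 8, 11), b = rep(1, 5), c = rep(3, 5))

# Hypothetical helper: keep the correlation when it is defined,
# otherwise fall back to 1 (columns equal) or 0 (columns differ)
safe_cor <- function(x, y) {
  r <- suppressWarnings(cor(x, y))  # NA when a column has zero variance
  if (!is.na(r)) return(r)
  # isTRUE() guards against an NA result from the comparison
  if (isTRUE(all(x == y))) 1 else 0
}

# One row per column pair, replacing the rbind() inside the loop
correlation <- data.frame(
  variable = names(df2_prod),
  corr = vapply(
    seq_along(df2_prod),
    function(k) safe_cor(df2_prod[[k]], df2_qa[[k]]),
    numeric(1)
  )
)
```

Using if/else on a single value avoids the nested ifelse() calls, which are meant for vectors and cannot contain next or assignments in a useful way.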

Upvotes: 1
