Reputation: 625
I have two questions.
for (k in 1:iterations) {
corr <- cor(df2_prod[,k], df2_qa[,k])
ifelse(is.numeric(corr), next,
ifelse((all(df2_prod[,k] == df2_qa[,k])) ), (corr <- 1), (corr <- 0))
correlation[k,] <- rbind(names(df2_prod[k]), corr)
}
This is my requirement: I want to calculate the correlation between variables in a loop using the code corr <- cor(df2_prod[,k], df2_qa[,k])
If I receive a numeric correlation value, I have to keep the value as it is.
Sometimes, when two columns have the same values, I receive "NA" as the output for "corr".
x y
1 1
1 1
1 1
1 1
1 1
corr
[,1]
[1,] NA
I am trying to handle this so that if "NA" is received, I replace the value with "1" or "0".
My questions are:
When I check the class of "corr", I get "matrix". I want to check whether it is a number or not. Is there any way to do this other than checking is.numeric(corr)?
> class(corr)
[1] "matrix"
I want to check whether two columns have the same values, with something like the code below. If it returns TRUE, I want to proceed. But the way I have put the code in the loop is wrong. Could you please help me with how this can be improved:
((all(df2_prod[,k] == df2_qa[,k]))
Is there any effective way to do this?
I sincerely apologize to the readers for the poorly framed question and logic. If you can give me pointers to improve the code, I would be really thankful.
Upvotes: 0
Views: 458
Reputation: 24074
An example to explain how the cor function works:
set.seed(123)
df1 <- data.frame(v1=1:10, v2=rnorm(10), v3=rnorm(10), v4=rnorm(10))
df2 <- data.frame(w1=rnorm(10), w2=1:10, w3=rnorm(10))
Here, the first variable of df1 is equal to the second variable of df2. The function cor applied directly to the first 3 variables of each data.frame gives:
cor(df1[, 1:3], df2[, 1:3])
# w1 w2 w3
#v1 -0.4603659 1.0000000 0.1078796
#v2 0.6730196 -0.2602059 -0.3486367
#v3 0.2713188 -0.3749826 -0.2520174
As you can notice, the correlation coefficient between w2 and v1 is 1, not NA.
So, in your case, cor(df2_prod[, 1:k], df2_qa[, 1:k]) should provide you the desired output.
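If only the correlation of column k with column k is needed, those values sit on the diagonal of that matrix. A minimal sketch, assuming both data frames have the same number of numeric columns in the same order (df_prod and df_qa below are made-up stand-ins, not the asker's data):

```r
# Hypothetical stand-ins for df2_prod and df2_qa
df_prod <- data.frame(a = c(1, 2, 3, 4, 5),
                      b = c(2, 4, 6, 8, 10),
                      c = c(1, 1, 1, 1, 1))
df_qa   <- data.frame(a = c(1, 2, 3, 4, 5),
                      b = c(10, 8, 6, 4, 2),
                      c = c(1, 1, 1, 1, 1))

# Full cross-correlation matrix: entry [i, j] is cor(df_prod[, i], df_qa[, j]).
# suppressWarnings() hides the "standard deviation is zero" warning
# that the constant column c triggers.
m <- suppressWarnings(cor(df_prod, df_qa))

# The column-by-column correlations are on the diagonal
pairwise <- diag(m)
names(pairwise) <- names(df_prod)
pairwise
```

Here a and b come out as 1 and -1, while the constant column c still gives NA, because a correlation is undefined when a column has no variation. That case needs separate handling.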
Upvotes: 1
Reputation: 17689
1. You basically want to avoid NAs, right? So you could check the result with is.na().
a <- rep(1, 5)
b <- rep(1, 5)
if(is.na(cor(a, b))) cor.value <- 1
2. You could count how many times an element of a is equal to the corresponding element of b with sum(a==b), and check whether this count equals the number of elements in a (or b), i.e. length(a):
if(sum(a==b) == length(a)) cor.value <- 1
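The two checks can be combined into one small helper and applied column by column. This is only a sketch: the name safe_cor, the 1/0 fallback rule (taken from the question), and the example data frames are assumptions for illustration, not part of base R.

```r
# Hypothetical helper: returns cor(x, y), falling back to 1 or 0 when it is NA
safe_cor <- function(x, y) {
  # cor() warns and returns NA when either vector has zero variance
  r <- suppressWarnings(cor(x, y))
  if (is.na(r)) {
    # isTRUE() guards against an NA result from the comparison itself
    r <- if (isTRUE(all(x == y))) 1 else 0
  }
  r
}

a <- rep(1, 5)
b <- rep(1, 5)
safe_cor(a, b)           # identical constant vectors: fallback gives 1
safe_cor(a, rep(2, 5))   # constant but different vectors: fallback gives 0

# Applied column by column to two made-up data frames
df_prod <- data.frame(x = c(1, 1, 1), y = c(1, 2, 3))
df_qa   <- data.frame(x = c(1, 1, 1), y = c(3, 2, 1))
result <- data.frame(variable = names(df_prod),
                     corr = mapply(safe_cor, df_prod, df_qa))
result
```

Using mapply() over the two data frames replaces the explicit loop and avoids growing a result object row by row. all(x == y) is used instead of the sum() comparison because it states the intent directly.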
Upvotes: 1