Reputation: 13
I am using R to try to write the results of a for loop to a column I have created in my data frame. My additional column is hetCSF in my data frame Genotypes and I am trying to assign it a default value of 0. The for loop should compare 2 columns from my data frame at each of 1036 rows, return a value of 1 if the two data entries are the same and 0 if they are not. Could someone please look at this code and tell me what I am doing wrong?
Heterozygosity <- function (Genotypes$CSF1PO, Genotypes$CSF1PO.1){
Genotypes$hetCSF <- 0 #gives Genotypes$hetCSF a default value of 0 which corresponds to heterozygous
for (i in 1:nrow(Genotypes)){ #loops the following across all rows
Genotypes$hetCSF <- as.numeric(identical(Genotypes[i, "CSF1PO"], Genotypes[i, "CSF1PO.1"])) #decides if the 2 columns have the same value and are therefore homozygous, returns 1 for homozygous in new column homCSF.G
}
}
Currently when I run this it tells me I have an unexpected '$' in "Heterozygosity <- function Genotypes$" and an unexpected '}' in "}". Thanks so much for your help. I am very new to R so I apologize if this is a very elementary question.
Upvotes: 1
Views: 140
Reputation: 59345
Another way to do the same thing:
Genotypes$hetCSF <- with(Genotypes, as.integer(CSF1PO == CSF1PO.1))
or, a little longer:
Genotypes$hetCSF <- as.integer(Genotypes$CSF1PO == Genotypes$CSF1PO.1)
FWIW: Since you're new to R, the reason this works is that R is vectorized: most operators and functions work on whole vectors at once, so you almost never have to loop through rows. Consequently, (x == y) returns a vector of TRUE or FALSE values depending on whether the corresponding elements of x and y are equal. The function as.integer(...) (in this case) takes that vector and returns a new vector with TRUE converted to 1 and FALSE converted to 0.
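To make the vectorization point concrete, here is a minimal sketch on made-up data. The toy values and the helper name flag_equal are assumptions for illustration, not part of the original question. It also shows one way the loop could be made to work. Function parameters have to be plain names, so function (Genotypes$CSF1PO, ...) cannot be parsed, which is where the unexpected '$' comes from. Inside the loop, the result also has to be stored per row rather than overwriting the whole column on each pass.

```r
# Hypothetical toy data standing in for the real Genotypes data frame
Genotypes <- data.frame(
  CSF1PO   = c(10, 11, 12, 10),
  CSF1PO.1 = c(10, 12, 12, 13)
)

# Vectorized version: == compares the two columns element by element
Genotypes$hetCSF <- as.integer(Genotypes$CSF1PO == Genotypes$CSF1PO.1)

# Loop version for comparison (flag_equal is a hypothetical helper name).
# Parameters are plain names; the data frame and column names are passed in.
flag_equal <- function(df, col_a, col_b) {
  out <- integer(nrow(df))              # starts as all 0
  for (i in seq_len(nrow(df))) {
    # store the result for row i only, not for the whole vector
    out[i] <- as.integer(identical(df[i, col_a], df[i, col_b]))
  }
  out                                   # return the vector of 0/1 flags
}

loop_result <- flag_equal(Genotypes, "CSF1PO", "CSF1PO.1")
```

On this toy data both versions give the same flags. One difference to keep in mind: if a value is missing, == returns NA for that row, whereas identical() always returns TRUE or FALSE.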
Upvotes: 1
Reputation: 81683
An easier approach is
Genotypes <- transform(Genotypes, hetCSF = as.integer(CSF1PO == CSF1PO.1))
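A quick sketch of how this behaves, again on made-up values (the toy data frame is an assumption for illustration). transform() returns a modified copy rather than changing the data frame in place, so the result has to be assigned back to Genotypes.

```r
# Hypothetical toy data standing in for the real Genotypes data frame
Genotypes <- data.frame(
  CSF1PO   = c(10, 11, 12),
  CSF1PO.1 = c(10, 12, 12)
)

# Column names can be used directly inside transform();
# the new column hetCSF is appended to the returned copy
Genotypes <- transform(Genotypes, hetCSF = as.integer(CSF1PO == CSF1PO.1))

print(Genotypes)
```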
Upvotes: 3