Reputation: 137
Sorry for the title, hope it is not too misleading. I have the following dataframe df1:
id1 clas1 clas2 clas3
512 ns abx NA
512 ns or NA
512 abx dm sup
845 or NA NA
1265 dd ivf NA
1265 ns ivf pts
9453 col ns ivf
9453 abx ns or
95635 ns abx or
Then I have "df2", a column pair taken from another dataset whose length differs from the first one. Some of the values in df1$id1 also appear in df2$id2 and vice versa:
id2 clas0
102 ns
512 ns
915 ns
1265 ns
9453 ns
10485 ns
95639 ns
100348 ns
What I am trying to do is to count how many "id1" share a common value (e.g. "ns") with id2 in any of the clas columns.
So I have tried this:
x<-as.numeric(levels(factor(df2$id2)))
clas<-ls()
for(i in 1:x){
for(j in 1:length(df1$id1)){
if(df1$id1==i){clas[[i]]=append(clas[[i]],c(df1$clas1[j],df1$clas2[j],df1$clas3[j]))}
}
}
What I am trying to do here is to create a list holding all the clas1, clas2 and clas3 values for each id1 (which can be repeated), so that I can later check whether the value in clas0 appears somewhere in that list. However, I keep getting the following warning:
In if (id1$id1 == i) { ... :
the condition has length > 1 and only the first element will be used
I am stuck. Could someone point me in the right direction? Many thanks, Marco
Upvotes: 0
Views: 40
Reputation: 132676
What I am trying to do is to count how many "id1" share a common value (e.g. "ns") with id2 in any of the clas columns.
df1 <- read.table(text="id1 clas1 clas2 clas3
512 ns abx NA
512 ns or NA
512 abx dm sup
845 or NA NA
1265 dd ivf NA
1265 ns ivf pts
9453 col ns ivf
9453 abx ns or
95635 ns abx or", header=TRUE)
df2 <- read.table(text=" id2 clas0
102 ns
512 ns
915 ns
1265 ns
9453 ns
10485 ns
95639 ns
100348 ns", header=TRUE)
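# Keep only the rows of df1 whose id1 also occurs as an id2 in df2,
# and attach the matching clas0 value to each of them.
df <- merge(df1, df2, by.x="id1", by.y="id2")
# For each merged row, test whether clas0 equals any of clas1..clas3
# (NAs are ignored), then count the rows where that is the case.
sum(apply(df$clas0 == df[, c("clas1", "clas2", "clas3")], 1, any, na.rm = TRUE))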
#[1] 5
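Note that the 5 above is a count of matching rows, and an id that is repeated in df1 can contribute more than one row. If what is wanted is the number of distinct ids with at least one match, a minimal sketch could look like the following. The names row_hit, n_rows and n_ids are made up for illustration, and stringsAsFactors = FALSE is only there so the columns are plain character vectors regardless of R version.

```r
# Same data as above, rebuilt so that this block runs on its own.
df1 <- read.table(text = "id1 clas1 clas2 clas3
512 ns abx NA
512 ns or NA
512 abx dm sup
845 or NA NA
1265 dd ivf NA
1265 ns ivf pts
9453 col ns ivf
9453 abx ns or
95635 ns abx or", header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = "id2 clas0
102 ns
512 ns
915 ns
1265 ns
9453 ns
10485 ns
95639 ns
100348 ns", header = TRUE, stringsAsFactors = FALSE)

df <- merge(df1, df2, by.x = "id1", by.y = "id2")

# TRUE for every merged row in which clas0 equals clas1, clas2 or clas3.
# Comparisons against NA are dropped by na.rm = TRUE.
row_hit <- rowSums(df[, c("clas1", "clas2", "clas3")] == df$clas0,
                   na.rm = TRUE) > 0

n_rows <- sum(row_hit)                     # matching rows (same as above)
n_ids  <- length(unique(df$id1[row_hit]))  # distinct ids with a match

n_rows
n_ids
```

As for the warning in the question: `if` expects a single TRUE or FALSE, but `df1$id1 == i` compares the whole column at once and returns one value per row. Likewise, `1:x` only uses the first element of x when x is a vector. Working on whole columns after a merge, as above, avoids both loops.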
Upvotes: 1