Reputation: 285
I have data.frame objects in a list, and I want to apply setdiff to them conditionally. I wrote a rough sketch of a function for this, but I get an error when taking the set difference of the data.frames. Specifically, I want to return a different data.frame depending on a condition. Can anyone suggest how to solve this efficiently?
Minimal example:
myList <- list(
saved = data.frame(from=c(3,33,54,91), to=c(23,42,71,107), label=c("a1","a4","a7","a11"), SC=c(22,6,13,7)),
droped = data.frame(from=c(25,33,47,74,91), to=c(29,42,51,81,107), label=c("a2","a4","a6","a8","a11"), SC=c(3,6,4,5,7))
)
Based on this input, I want to implement the following function (just a sketch):
library(dplyr)
func <- function(list, type=c("Bio", "Tech")) {
type=match.arg(type)
res <- ifelse(type=="Bio",
res <- list[[1]],
res <- setdiff(list[[1]], list[[2]]))
return(res)
}
I got an error like this:
Error: not compatible: Factor levels not equal for column label
My desired output would be:
If type is "Bio":
from to label SC
1 3 23 a1 22
2 33 42 a4 6
3 54 71 a7 13
4 91 107 a11 7
If type is "Tech":
from to label SC
1 3 23 a1 22
3 54 71 a7 13
Can anyone point out how to fix this problem? How can I get my expected output more efficiently? Thanks a lot.
Upvotes: 0
Views: 200
Reputation: 7435
The issue is that the label column in each of your data frames is a factor rather than a character vector, and the two factors have different levels. To get what you want:
myList <- list(
saved = data.frame(from=c(3,33,54,91), to=c(23,42,71,107), label=c("a1","a4","a7","a11"), SC=c(22,6,13,7), stringsAsFactors=FALSE),
droped = data.frame(from=c(25,33,47,74,91), to=c(29,42,51,81,107), label=c("a2","a4","a6","a8","a11"), SC=c(3,6,4,5,7), stringsAsFactors=FALSE)
)
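library(dplyr)  # setdiff() for data frames comes from dplyr
func <- function(list, type=c("Bio", "Tech")) {
  type <- match.arg(type)
  # if/else returns the whole data frame, unlike the vectorised ifelse()
  if (type == "Bio") list[[1]] else setdiff(list[[1]], list[[2]])
}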
Notes:
Use stringsAsFactors=FALSE when constructing your data frames.
The other issue is in your definition of func. ifelse is vectorised and returns a result shaped like its test, so with a scalar comparison on type it returns only the first column of the data frame. Use if/else instead in func.
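To see the difference between ifelse and if/else in isolation, here is a small base-R sketch. The data frame df_demo is made up for illustration and is not part of the question's data.

```r
# Hypothetical two-column data frame, used only for this demonstration.
df_demo <- data.frame(a = 1:2, b = 3:4)

# ifelse() shapes its result like the test, which has length 1 here,
# so the result is a length-1 list rather than a data frame.
res_ifelse <- ifelse(TRUE, df_demo, df_demo)

# A plain if/else expression returns the chosen object unchanged.
res_if <- if (TRUE) df_demo else NULL

str(res_ifelse)
str(res_if)
```

Comparing the two str() outputs shows why func needs if/else when each branch is a whole data frame.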
With these changes:
func(myList,"Bio")
## from to label SC
##1 3 23 a1 22
##2 33 42 a4 6
##3 54 71 a7 13
##4 91 107 a11 7
func(myList,"Tech")
## from to label SC
##1 3 23 a1 22
##2 54 71 a7 13
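If the data frames already exist with factor columns and cannot easily be rebuilt, converting those columns to character afterwards should have the same effect as stringsAsFactors=FALSE. The following is a minimal base-R sketch; the helper name to_character and the data frame df_fac are assumptions made for illustration. (Since R 4.0.0, data.frame() no longer converts strings to factors by default, so this mainly matters for older versions or data built with stringsAsFactors=TRUE.)

```r
# Hypothetical helper: turn every factor column of a data frame into character.
to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))      # which columns are factors
  df[is_fac] <- lapply(df[is_fac], as.character)   # convert only those columns
  df
}

# Illustrative data frame with a factor column.
df_fac <- data.frame(from = c(3, 33), label = c("a1", "a4"),
                     stringsAsFactors = TRUE)
df_chr <- to_character(df_fac)
str(df_chr)
```

Applied to the question's data, this could be used as myList <- lapply(myList, to_character) before calling func.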
If you do want to keep the label columns as factors, then you need to set the levels of these factors to be the union of the individual factor levels:
## This time with stringsAsFactors=TRUE
myList <- list(
saved = data.frame(from=c(3,33,54,91), to=c(23,42,71,107), label=c("a1","a4","a7","a11"), SC=c(22,6,13,7), stringsAsFactors=TRUE),
droped = data.frame(from=c(25,33,47,74,91), to=c(29,42,51,81,107), label=c("a2","a4","a6","a8","a11"), SC=c(3,6,4,5,7), stringsAsFactors=TRUE)
)
myLevels <- unique(c(levels(myList[[1]]$label),levels(myList[[2]]$label)))
##[1] "a1" "a11" "a4" "a7" "a2" "a6" "a8"
myList[[1]]$label <- factor(myList[[1]]$label,levels=myLevels)
myList[[2]]$label <- factor(myList[[2]]$label,levels=myLevels)
Then the above func will work as before.
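The level-harmonising step above can be wrapped in a small function so it works for any number of data frames in the list. This is only a sketch in base R; the name unify_levels, its arguments, and the list demo are assumptions for illustration, not part of any package.

```r
# Hypothetical helper: give the factor column `col` the same levels
# (the union of all levels) in every data frame of the list `dfs`.
unify_levels <- function(dfs, col) {
  all_levels <- unique(unlist(lapply(dfs, function(d) levels(d[[col]]))))
  lapply(dfs, function(d) {
    d[[col]] <- factor(d[[col]], levels = all_levels)
    d
  })
}

# Illustrative list of two data frames with differing factor levels.
demo <- list(
  saved  = data.frame(label = c("a1", "a4"), stringsAsFactors = TRUE),
  droped = data.frame(label = c("a2", "a4"), stringsAsFactors = TRUE)
)
demo <- unify_levels(demo, "label")
levels(demo$saved$label)
levels(demo$droped$label)
```

With the question's data, this would be called as myList <- unify_levels(myList, "label") before func.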
Upvotes: 1