screechOwl
screechOwl

Reputation: 28169

R concatenating two factors

This is making me feel dumb, but I am trying to produce a single vector/df/list/etc (anything but a matrix) concatenating two factors. Here's the scenario. I have a 100k line dataset. I used the top half to predict the bottom half and vice versa using knn. So now I have 2 objects created by knn predict().

> head(pred11)
[1] 0 0 0 0 0 0
Levels: 0 1
> head(pred12)
[1] 0 1 1 0 0 0
Levels: 0 1
> class(pred11)
[1] "factor"
> class(pred12)
[1] "factor"

Here's where my problem starts:

> pred13 <- rbind(pred11, pred12)
> class(pred13)
[1] "matrix"

There are 2 problems. First it changes the 0's and 1's to 1's and 2's and second it seems to create a huge matrix that's eats all my memory. I've tried messing with as.numeric(), data.frame(), etc, but can't get it to just combine the 2 50k factors into 1 100k one. Any suggestions?

Upvotes: 13

Views: 18889

Answers (2)

Tommy
Tommy

Reputation: 40861

@James presented one way, I'll chip in with another (shorter):

set.seed(42)
x1 <- factor(sample(0:1,10,replace=T))
x2 <- factor(sample(0:1,10,replace=T))

unlist(list(x1,x2))
# [1] 1 1 0 1 1 1 1 0 1 1 0 1 1 0 0 1 1 0 0 1
#Levels: 0 1

...This might seem a bit like magic, but unlist has special support for factors for this particular purpose! All elements in the list must be factors for this to work.

Upvotes: 29

James
James

Reputation: 66844

rbind will create 2 x 50000 matrix in your case which isn't what you want. c is the correct function to combine 2 vectors in a single longer vector. When you use rbind or c on a factor, it will use the underlying integers that map to the levels. In general you need to combine as a character before refactoring:

x1 <- factor(sample(0:1,10,replace=T))
x2 <- factor(sample(0:1,10,replace=T))

factor(c(as.character(x1),as.character(x2)))
 [1] 1 1 1 0 1 1 0 1 0 0 0 1 1 1 1 1 1 0 0 0
Levels: 0 1

Upvotes: 18

Related Questions