Reputation: 285
I have a huge data frame (call it huge) that I would like to split in two by row number. However, I notice that the way I do it turns the resulting subsets into large factors instead of data frames.
list1 <- huge[c(1:8175),]
list2 <- huge[c(8176:nrow(huge)), ]
class(list1)
[1] "factor"
Can someone explain why this happens and how I can prevent it?
Upvotes: 1
Views: 129
Reputation: 39154
You are most likely subsetting a one-column data frame. Consider the following example.
# Create an example data frame
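# stringsAsFactors = TRUE is needed on R >= 4.0 to get the factor column shown below
dt <- data.frame(a = 1:5, b = letters[1:5], stringsAsFactors = TRUE)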
dt
#   a b
# 1 1 a
# 2 2 b
# 3 3 c
# 4 4 d
# 5 5 e
str(dt)
# 'data.frame': 5 obs. of 2 variables:
# $ a: int 1 2 3 4 5
# $ b: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
# Subset the data frame
list1 <- dt[1:2, ]
list2 <- dt[3:nrow(dt), ]
class(list1)
# [1] "data.frame"
Subsetting dt works as expected. However, when I create a one-column data frame from dt and subset it, the output is automatically simplified to a vector (here, a factor).
# Create a one-column data frame
dt2 <- dt[, 2, drop = FALSE]
# Subset the data frame
list3 <- dt2[1:2, ]
list4 <- dt2[3:nrow(dt2), ]
class(list3)
# [1] "factor"
list3
# [1] a b
# Levels: a b c d e
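The same simplification happens whenever the result of the subset has a single column, not only when the original data frame has one column. A minimal sketch, assuming a fresh example object named df (a name chosen here for illustration only):

```r
# Example data; stringsAsFactors = TRUE gives a factor column on R >= 4.0 as well
df <- data.frame(a = 1:5, b = letters[1:5], stringsAsFactors = TRUE)

# Selecting rows plus a single column drops to a vector by default
one_col <- df[1:2, "b"]
class(one_col)
# [1] "factor"

# Selecting rows plus two columns keeps the data frame
two_col <- df[1:2, c("a", "b")]
class(two_col)
# [1] "data.frame"

# drop = FALSE keeps the data frame even when only one column remains
kept <- df[1:2, "b", drop = FALSE]
class(kept)
# [1] "data.frame"
```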
The solution is to add drop = FALSE when subsetting, which keeps the output as a data frame.
# Subset the data frame
list5 <- dt2[1:2, , drop = FALSE]
class(list5)
# [1] "data.frame"
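If you split data frames like this often, the drop = FALSE pattern can be wrapped in a small function. The sketch below is only one way to do it; split_at_row is a hypothetical helper name, not a base R function, and it assumes the split point n lies strictly inside the data frame.

```r
# Hypothetical helper: split a data frame into two parts after row n
split_at_row <- function(df, n) {
  stopifnot(is.data.frame(df), n >= 1, n < nrow(df))
  list(
    first  = df[seq_len(n), , drop = FALSE],         # rows 1..n
    second = df[(n + 1):nrow(df), , drop = FALSE]    # rows n+1..end
  )
}

# One-column data frame, the case that otherwise collapses to a factor
one_col_df <- data.frame(b = letters[1:5], stringsAsFactors = TRUE)
parts <- split_at_row(one_col_df, 2)

class(parts$first)
# [1] "data.frame"
nrow(parts$second)
# [1] 3
```

For the original question, the call would look like split_at_row(huge, 8175). Base R's head(huge, 8175) and tail(huge, -8175) should be an alternative, since both return data frames when given a data frame.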
Upvotes: 2