Eric Green

Reputation: 7735

converting dataframe columns from matrix to vector

I'm trying to combine a few dataframes. In the process of doing so, I noticed that one dataframe contains matrices rather than vectors. Here's a basic example:

df3 <- structure(list(v5 = structure(c(NA, 0), .Dim = c(2L, 1L), .Dimnames = list(
    c("206", "207"), "ecbi1")), v6 = structure(c(NA, 0), .Dim = c(2L, 
1L), .Dimnames = list(c("206", "207"), "ecbi2"))), .Names = c("v5", 
"v6"), row.names = 206:207, class = "data.frame")

# get class
class(df3[,1])
# [1] "matrix"

I want the columns in df3 to be vectors, not matrices.

Upvotes: 1

Views: 173

Answers (4)

AlxRd

Reputation: 285

Use this.

df3[,1] = as.vector(df3[,1])

The same procedure can be applied generally to the rest of the columns.
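As a rough sketch of what "the rest of the columns" could look like, the loop below applies as.vector to each column in turn. It uses df3[[j]] rather than df3[, j] (my own substitution, not part of the line above) and rebuilds df3 from the question so that it runs on its own.

```r
# Rebuild the example data.frame from the question.
df3 <- structure(list(
  v5 = structure(c(NA, 0), .Dim = c(2L, 1L),
                 .Dimnames = list(c("206", "207"), "ecbi1")),
  v6 = structure(c(NA, 0), .Dim = c(2L, 1L),
                 .Dimnames = list(c("206", "207"), "ecbi2"))),
  .Names = c("v5", "v6"), row.names = 206:207, class = "data.frame")

# Strip the matrix dimensions from every column, one at a time.
for (j in seq_along(df3)) {
  df3[[j]] <- as.vector(df3[[j]])
}

# Check that no column is a matrix any more.
sapply(df3, is.matrix)
```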

Upvotes: 1

akrun

Reputation: 887971

We can use do.call

do.call(data.frame, df3)

Upvotes: 0

MichaelChirico

Reputation: 34763

I think most important is to figure out how you managed to get matrix-type columns in the first place and understand whether this was a desired behavior or a side effect of a mistake somewhere earlier.
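One common way this happens (only a guess at the cause here, since the question doesn't show how df3 was built) is assigning the result of a function that returns a one-column matrix, such as scale(), straight into a column. A minimal sketch with a made-up data.frame d:

```r
# Hypothetical data, used only for illustration.
d <- data.frame(x = c(1, 2, 3))

# scale() returns a one-column matrix, so this creates a matrix-type column.
d$z <- scale(d$x)
is.matrix(d$z)

# Wrapping the result in as.vector() at the point of creation avoids that.
d$z2 <- as.vector(scale(d$x))
is.matrix(d$z2)
```

If something like this is the source, fixing it where the column is created is probably cleaner than converting afterwards.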

Given where you are, you can just use c to undo a given column:

df3$v5 <- c(df3$v5)

Or if this is a problem with all columns:

df3[ ] <- lapply(df3, c)

(lapply returns a list of vectors, and when we assign a list to a data.frame, each list element is treated as a column; df3[ ] selects all columns of df3. Plain assignment, df3 <- lapply(df3, c), would also strip the matrices but leaves df3 as a plain list rather than a data.frame, and using [ ] is more robust: if we make a mistake and somehow return the wrong number of columns, R will raise an error or warning, whereas plain assignment to df3 would silently overwrite our data.frame.)
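To make the difference between the two assignments concrete, here is a small sketch. The names as_list and kept are mine, introduced only for illustration, and df3 is rebuilt from the question so that the snippet is self-contained.

```r
# Rebuild the example data.frame from the question.
df3 <- structure(list(
  v5 = structure(c(NA, 0), .Dim = c(2L, 1L),
                 .Dimnames = list(c("206", "207"), "ecbi1")),
  v6 = structure(c(NA, 0), .Dim = c(2L, 1L),
                 .Dimnames = list(c("206", "207"), "ecbi2"))),
  .Names = c("v5", "v6"), row.names = 206:207, class = "data.frame")

# Plain assignment: the result of lapply() is an ordinary list.
as_list <- lapply(df3, c)
class(as_list)  # "list"

# Assignment through [ ]: the data.frame structure is kept.
kept <- df3
kept[] <- lapply(df3, c)
class(kept)     # "data.frame"
sapply(kept, is.matrix)
```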

Lastly, if only some columns are matrix-type, we can replace only those columns like so:

mat.cols <- sapply(df3, is.matrix)
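# list-style indexing, so this still works when only one column is a matrix
# (df3[ , mat.cols] would drop a single column out of the data.frame)
df3[mat.cols] <- lapply(df3[mat.cols], c)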

As pertains to the relation between this approach and that using as.vector, from ?c:

c is sometimes used for its side effect of removing attributes except names, for example to turn an array into a vector. as.vector is a more intuitive way to do this, but also drops names.

So given that the names don't mean much in this context, c is simply a more concise approach, but the end result is practically identical.
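A quick sketch of the difference described in ?c, using a made-up named vector x and a matrix m shaped like the v5 column from the question:

```r
# Hypothetical named vector, for illustration only.
x <- c(a = 1, b = 2)
names(c(x))          # names are kept by c()
names(as.vector(x))  # NULL: as.vector() drops them

# A one-column matrix like df3$v5 has dim and dimnames but no names,
# so both functions return the same plain vector.
m <- matrix(c(NA, 0), ncol = 1,
            dimnames = list(c("206", "207"), "ecbi1"))
identical(c(m), as.vector(m))
```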

Upvotes: 3

bramtayl

Reputation: 4024

Just apply as.vector

df3[] = lapply(df3, as.vector)

Upvotes: 3
