Reputation: 1339
I have a large dataset which is composed of measurements from two species. I need the results as numbers in a matrix for some analysis, but somehow through trivial manipulation of the dataframe the numbers get converted to strings and cannot be directly converted back into a matrix. To be clear, I repeat the main steps below:
df<-data.frame(
X1=c(1,2,3),
X2=c(4,5,6),
X3=c(7,8,9)
)
med1<-c("name1", "name2", "name3")
rownames(df)<-med1
df2<-as.data.frame(cbind(t(df), species=c("inv1", "inv2", "inv3")))
The new dataset df2 cannot be converted back into the same df as a matrix, even after the added column is removed, and I cannot do arithmetic on the restored data. Both objects are data frames:
class(df)
# [1] "data.frame"
class(df2)
# [1] "data.frame"
as.matrix(df)+1 #this operation works fine
as.matrix(t(df2)[1:3,])+1 #this doesn't work, as the numbers seem to be stored as strings
as.matrix(df)==as.matrix(t(df2)[1:3,]) #but the comparison operator says they're identical?
Please, what is happening, and how can I recover df from df2 as a full numeric matrix?
Upvotes: 0
Views: 40
Reputation: 160447
Use cbind on a frame, not on a matrix. Note that
class(df)
# [1] "data.frame"
class(t(df))
# [1] "matrix" "array"
so cbind called on t(df) dispatches to the default matrix method rather than to cbind.data.frame. This keeps it all numeric, which is fine, until you try to combine it with the species vector of strings. A matrix can hold only one type, so all the numbers are then up-converted to strings.
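The coercion described above can be checked step by step. This is a minimal sketch reusing the toy data from the question; the intermediate names m and m2 are only for illustration, and it assumes R >= 4.0, where as.data.frame no longer turns strings into factors.

```r
# Same toy data as in the question.
df <- data.frame(X1 = c(1, 2, 3), X2 = c(4, 5, 6), X3 = c(7, 8, 9))
rownames(df) <- c("name1", "name2", "name3")

m <- t(df)       # t() on a data frame returns a matrix
typeof(m)        # "double": still numeric at this point

# A matrix holds a single type, so binding a character column
# coerces every cell to character.
m2 <- cbind(m, species = c("inv1", "inv2", "inv3"))
typeof(m2)       # "character"

# as.data.frame() cannot undo that: every column is now character.
df2 <- as.data.frame(m2)
sapply(df2, class)

# `==` coerces the numeric side to character before comparing,
# which is why the comparison in the question reports TRUE.
1 == "1"         # TRUE
```

This also explains why the == comparison in the question looks fine while +1 fails: comparison coerces silently, arithmetic on character values does not.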
Solutions:
Use cbind on a frame, not a matrix, such as:
df2 <- cbind(data.frame(t(df)), species=c("inv1", "inv2", "inv3"))
class(df2)
# [1] "data.frame"
str(df2)
# 'data.frame': 3 obs. of 4 variables:
# $ name1 : num 1 4 7
# $ name2 : num 2 5 8
# $ name3 : num 3 6 9
# $ species: chr "inv1" "inv2" "inv3"
Avoid cbind, just transform it (if using base R):
df2 <- transform(data.frame(t(df)), species=c("inv1", "inv2", "inv3"))
In either case, you can recover the original df with:
data.frame(t(df2[,1:3]))
#       X1 X2 X3
# name1  1  4  7
# name2  2  5  8
# name3  3  6  9
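If you are stuck with the all-character df2 from the question and want a numeric matrix rather than a frame, one option is to transpose back and change the storage mode. This is a sketch rather than the only way to do it; the names df2_chr and m are only for illustration, and it assumes every cell in the first three columns parses as a number.

```r
# df2 as built in the question: every column ends up character.
df <- data.frame(X1 = c(1, 2, 3), X2 = c(4, 5, 6), X3 = c(7, 8, 9))
rownames(df) <- c("name1", "name2", "name3")
df2_chr <- as.data.frame(cbind(t(df), species = c("inv1", "inv2", "inv3")))

# Drop the species column and transpose back to the original orientation.
m <- t(df2_chr[, 1:3])

# Convert in place; unlike as.numeric(), this keeps dim and dimnames.
storage.mode(m) <- "double"

m + 1                          # arithmetic works again
all.equal(m, as.matrix(df))    # compare against the original
```

With a df2 built by either solution above, the columns are already numeric, so t(as.matrix(df2[, 1:3])) gives a numeric matrix without the storage.mode step.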
Upvotes: 1