Reputation: 1610
Take some big text file (say, this one) and read it in:
loc <- "[file location]"  # placeholder for the path to the file
txt <- read.delim(loc, header = FALSE, stringsAsFactors = FALSE)
If we paste it all together like this, we get a completely sensible output (I've only shown a bit of it):
> paste0(txt[,],collapse = "")
[1] " GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies"
But if, rather than using txt[,], we just use txt, we get output that's full of backslashes (again, I've truncated).
> paste0(txt,collapse = "")
[1] "c(\" GNU GENERAL PUBLIC LICENSE\", \" Version 2, June 1991\", \" Copyright (C) 1989, 1991 Free Software Foundation, Inc.,\", \" 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA\", \" Everyone is permitted to copy and distribute verbatim copies\"
This implies that there's a difference between txt and txt[,]. But what is it?
Upvotes: 2
Views: 32
Reputation: 16910
This is because txt here is a dataframe of one variable, which is also a list of length one, whereas txt[,] is just a vector (but only because txt has a single variable, and that variable is a vector). When you paste() a list, each element is converted to a single string with as.character(), so an element holding several values comes out as the text of the R expression that would build it, such as c("a", "b", "c").
Here is a smaller example to demonstrate:
dat <- data.frame(x = letters[1:3])
paste0(dat[,], collapse = "")
# [1] "abc"
paste0(dat, collapse = "")
# [1] "c(\"a\", \"b\", \"c\")"
Those backslashes are just escaping the internal quotation marks:
cat(paste0(dat, collapse = ""))
# c("a", "b", "c")
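To check the claim that paste0() is simply stringifying each list element, here is a minimal sketch that compares it with as.character() directly. The variable names are only illustrative, and stringsAsFactors = FALSE is spelled out on the assumption that you might be running an R version older than 4.0, where the default differs.

```r
# One-column data frame, same shape as the example above
dat <- data.frame(x = letters[1:3], stringsAsFactors = FALSE)

# A data frame is a list with one element per column
is_list <- is.list(dat)   # TRUE
n_elems <- length(dat)    # 1

# Converting the list to character turns the whole column into one string
as_chr <- as.character(dat)
pasted <- paste0(dat, collapse = "")

# Both routes should produce the same single string
same <- identical(as_chr, pasted)
print(same)
```

If `same` is TRUE, the backslashed output is nothing specific to paste0(); it is what list-to-character conversion produces.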
Now consider what happens if the dataframe had a second variable:
dat <- data.frame(x = letters[1:3], y = LETTERS[1:3])
paste0(dat[,], collapse = "")
# [1] "c(\"a\", \"b\", \"c\")c(\"A\", \"B\", \"C\")"
paste0(dat, collapse = "")
# [1] "c(\"a\", \"b\", \"c\")c(\"A\", \"B\", \"C\")"
Now we can see what is going on.
When a dataframe has only one variable, dat[,] returns a vector, because a single-column result is simplified to a vector by default. If it has more than one variable, dat[,] still returns a dataframe, which is also a list:
dat <- data.frame(x = letters[1:3])
str(dat[,])
# chr [1:3] "a" "b" "c"
dat <- data.frame(x = letters[1:3], y = LETTERS[1:3])
str(dat[,])
# 'data.frame': 3 obs. of 2 variables:
# $ x: chr "a" "b" "c"
# $ y: chr "A" "B" "C"
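Because the result of dat[,] depends on the number of columns, it can be safer to say explicitly which one you want. The following is a sketch, not the only way to do it: drop = FALSE keeps the dataframe, while [[ ]] always extracts a single column as a vector. The names kept and col_vec are made up for illustration.

```r
dat <- data.frame(x = letters[1:3], stringsAsFactors = FALSE)

# Keep the data frame even though there is only one column
kept <- dat[, , drop = FALSE]
str(kept)

# Always get the column itself, regardless of how many columns exist
col_vec <- dat[["x"]]
str(col_vec)

# Pasting the extracted vector gives the plain concatenation
joined <- paste0(col_vec, collapse = "")
print(joined)
```

For the question's txt, the equivalent would be something like paste0(txt[[1]], collapse = ""), assuming the text really is in the first column.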
Another example, showing that this is general paste() behavior for lists:
l <- list(1:3)
l
# [[1]]
# [1] 1 2 3
paste(l)
# [1] "1:3"
paste(l[[1]])
# [1] "1" "2" "3"
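If you do want to paste the contents of a list or a multi-column dataframe, one option is to flatten it first or to pass its elements as separate arguments. This is a rough sketch of both approaches; which one you want depends on whether you need one long string or one string per row.

```r
l <- list(1:3)

# Flatten the list to a vector, then collapse into one string
flat <- paste0(unlist(l), collapse = "")
print(flat)

dat <- data.frame(x = letters[1:3], y = LETTERS[1:3],
                  stringsAsFactors = FALSE)

# Flatten all columns (column by column) into one string
all_in_one <- paste0(unlist(dat), collapse = "")
print(all_in_one)

# Pass each column as its own argument to paste row-wise
by_row <- do.call(paste0, dat)
print(by_row)
```

unlist() walks the columns in order, so the first approach concatenates all of x before any of y, whereas do.call() pairs up the values in each row.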
Upvotes: 1