Why does combine (c()) behave differently from the readLines() function?

Question

I am learning R, and so far I have had no trouble keeping up apart from the following problem, which I hope someone out there can help me understand.

If I create a character vector with test1 <- c("a", "b", "c"), I get one vector of type character, and I can access each member of the vector through an indexer, test1[n].

That makes sense and does what I understand it should do.

However, if I do test2 <- readLines("file1.txt"), where file1.txt contains one line (several random words, space separated), I get one vector of class character (same as the first case), and I can't use an indexer (unless there's a way and I don't know about it yet).

Questions:

Why are both of character type but stored differently?
How could one tell them apart without knowing how they were created?
Besides using strsplit(), is there a way to break the line down at loading time from a file, the way c() does?

Any help to understand the insides of this language is wildly appreciated!

zero323 · Accepted Answer

Why are both of character type but stored differently?

Both are stored in exactly the same way. R has no specific type to represent a single character, and as a consequence strings are not collections of characters.

In the first case you simply have a character vector of length 3 where each element has size 1:

test1 <- c("a", "b", "c")
typeof(test1)
# [1] "character"
length(test1)
# [1] 3
nchar(test1)
# [1] 1 1 1

and in the second case a character vector whose length equals the number of lines in the input file, where each element's size equals the length of the corresponding line:

writeLines("foobar", con="file1.txt")
test2 <- readLines("file1.txt")
typeof(test2)
# [1] "character"
length(test2)
# [1] 1
nchar(test2)
# [1] 6
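
The question of how to tell the two apart follows from the same calls: both vectors index the same way, and length() and nchar() show what differs. A small sketch, using a temporary file in place of file1.txt (the path and the file contents are assumptions made for illustration):

```r
test1 <- c("a", "b", "c")

# Hypothetical input: one line with several space-separated words
path <- tempfile(fileext = ".txt")
writeLines("foo bar baz", con = path)
test2 <- readLines(path)

# Indexing works on both; test2 simply has a single element
test1[2]
# [1] "b"
test2[1]
# [1] "foo bar baz"

# length() counts elements, nchar() counts characters per element
length(test1); nchar(test1)
# [1] 3
# [1] 1 1 1
length(test2); nchar(test2)
# [1] 1
# [1] 11
```

So a vector built with c() and one returned by readLines() are the same kind of object; only the number of elements and the length of each string differ.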

Besides using strsplit(), is there a way to break the line down at loading time from a file, the way c() does?

If you have fixed-size elements you can try readBin, but generally speaking strsplit is the way to go:

f <- "file1.txt"
# Read the whole file as raw bytes, then turn each byte into a one-character string
bytes <- readBin(f, what = "raw", n = file.info(f)$size)
sapply(bytes, rawToChar)
# [1] "f"  "o"  "o"  "b"  "a"  "r"  "\n"
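
For the case described in the question (one line of space-separated words), the strsplit() route can be sketched as follows. The sketch also shows scan(), which is not mentioned in the answer above but is part of base R and splits on whitespace while reading. The temporary path and the file contents are made up for illustration.

```r
# Hypothetical input file with one line of space-separated words
path <- tempfile(fileext = ".txt")
writeLines("foo bar baz", con = path)

# strsplit() returns a list with one entry per input string,
# so [[1]] extracts the plain character vector for the single line
words <- strsplit(readLines(path), " ", fixed = TRUE)[[1]]
words
# [1] "foo" "bar" "baz"

# scan() tokenizes on whitespace while reading the file
words2 <- scan(path, what = character(), quiet = TRUE)
identical(words, words2)
# [1] TRUE
```

Either result can then be indexed word by word, e.g. words[2], just like the vector built with c().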
