dgr379

Reputation: 345

Different number of lines when loading a file into R

I have a .txt file with one column consisting of 1040 lines (including a header). However, when loading it into R using the read.table() command, it's showing 1044 lines (including a header).

A snippet of the file looks like this:

L*H
no
H*L
no
no
no
H*L
no

Might it be an issue with R?

When opened in Excel, it doesn't show any errors either.

EDIT

The problem was that R read a line like L + H* as three separate lines: L, +, and H*.

I used

table <- read.table(file.choose(), header=T, encoding="UTF-8", quote="\n")

Upvotes: 0

Views: 268

Answers (2)

Rich Scriven

Reputation: 99331

Based on the data you have provided, try using sep = "\n", which ensures that each line is read as a single column value. The quote argument does not need to be used at all. There is no header in your example data, so I would remove that argument as well.

All that said, the following code should get the job done.

table <- read.table(file.choose(), sep = "\n")
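To see why the separator matters, here is a minimal sketch that compares the default whitespace separator with sep = "\n". The temporary file, its contents, and the header name label are assumptions made up for illustration; they are not the asker's actual data. header = TRUE is kept only because the question mentions a header line.

```r
# Hypothetical sample data written to a temporary file (illustration only).
# The first line is an assumed header named "label".
tmp <- tempfile(fileext = ".txt")
writeLines(c("label", "L*H", "no", "H*L", "no", "no", "no", "L + H*", "no"), tmp)

# Default separator (sep = ""): fields are split on whitespace, and the column
# count is taken from the first five lines, so "L + H*" spills into extra rows.
default_read <- read.table(tmp, header = TRUE)

# Newline separator: each line of the file is kept as a single value.
newline_read <- read.table(tmp, header = TRUE, sep = "\n")

# Compare the row counts of the two imports with the 8 data lines in the file.
nrow(default_read)
nrow(newline_read)
```

With this sample, the default call should report more rows than the file has data lines, while the sep = "\n" call should match the file line for line.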

Upvotes: 1

tigerloveslobsters

Reputation: 646

You can use readLines() to check how many lines the file actually contains, and then import it again with read.csv() to see whether you get the expected result. A file is sometimes parsed differently than expected because of an extra quote, an extra line break, or similar stray characters.

Possible import steps:

  1. Look at your data in a text editor or with readLines() to figure out the delimiter and file type.
  2. Choose an import method (type read and press Tab to see the available import functions; also check out readr).
  3. Customize the arguments, for example whether the file has a header or whether to skip the first n lines.
  4. Look at the data again in R with View(head(data)) or View(tail(data)), and decide whether you need to repeat steps 2 to 4.
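The steps above could be sketched roughly as follows. The temporary file and its contents are hypothetical stand-ins for the real file, and count.fields() is added here as one possible way to spot lines that split into several fields; it is not part of the steps themselves.

```r
# Hypothetical sample file (illustration only); the first line is an assumed header.
tmp <- tempfile(fileext = ".txt")
writeLines(c("label", "L*H", "no", "L + H*", "no"), tmp)

# Step 1: inspect the raw lines and count them.
raw_lines <- readLines(tmp)
length(raw_lines)
head(raw_lines)

# Count the fields on each line under the default whitespace separator;
# any line with more than one field would be split on import.
fields <- count.fields(tmp)
which(fields != 1)

# Steps 2 and 3: pick an import function and set its arguments.
data <- read.table(tmp, header = TRUE, sep = "\n")

# Step 4: check that the row count equals the line count minus the header.
nrow(data) == length(raw_lines) - 1
head(data)
tail(data)
```

If the final comparison is FALSE, that suggests going back to step 2 and adjusting the separator, quote, or header arguments.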

Upvotes: 2
