Salvador

Reputation: 1549

How to delete empty first column of text file before reading into R

I have a huge text file whose first column is empty but still has a column header. I was told not to delete the column header manually because this text file is used by another application. I can't show the data because I am not able to read the file into R. I have heard about colClasses but couldn't make it work. I also tried fread from data.table with no luck. Here is a small example of what I am talking about:

[screenshot of the file]

I want to delete the first column, the one with the header a. I have tried this:

require(data.table) 
pp <- fread("myfile.txt", drop = 1)
head(pp)

but I get a warning: Warning message: In fread("myfile.txt", drop = 1) : Stopped early on line 3. Expected 524 fields but found 523. Thanks in advance.

UPDATE: Here is a better reproducible example. I was able to read my dataset into R using pp <- fread("myfile.txt", skip = 1), but my column names are shifted one position to the right and my last column is now filled with NAs. How can I drop the column name a and shift the remaining column names left so that the NA column disappears? Here is a snapshot and the dput of a few records:

    a year   fday  first    sec third
1: 1998    1 21.633 21.535 21.481    NA
2: 1998    2 21.146 20.936 20.838    NA
3: 1998    3 20.725 20.651 20.599    NA
4: 1998    4 20.716 20.653 20.620    NA
5: 1998    5 19.606 19.493 19.459    NA
6: 1998    6 18.501 18.314 18.231    NA

pp <- structure(list(a = c(1998L, 1998L, 1998L, 1998L, 1998L, 1998L
), year = 1:6, fday = c(21.633, 21.146, 20.725, 20.716, 19.606, 
18.501), first = c(21.535, 20.936, 20.651, 20.653, 19.493, 18.314
), sec = c(21.481, 20.838, 20.599, 20.62, 19.459, 18.231), third = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"))

The final dataset should look like this:

   year fday  first    sec  third
1: 1998    1 21.633 21.535 21.481
2: 1998    2 21.146 20.936 20.838
3: 1998    3 20.725 20.651 20.599
4: 1998    4 20.716 20.653 20.620
5: 1998    5 19.606 19.493 19.459
6: 1998    6 18.501 18.314 18.231

Upvotes: 1

Views: 774

Answers (2)

Ronak Shah

Reputation: 388992

So you already have the data that you need but the columns are jumbled up?

Try this -

# Shift the names left: columns 1..(n-1) take the names of columns 2..n
names(pp)[-ncol(pp)] <- names(pp)[-1]
# Drop the last column, which holds only NA
pp[[ncol(pp)]] <- NULL
pp

#   year fday  first    sec  third
#1: 1998    1 21.633 21.535 21.481
#2: 1998    2 21.146 20.936 20.838
#3: 1998    3 20.725 20.651 20.599
#4: 1998    4 20.716 20.653 20.620
#5: 1998    5 19.606 19.493 19.459
#6: 1998    6 18.501 18.314 18.231
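
The two steps above can be wrapped in a small function that refuses to drop the last column unless it really is all NA. This is only a sketch: the name shift_names_left and the demo data frame are hypothetical, and it is written against a plain data.frame rather than a data.table.

```r
# Hypothetical helper: shift column names one position to the left and
# drop the trailing column, but only if that column is entirely NA.
shift_names_left <- function(x) {
  n <- ncol(x)
  # Guard: need at least two columns, and the last one must hold no data
  stopifnot(n >= 2, all(is.na(x[[n]])))
  names(x)[-n] <- names(x)[-1]  # columns 1..(n-1) take the names of 2..n
  x[[n]] <- NULL                # remove the now redundant last column
  x
}

# Made-up data with the same kind of shift as in the question
demo <- data.frame(a = c(1998L, 1998L), year = 1:2,
                   fday = c(21.633, 21.146),
                   third = c(NA_real_, NA_real_))
fixed <- shift_names_left(demo)
names(fixed)
```

The guard is there so that a file which does not have the shifted header fails loudly instead of silently losing a real column.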

Upvotes: 0

foreach

Reputation: 99

pp <- data.table::fread("myfile.txt", skip = 1)
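
If skip = 1 alone still leaves the names shifted, as described in the question's update, one possible alternative is to read the header line separately, drop its surplus first name, and attach the rest to the data rows. The sketch below uses base R and a temporary file as a stand-in for myfile.txt. It assumes the file is whitespace-delimited and that the header has exactly one more name than the data rows have fields. Neither assumption is confirmed by the question.

```r
# Hypothetical stand-in for myfile.txt: the header carries one extra
# leading name ("a") that has no matching data column.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "a year fday first sec third",
  "1998 1 21.633 21.535 21.481",
  "1998 2 21.146 20.936 20.838"
), tmp)

# Read only the header line and drop the surplus first name
hdr <- scan(tmp, what = character(), nlines = 1, quiet = TRUE)[-1]

# Read the data rows without a header, then attach the corrected names
pp2 <- read.table(tmp, skip = 1, header = FALSE, col.names = hdr)
names(pp2)
```

The file on disk is left untouched, so the other application that depends on the original header is not affected. For a very large file, the same skip, header and col.names arguments can be passed to data.table::fread instead of read.table.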

Upvotes: 2
