I have a huge text file whose first column is empty but still has a column header. I was told not to delete the column header manually because the file is used by another application. I can't show the data because I am not able to read it into R. I have heard about colClasses but couldn't make it work. I also tried fread from data.table with no luck. Here is a small example of what I am talking about:
I want to delete the first column with a and a. I have tried this:
require(data.table)
pp <- fread("myfile.txt", drop = 1)
head(pp)
but I get a warning: Warning message: In fread("myfile.txt", drop = 1) : Stopped early on line 3. Expected 524 fields but found 523. Thanks in advance.
UPDATE:
Here is a better reproducible example. I was able to read my dataset into R using pp <- fread("myfile.txt", skip = 1), but my column names shifted right and now my last column is filled with NAs. How can I delete the a column name and shift all the remaining column names left, so that the NA column goes away?
Here is a snapshot and dput of a few records:
      a year   fday  first    sec third
1: 1998    1 21.633 21.535 21.481    NA
2: 1998    2 21.146 20.936 20.838    NA
3: 1998    3 20.725 20.651 20.599    NA
4: 1998    4 20.716 20.653 20.620    NA
5: 1998    5 19.606 19.493 19.459    NA
6: 1998    6 18.501 18.314 18.231    NA
# dput() output; the non-parseable .internal.selfref pointer has been removed so this can be pasted into R
pp <- structure(list(a = c(1998L, 1998L, 1998L, 1998L, 1998L, 1998L
), year = 1:6, fday = c(21.633, 21.146, 20.725, 20.716, 19.606,
18.501), first = c(21.535, 20.936, 20.651, 20.653, 19.493, 18.314
), sec = c(21.481, 20.838, 20.599, 20.62, 19.459, 18.231), third = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA,
-6L), class = c("data.table", "data.frame"))
The final dataset should look like this:
   year fday  first    sec  third
1: 1998    1 21.633 21.535 21.481
2: 1998    2 21.146 20.936 20.838
3: 1998    3 20.725 20.651 20.599
4: 1998    4 20.716 20.653 20.620
5: 1998    5 19.606 19.493 19.459
6: 1998    6 18.501 18.314 18.231
Upvotes: 1
Views: 774
Reputation: 388992
So you already have the data that you need but the columns are jumbled up?
Try this -
# Shift the names left: columns 1..(n-1) take the names of columns 2..n
names(pp)[-ncol(pp)] <- names(pp)[-1]
# Drop the last column, which holds only NAs
pp[[ncol(pp)]] <- NULL
pp
#   year fday  first    sec  third
#1: 1998    1 21.633 21.535 21.481
#2: 1998    2 21.146 20.936 20.838
#3: 1998    3 20.725 20.651 20.599
#4: 1998    4 20.716 20.653 20.620
#5: 1998    5 19.606 19.493 19.459
#6: 1998    6 18.501 18.314 18.231
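Since pp is a data.table, the same two steps can also be done by reference with := and setnames(), which avoids copying the table and never leaves two columns both named third. This is only a sketch: the toy table is rebuilt from the dput in the question, and the helper variables last_col and shifted are names made up for illustration.

```r
library(data.table)

# Toy copy of the shifted table from the question (values from the dput)
pp <- data.table(
  a     = rep(1998L, 6),
  year  = 1:6,
  fday  = c(21.633, 21.146, 20.725, 20.716, 19.606, 18.501),
  first = c(21.535, 20.936, 20.651, 20.653, 19.493, 18.314),
  sec   = c(21.481, 20.838, 20.599, 20.620, 19.459, 18.231),
  third = NA_real_
)

# Hypothetical helpers: name of the last column and the left-shifted names
last_col <- names(pp)[ncol(pp)]
shifted  <- names(pp)[-1]

# Guard: only proceed if the last column really is all NA
stopifnot(all(is.na(pp[[last_col]])))

# Drop the all-NA column first, then rename what is left, both in place
pp[, (last_col) := NULL]
setnames(pp, shifted)

print(pp)
```

Dropping the column before renaming matters here: renaming first would briefly give two columns called third, and deleting by name could then hit the wrong one.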
Upvotes: 0