Reputation: 313
Update: col.names was added in data.table 1.9.6, so the issue is resolved and everyone is super happy :) I think I can now convert all my read.csv calls to fread calls without worrying about breaking anything.
Using data.table 1.9.4.
I'm converting read.csv calls to fread because of the HUGE performance improvements we've noticed. I can handle most issues, but I've reached a point where I'm clueless and wonder if anyone has an elegant solution.
My problem is that I have named colClasses but the input has no header (it comes from a grep command). Here's a silly example to illustrate:
males.students <- read.csv(pipe("grep Male students.csv"),
                           header=FALSE,  # the grep output has no header row
                           col.names=c("id", "name", "gender"),
                           colClasses=c(id="numeric"))
Now, in fread, I still want the named colClasses, but I have no column names, so just using
males.students <- fread("grep Male students.csv",
                        colClasses=c(id="numeric"))
fails with
Column name 'id' in colClasses[[1]] not found
How can I fix that? Are there plans to add col.names?
Upvotes: 4
Views: 1889
Reputation: 9621
To answer the original question: if the problem is that grep removes the header, you could use awk instead to print the first line plus any lines containing "Male":
fread("awk 'NR==1 || /Male/' students.csv", colClasses=c(id="numeric"))
This might help people who still use an old version of data.table.
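To see what the awk filter hands to fread, here is a minimal shell sketch. The sample rows and the temporary file are made up for illustration; only the three column names come from the question.

```shell
# Build a tiny sample file (hypothetical data, same columns as the question).
tmpdir=$(mktemp -d)
cat > "$tmpdir/students.csv" <<'EOF'
id,name,gender
1,Alice,Female
2,Bob,Male
3,Carol,Female
4,Dave,Male
EOF

# NR==1 keeps the header line; /Male/ keeps every line containing "Male".
# The match is case-sensitive, so "Female" rows are not selected.
awk_out=$(awk 'NR==1 || /Male/' "$tmpdir/students.csv")
printf '%s\n' "$awk_out"
```

Note that /Male/ matches anywhere on the line, so a name containing "Male" would also be kept. If that matters for your data, something like awk -F, 'NR==1 || $3=="Male"' restricts the test to the third field, assuming a simple comma-separated file with no quoted commas.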
Upvotes: 0
Reputation: 49448
Add the names in the command line:
fread('echo "id,name,gender"; grep Male students.csv', colClasses = c(id='numeric'))
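As a rough check of what that command produces before fread parses it, the sketch below runs the same echo and grep pair on a made-up students.csv (the rows are hypothetical). The braces only group the two commands so their combined output can be captured in one variable.

```shell
# Hypothetical sample file; its own header is dropped by grep because
# the header line does not contain "Male".
tmpdir=$(mktemp -d)
cat > "$tmpdir/students.csv" <<'EOF'
id,name,gender
1,Alice,Female
2,Bob,Male
3,Carol,Female
4,Dave,Male
EOF

# Emit a header line first, then the filtered rows, as a single stream.
combined=$({ echo "id,name,gender"; grep Male "$tmpdir/students.csv"; })
printf '%s\n' "$combined"
```

Because the header is written by echo rather than read from the file, this approach should also work when the input file has no header at all.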
Upvotes: 3