Ninnakin

Reputation: 13

Only read lines of (very big) file equal to a specific value (using R)

I have a file with 54 million lines that is too big to read in full; it doesn't fit in memory. Using R, I want to extract some 100,000 lines from the file where the content of one of the columns is equal to a certain number. Does anyone know if this is possible without having to read the entire file? All columns contain integers, if that makes any difference.

The contents of the file look like:

Date,ProductId,Stock,Price
199501,1,271,5
199501,2,145,50
199501,3,16,42
199501,4,32,45
199501,5,96,62

Upvotes: 1

Views: 306

Answers (1)

Dieter Menne

Reputation: 10215

The details of your question are unclear, but if the file is otherwise well structured, a detour via sqldf is in many cases the fastest solution.

http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql
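As a rough sketch of what that could look like for the sample in the question: `read.csv.sql` from sqldf loads the file into a temporary SQLite database and returns only the rows matched by the SQL statement, so the full file never has to fit into R's memory (it is still scanned once on disk). The temporary file, the variable names `csv_path` and `wanted`, and the value `ProductId = 3` are assumptions for illustration only; replace them with your real path, column, and value.

```r
library(sqldf)

# Stand-in for the real 54-million-line file: the sample rows from the
# question, written to a temporary file (illustrative only).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,ProductId,Stock,Price",
  "199501,1,271,5",
  "199501,2,145,50",
  "199501,3,16,42",
  "199501,4,32,45",
  "199501,5,96,62"
), csv_path)

# Inside the SQL statement the input file is referred to as `file`.
# Only rows satisfying the WHERE clause are returned to R as a data frame.
wanted <- read.csv.sql(
  csv_path,
  sql = "select * from file where ProductId = 3",
  header = TRUE,
  sep = ","
)

print(wanted)
```

The filtering happens inside SQLite rather than in R, which is why this tends to cope with files that `read.csv` cannot load. Whether it is fast enough for 54 million lines depends on your disk and on the file being cleanly delimited.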

If this does not help, please give more details: post a short sample of about 10 lines together with simple code that does what you want, even if it is slow. Someone will jump in and optimize it, but that is not possible without sample data.

Upvotes: 7
