Reputation: 477
I have an 8.6GB text file and I'm trying to load it using data.table and fread(). This works well with a 6.0GB file, but not with a larger file. I get the following error:
Registered S3 method overwritten by 'data.table':
method from
print.data.table
data.table 1.16.4 using 4 threads (see ?getDTthreads). Latest news: r-datatable.com
Error: cannot allocate vector of size 880.1 Mb
Could you please help me figure out what I should do? Do I need a better PC, more RAM, or something else?
Upvotes: 1
Views: 86
Reputation: 4147
As @Roland said, it's a memory issue. Depending on what you want to do with the large text file, there are a couple of ways to handle such cases. To give a better answer, some example code would help, as would knowing what you want to do with the text and what form it has (table, book, unstructured data).
On Windows, you can try raising R's memory limit to something higher, e.g. 14 GB:
memory.limit(size = 14000)
Also consider removing large data objects from your environment with rm() once they are no longer needed. You can also save objects to .rds files with saveRDS() and then rm() them to free up memory:
# Save an object to a file
saveRDS(object, file = "my_data.rds")
# Restore the object
object <- readRDS(file = "my_data.rds")
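For example, a minimal sketch of that workflow (big_dt and the file names are placeholders for your own objects):
big_dt <- fread("intermediate.txt")         # hypothetical large intermediate object
saveRDS(big_dt, file = "intermediate.rds")  # keep a copy on disk
rm(big_dt)                                  # drop it from the environment
gc()                                        # prompt R to release the freed memory
# later, only when you actually need it again:
big_dt <- readRDS("intermediate.rds")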
Or read the data in row-wise chunks (chunking):
library(data.table)

chunk_size <- 1000000
total_rows <- 1e8  # replace with the number of rows in your file (e.g. from wc -l)
for (start in seq(0, total_rows - 1, by = chunk_size)) {
  # assumes the file has no header row; see the note below
  chunk <- fread("yourfile.txt", nrows = chunk_size, skip = start)
  # Process the chunk here; it is overwritten on the next iteration
}
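This sketch assumes the file has no header row; if it does, skip one extra line for every chunk after the first and pass the column names via col.names. Also note that each fread() call has to skip over all the earlier lines again, so chunking this way gets slower as you move through the file; the approaches below avoid that.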
You can also use packages like duckdb or arrow to query the file without loading it fully into memory, as described in several good answers to similar questions, including one where a user had a folder of multiple large text files to traverse. Maybe that helps.
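For the DuckDB route, a rough sketch (the file name, column name, and query are placeholders; DuckDB scans the file itself, so only the query result has to fit in RAM):
library(DBI)
library(duckdb)

con <- dbConnect(duckdb::duckdb())
# DuckDB streams the text/CSV file, so the full 8.6 GB never has to sit in R's memory
res <- dbGetQuery(con, "
  SELECT some_column, COUNT(*) AS n
  FROM read_csv_auto('yourfile.txt')
  GROUP BY some_column
")
dbDisconnect(con, shutdown = TRUE)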
Upvotes: 0