Code Junkie

Reputation: 7788

How to process a large CSV file, or read a large CSV file in chunks

I have very large csv files that I'm trying to iterate through. I'm using opencsv and I'd like to use CsvToBean so that I can dynamically set the column mappings from a database. The question I have is how to do this without grabbing the entire file and throwing it into a list. I'm trying to prevent memory errors.

I'm currently passing the entire result set into a list like so.

List<MyObject> myObjects = csv.parse(strat, getReader("file.txt"));

for (MyObject myObject : myObjects) {
    System.out.println(myObject);
}

But I found this iterator method, and I'm wondering whether it will read one row at a time rather than loading the entire file at once.

Iterator myObjects = csv.parse(strat, getReader("file.txt")).iterator();

while (myObjects.hasNext()) {
    MyObject myObject = (MyObject) myObjects.next();
    System.out.println(myObject);
}

So my question is: what is the difference between an Iterator and a List?

Upvotes: 3

Views: 13713

Answers (2)

Anuswadh

Reputation: 552

"what is the difference between Iterator and list?"

A List is a data structure that stores its elements and offers operations such as get() and toArray().

An Iterator only lets the user step through the elements of a data structure one at a time. A data structure can hand out an Iterator if it implements the Iterable interface, which every Java Collection (including List) does.

So List<MyObject> myObjects = csv.parse(strat, getReader("file.txt")); physically stores all of the parsed data in myObjects,

and Iterator myObjects = csv.parse(strat, getReader("file.txt")).iterator(); just asks the List returned by csv.parse for its iterator. Because csv.parse has already built the whole List by that point, this does not reduce memory use.
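For contrast with a fully populated List, here is a minimal sketch of an Iterator that reads one row at a time. It uses only the JDK (BufferedReader), not opencsv. The class name LazyCsvRows and the naive comma split are assumptions made for illustration, and quoted fields are not handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper: yields one CSV row per call to next(),
// so only the current line is held in memory.
public class LazyCsvRows implements Iterator<String[]> {
    private final BufferedReader reader;
    private String nextLine;

    public LazyCsvRows(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    // Read one more line; nextLine becomes null at end of input.
    private void advance() {
        try {
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public String[] next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        // Naive split: no support for quoted fields or embedded commas.
        String[] row = nextLine.split(",", -1);
        advance();
        return row;
    }

    public static void main(String[] args) {
        Iterator<String[]> rows = new LazyCsvRows(new StringReader("a,1\nb,2\nc,3\n"));
        while (rows.hasNext()) {
            String[] row = rows.next();
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```

The difference from calling iterator() on a List is where the data lives: here each row is produced on demand from the Reader, whereas a List already holds every row before iteration starts.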

Upvotes: 1

Eran

Reputation: 393781

The enhanced for loop (for (MyObject myObject : myObjects)) is implemented using an Iterator. It requires that the instance returned by csv.parse(strat, getReader("file.txt")) implement the Iterable interface, whose iterator() method returns an Iterator. There is therefore no performance difference between the two code snippets.
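A small sketch can make this visible. The ForEachDemo and CountingIterable names are hypothetical, introduced only for this illustration: the wrapper counts how often iterator() is requested, and the enhanced for loop triggers that call just as the explicit loop does.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ForEachDemo {
    // Wraps a list and counts how often iterator() is requested.
    static class CountingIterable<T> implements Iterable<T> {
        private final List<T> items;
        int iteratorCalls = 0;

        CountingIterable(List<T> items) {
            this.items = items;
        }

        @Override
        public Iterator<T> iterator() {
            iteratorCalls++;
            return items.iterator();
        }
    }

    // Enhanced for loop: the compiler calls source.iterator() for us.
    static String joinWithForEach(Iterable<String> source) {
        StringBuilder sb = new StringBuilder();
        for (String s : source) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Explicit Iterator loop: the same traversal written by hand.
    static String joinWithIterator(Iterable<String> source) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CountingIterable<String> rows =
                new CountingIterable<>(Arrays.asList("a", "b", "c"));
        System.out.println(joinWithForEach(rows));   // abc
        System.out.println(joinWithIterator(rows));  // abc
        System.out.println(rows.iteratorCalls);      // 2
    }
}
```

Both methods visit the same elements in the same order, and each one requests exactly one Iterator from the source.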

P.S.

In the second snippet, don't use the raw Iterator type. Use Iterator<MyObject> instead:

Iterator<MyObject> myObjects = csv.parse(strat, getReader("file.txt")).iterator();

while (myObjects.hasNext()) {
    MyObject myObject = myObjects.next();
    System.out.println(myObject);
}

Upvotes: 1
