Thomas Briggs

Reputation: 119

Best way to read data from a file and store them

I am reading data from a file of students where each line is a student. I turn each line into a Student object, and I want to return an array of Student objects. Currently I store each Student in an ArrayList and then return it as a standard Student[]. Is it better to use an ArrayList as a dynamically sized array and convert it to a standard array for the return, or should I first count the number of lines in the file, create a Student[] of that size, and then populate it? Or is there a better way entirely to do this?

Here is the code if it helps:

public Student[] readStudents() {
    String[] lineData;
    ArrayList<Student> students = new ArrayList<>();
    while (scanner.hasNextLine()) {
        lineData = scanner.nextLine().split(" ");
        students.add(new Student(lineData));
    }
    return students.toArray(new Student[students.size()]);
}

Upvotes: 0

Views: 665

Answers (2)

Joop Eggen

Reputation: 109532

Use an array only when the size is fixed. That is not the case for students, so an ArrayList is better suited, as you found when reading the file. Converting the ArrayList to an array afterwards is superfluous.

Then, use the most general type, here the List interface. Whether the implementation is ArrayList or LinkedList is then a technical detail. You might later switch to another implementation with different runtime behavior.

Your code can then handle any kind of List, which is a really powerful generalisation.

Here is an incomplete list of useful interfaces with some implementations:

  • List - ArrayList (fast, a tiny bit of memory overhead), LinkedList
  • Set - HashSet (fast), TreeSet (is a SortedSet)
  • Map - HashMap (fast), TreeMap (is a SortedMap), LinkedHashMap (keeps insertion order)

So:

public List<Student> readStudents() {
    List<Student> students = new ArrayList<>();
    while (scanner.hasNextLine()) {
        String[] lineData = scanner.nextLine().split(" ");
        students.add(new Student(lineData));
    }
    return students;
}

In a code review one would comment on the constructor Student(String[] lineData): it ties the class to the current file layout, so a future change in the data format risks breaking it.
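One way to address that review comment is to keep the parsing in a single factory method and give the constructor named, typed parameters. This is only a sketch: the question does not show the real Student class, so the two fields (name and id), the "name id" line layout, and the parse method are all assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical Student: the fields and the "name id" line format are assumed,
// the original post does not show them.
final class Student {
    private final String name;
    private final int id;

    Student(String name, int id) {
        this.name = Objects.requireNonNull(name);
        this.id = id;
    }

    String getName() { return name; }

    int getId() { return id; }

    // All knowledge of the file layout lives here, so a format change
    // only touches this method and not the constructor's callers.
    static Student parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected 'name id' but got: " + line);
        }
        return new Student(parts[0], Integer.parseInt(parts[1]));
    }
}
```

The reading loop would then call students.add(Student.parse(scanner.nextLine())) instead of splitting the line itself.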

Upvotes: 2

Petr Gladkikh

Reputation: 1932

Which is better depends on what you need and on the size of your data set. Needs could be: simplest code, fastest load, least memory usage, fast iteration over the resulting data set... Options could be:

  1. For a one-off script or small data sets (tens of thousands of elements), probably anything would do.
  2. Maybe do not store the elements at all, and process them as you read them? This uses the least memory and is good for very large data sets.
  3. Use a pre-allocated array if you know the data set size in advance. This guarantees the fewest memory allocations, but counting the elements may itself be expensive.
  4. If unsure, use an ArrayList to collect the elements. It works most efficiently if you can estimate an upper bound for the data set size in advance, say you know that there are normally no more than 5000 elements. In that case create the ArrayList with an initial capacity of 5000. It will still resize itself if the backing array fills up.
  5. LinkedList - probably the most conservative. It allocates space as you go, but the memory required per element is larger and iteration is slower than for arrays or ArrayLists.
  6. Your own data structure optimized for your needs. Usually the effort is not worth it, so use this option only when you already know the problem you want to solve.
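Option 2 could look roughly like the following. It is a minimal sketch, not a drop-in replacement: the class name StreamingReadSketch, the method forEachStudentLine and the callback parameter are made up for illustration, and each line is handed over as the split fields rather than as a Student object.

```java
import java.util.Scanner;
import java.util.function.Consumer;

final class StreamingReadSketch {
    // Hands every non-empty line to the handler as soon as it is read,
    // so nothing is kept in memory after the handler returns.
    // Returns the number of lines processed.
    static int forEachStudentLine(Scanner scanner, Consumer<String[]> handler) {
        int count = 0;
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            handler.accept(line.split("\\s+"));
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("Alice 42\nBob 7\n");
        int processed = forEachStudentLine(scanner, fields -> System.out.println(fields[0]));
        System.out.println(processed + " lines processed");
    }
}
```

This only works if each element can be handled on its own (printing, summing, writing elsewhere). If the caller needs all students at once, one of the collecting options is still required.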

Note on ArrayList: it starts by pre-allocating a backing array with a set number of slots, which are then filled without further memory allocation. Once the backing array is full, a new, larger one is allocated and all elements are copied into it. In current OpenJDK the new array is about 1.5 times the size of the previous one (not twice). Normally this is not a problem, but it can cause an OutOfMemoryError if a large enough contiguous block of memory cannot be obtained.
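To illustrate the pre-sizing idea, here is a small sketch that passes an estimate to the ArrayList constructor. The class name PresizedListSketch, the readAll method and the expectedCount parameter are assumptions for the example, and it collects the raw split fields because the Student class is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class PresizedListSketch {
    // expectedCount is only a hint: the list still grows if the file
    // turns out to contain more lines than estimated.
    static List<String[]> readAll(Scanner scanner, int expectedCount) {
        ArrayList<String[]> rows = new ArrayList<>(expectedCount);
        while (scanner.hasNextLine()) {
            rows.add(scanner.nextLine().split(" "));
        }
        // Optional: release unused slots if the estimate was too generous.
        rows.trimToSize();
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = readAll(new Scanner("a b\nc d\ne f"), 2);
        System.out.println(rows.size() + " rows read");
    }
}
```

A good estimate avoids the intermediate copies while loading; a bad one costs nothing in correctness, only some extra allocation.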

Upvotes: 2
