Zeliax
Zeliax

Reputation: 5386

Writing huge ArrayList to file in a human readable format

I have a program that processes quite a lot of sensor data from a sensor system. I'm currently looking into writing the output from my program to a text file so that I can check if it is processed properly by the program.

Right now I am writing a few identifiers before the ArrayList and then writing the ArrayList to the file using ArrayList.toString().

lineToWrite = identifier1 + ";" + identifier2 + ";" + arrayList.toString();

The output file contains 21 lines in total, and the ArrayLists range from 100 items to 400.000 items. Using the toString() method makes it impossible for any of the file editing programs I usually use to open the file and inspect it.

I thought of doing a small processing of the items in the ArrayList:

String lineToWrite = "";

String arrayListString = "\n";
for(String s : sensorLine){
    arrayListString += "\t" + s + "\n";
}

lineToWrite = identifier1 + ";" + identifier2 + ";" + arrayListString;

but this seems to take forever for some of the larger ArrayLists. Does anyone have a better/faster approach for doing this, or know of a good file viewing program?

As a small side note to the sensor data: I have in total 2.3 million sensor inputs.

EDIT1:

To extend the problem question, I should add that it is the step of joining the enormous array into a single string that proved to be the problem. The loop iterates very slowly because arrayListString grows on every pass, and each concatenation copies the whole string built so far, which takes up a lot of memory/processing power I guess.
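The slowdown described above can be reproduced in a small self-contained sketch (all names and data are placeholders, not from the actual program): `+=` in a loop re-copies the accumulated string on every pass, so the total work grows quadratically, while a StringBuilder appends into a resizable buffer in linear time.

```java
import java.util.ArrayList;
import java.util.List;

public class ConcatSketch {
    public static void main(String[] args) {
        // placeholder sensor data, just to have something to join
        List<String> sensorLine = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            sensorLine.add("value" + i);
        }

        // O(n^2): every += copies the entire accumulated string
        long t0 = System.nanoTime();
        String slow = "\n";
        for (String s : sensorLine) {
            slow += "\t" + s + "\n";
        }
        long slowMs = (System.nanoTime() - t0) / 1_000_000;

        // O(n): append into an internal char buffer, build the string once
        long t1 = System.nanoTime();
        StringBuilder sb = new StringBuilder("\n");
        for (String s : sensorLine) {
            sb.append('\t').append(s).append('\n');
        }
        long fastMs = (System.nanoTime() - t1) / 1_000_000;

        // both approaches produce the same string
        System.out.println(slow.equals(sb.toString()));
        System.out.println("concat: " + slowMs + " ms, builder: " + fastMs + " ms");
    }
}
```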

EDIT2:

As for the writing method itself I am using a BufferedWriter(), with placeholders for the actual method variables:

output = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(filename, toAppend), "UTF-8"));

And for the actual writing I am using:

output.append(line);
output.flush();
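With that same BufferedWriter, the huge intermediate string can be avoided entirely by appending element by element; the writer's internal buffer batches the small writes. A minimal sketch (file name, identifiers, and data are placeholders):

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.Arrays;
import java.util.List;

public class PerElementWrite {
    public static void main(String[] args) throws IOException {
        List<String> sensorLine = Arrays.asList("12.5", "13.1", "12.9"); // placeholder data
        String filename = "out.txt";
        boolean toAppend = false;

        try (BufferedWriter output = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(filename, toAppend), "UTF-8"))) {
            output.append("identifier1").append(';').append("identifier2").append(';');
            output.newLine();
            for (String s : sensorLine) {
                // each element lands in the writer's buffer; no giant string is built
                output.append('\t').append(s);
                output.newLine();
            }
        } // try-with-resources flushes and closes the writer
    }
}
```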

Upvotes: 0

Views: 1347

Answers (5)

Zeliax
Zeliax

Reputation: 5386

In the end I found a solution.

I used a StringBuilder to get around the problem of building a huge string for the file. The approach is as follows:

StringBuilder sb = new StringBuilder();
for(String s : arrayList){
    sb.append("\t").append(s).append("\n");
}

String line = identifier1 + ";" + identifier2 + ";" + sb.toString();

And as for the editor, Sublime Text 3 didn't seem to mind too much, as long as the lines weren't 400.000 characters long.

Upvotes: 0

FreeMan
FreeMan

Reputation: 1457

I think you will have to split your data into chunks and load them into the editor when needed. Here is a good answer: How to read Text File of about 2 GB?

Upvotes: 2

Marco13
Marco13

Reputation: 54639

For some odd reason, nearly all text editors are horribly slow when you have long lines. Often you can easily edit a file with a million lines, but will encounter problems if the file contains a single line with 100000 characters.

Regarding the performance of writing a file, there are several trade-offs.

It is generally beneficial for performance to write "larger blocks of data". That is: When you want to write 1000 bytes, you should write these 1000 bytes at once, and not one by one. But in this case, you are attempting to build a really huge block of data by assembling a huge string. This may strike back and decrease the performance, because assembling this string may be expensive due to the many string concatenations.

As Taylor pointed out in his answer, writing the file line-by-line is likely a reasonable trade-off here: The chunks are then still large enough to compensate for the efforts of the write operation in general, and still small enough to avoid string concatenation overheads.

As an example: The time for writing 1 Million lines with a BufferedWriter should hardly be measurable:

import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ArrayListToFile
{
    public static void main(String[] args) throws IOException
    {
        List<String> sensorLine = new ArrayList<String>();
        int size = 1000000;
        Random random = new Random(0);
        for (int i=0; i<size; i++)
        {
            sensorLine.add(String.valueOf(random.nextDouble()));
        }

        write("out.txt", sensorLine);
    }

    private static void write(String fileName, Iterable<?> elements)
        throws IOException
    {
        try (BufferedWriter bw = new BufferedWriter(
            new OutputStreamWriter(new FileOutputStream(fileName))))
        {
            String identifier1 = "i1";
            String identifier2 = "i2";

            bw.write(identifier1 + ";" + identifier2 + ";\n");

            for (Object s : elements)
            {
                bw.write("\t" + s + "\n");
            }
        }
    }
}

Upvotes: 1

Joop Eggen
Joop Eggen

Reputation: 109547

Dump the data into a database.

Then you can do interesting things like selecting rows 1000 - 1100, searching for values, or computing avg/min/max, all in a database client like Toad.

The SQL query language should not be a problem, and neither should a client.

Java has embedded, standalone databases; H2 might suffice.
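As a rough sketch of this approach with H2 (assuming the H2 jar is on the classpath; table name, column, and data are all made up for illustration):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class SensorDb {
    public static void main(String[] args) throws Exception {
        // in-memory H2 database; use "jdbc:h2:./sensordata" for a file-backed one
        try (Connection con = DriverManager.getConnection("jdbc:h2:mem:sensors")) {
            try (Statement st = con.createStatement()) {
                st.execute("CREATE TABLE readings(id IDENTITY, value DOUBLE)");
            }
            // batch-insert the sensor values instead of writing a text file
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO readings(value) VALUES (?)")) {
                for (double v : new double[] {12.5, 13.1, 12.9}) { // placeholder data
                    ps.setDouble(1, v);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            // the kind of query a text file can't answer directly
            try (Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery(
                     "SELECT AVG(value), MIN(value), MAX(value) FROM readings")) {
                rs.next();
                System.out.println(rs.getDouble(1) + " "
                    + rs.getDouble(2) + " " + rs.getDouble(3));
            }
        }
    }
}
```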

Upvotes: 3

Taylor
Taylor

Reputation: 4087

The problem is you're assembling a very large string into memory, and then writing it all at once, with lots of string manipulation to boot (leading to allocation of memory for each string).

Instead, look into using a stream. With a Writer, you can iterate over the array and append to the file as you go, which will be much faster.
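A minimal sketch of that streaming idea (file name and data are placeholders): each record goes straight into the buffered writer, so no large string ever exists in memory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class StreamingWrite {
    public static void main(String[] args) throws IOException {
        List<String> sensorLine = Arrays.asList("a", "b", "c"); // placeholder data
        try (BufferedWriter w = Files.newBufferedWriter(
                Paths.get("sensors.txt"), StandardCharsets.UTF_8)) {
            for (String s : sensorLine) {
                w.write('\t');
                w.write(s);
                w.newLine(); // flushed to disk in buffered chunks, not per call
            }
        }
    }
}
```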

Here's a good tutorial on the basics: http://www.tutorialspoint.com/java/java_files_io.htm

As to the editor issue, most editors either load the entire file into memory or load it in chunks of lines or bytes. If you have huge lines, you may want to revisit your format.

Upvotes: 5

Related Questions