Adam

Reputation: 177

Writing to CSV in multiple columns inside a loop java

I am trying to write a two-column array to a .csv file. The array has an arbitrary number of rows, ranging from a few to perhaps a few thousand. Currently I have code that writes a single array to a .csv file as follows. The parameter n in the generateCSV method represents the current iteration of the loop, starting from n=1.

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class MakeCSVFile {

    public static void generateCSV(File myFile, double[][] ProtonTracking, int n) throws IOException {
        PrintWriter pw = new PrintWriter(new FileWriter(myFile));
        // Header line for this trial, then one "x, y" line per row of the array.
        pw.print(String.format("Trial Number :, %d%n", n));
        for (int i = 0; i < ProtonTracking.length; i++) {
            pw.print(String.format("%f, %f%n", ProtonTracking[i][0], ProtonTracking[i][1]));
        }
        pw.flush();
        pw.close();
    }
}

This code writes the first array to a .csv file correctly for n=1. However, at the next iteration of the loop, when n=2, the array is written over the same location as for n=1, so only the most recent array is saved to the .csv file. I would like this method to write the array for n=2 in the two columns next to the array for n=1, and so on, so that in the end the .csv file contains the arrays for all values of n.

In other words, for n=2, I would like to start writing to the .csv file in column 3, not column 1. I tried to do this by using commas typed into the loop such as:

for (int k = 0; k < n; k++) {
    pw.print(", ,");
}

but this overwrites all the previous columns.

Upvotes: 2

Views: 4670

Answers (1)

Jason C

Reputation: 40436

This is because you reopen the file every time with:

PrintWriter pw = new PrintWriter(new FileWriter(myFile));

When you do that, the existing file is truncated and the new data replaces it (by default, FileWriter opens files in this way). The easiest option is probably to have the FileWriter open the file in append mode, by passing true as the second constructor argument:

PrintWriter pw = new PrintWriter(new FileWriter(myFile, true));
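To see the effect of append mode in isolation, here is a minimal sketch. The class name AppendCsvSketch, the method name appendTrial and the temp file are my own placeholders, not part of your code; the only real change from your method is the extra true argument (plus try-with-resources to close the writer).

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class AppendCsvSketch {

    // Appends one trial to the end of the file. The second FileWriter
    // argument (true) selects append mode instead of truncating the file.
    static void appendTrial(File file, double[][] data, int n) throws IOException {
        try (PrintWriter pw = new PrintWriter(new FileWriter(file, true))) {
            pw.printf("Trial Number :, %d%n", n);
            for (double[] row : data) {
                pw.printf("%f, %f%n", row[0], row[1]);
            }
        } // the writer is flushed and closed here
    }

    public static void main(String[] args) throws IOException {
        File out = File.createTempFile("trials", ".csv"); // placeholder location
        appendTrial(out, new double[][] {{1.0, 2.0}}, 1);
        appendTrial(out, new double[][] {{3.0, 4.0}}, 2);
        // Both trials are now in the file, one below the other.
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

Keep in mind that append mode also keeps data from earlier runs of your program, so you may want to delete the file before the first trial.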

An alternative is to open the file once before you write all your records, and pass a PrintWriter (or FileWriter) to generateCSV instead of the File:

public static void generateCSV(PrintWriter pw, double[][] ProtonTracking, int n) throws IOException {
    pw.print(String.format("Trial Number :, %d%n", n));
    for (int i = 0; i < ProtonTracking.length; i++) {
        pw.print(String.format("%f, %f%n", ProtonTracking[i][0], ProtonTracking[i][1]));
    }
    pw.flush();
    // <- note that .close() was removed! don't close until completely done.
}
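On the calling side, that second option could look roughly like the sketch below. The class name SingleWriterSketch, the writeAll helper and the double[][][] holding all trials are assumptions for illustration; your loop presumably produces each array on the fly instead.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class SingleWriterSketch {

    // Same shape as the generateCSV variant above: writes one trial
    // and leaves the writer open for the next call.
    static void generateCSV(PrintWriter pw, double[][] protonTracking, int n) {
        pw.printf("Trial Number :, %d%n", n);
        for (double[] row : protonTracking) {
            pw.printf("%f, %f%n", row[0], row[1]);
        }
    }

    // Opens the file once, writes every trial, and closes the writer
    // automatically when the try block exits.
    static void writeAll(String path, double[][][] trials) throws IOException {
        try (PrintWriter pw = new PrintWriter(new FileWriter(path))) {
            for (int n = 1; n <= trials.length; n++) {
                generateCSV(pw, trials[n - 1], n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File out = File.createTempFile("trials", ".csv"); // placeholder location
        double[][][] trials = {
            {{1.0, 2.0}},
            {{3.0, 4.0}, {5.0, 6.0}}
        };
        writeAll(out.getPath(), trials);
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

Because the file is opened only once, it is truncated only once, at the start of the run, which is usually what you want.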

Choose whichever of those options makes the most sense for your application.


In a comment you asked:

However, these are all contained vertically one after the other in two columns. Is it possible to display these side by side in many two columns sections?

With your current structure, not easily. It is generally non-trivial to insert data into the middle of a file. You could conceivably create a method that loaded the lines of an existing file, appended data to each line, then wrote all the lines back out (performance will eventually suffer greatly). However, a cleaner approach would be one of these three options:
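For completeness, the load, append and rewrite approach mentioned above might be sketched as follows. Everything here (AppendColumnsSketch, appendColumns, the padding rule for trials of different lengths) is my own guess at what you would need, not something from your code. Numbers are formatted with Locale.ROOT so the decimal separator is never a comma.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class AppendColumnsSketch {

    // Reads the whole file, appends two new columns to every line,
    // and writes everything back. Cost grows with the file size on each call.
    static void appendColumns(Path file, double[][] data, int n) throws IOException {
        List<String> oldLines = Files.exists(file) ? Files.readAllLines(file) : new ArrayList<>();

        // Cells contributed by this trial: a header row, then one row per pair.
        List<String> newCells = new ArrayList<>();
        newCells.add("Trial Number :," + n);
        for (double[] row : data) {
            newCells.add(String.format(Locale.ROOT, "%f,%f", row[0], row[1]));
        }

        boolean hasOld = !oldLines.isEmpty();
        // A row of empty cells as wide as the existing table (only its commas).
        String emptyOld = hasOld ? oldLines.get(0).replaceAll("[^,]", "") : "";

        int rows = Math.max(oldLines.size(), newCells.size());
        List<String> merged = new ArrayList<>(rows);
        for (int i = 0; i < rows; i++) {
            // "," stands for two empty cells when this trial has no row i.
            String right = i < newCells.size() ? newCells.get(i) : ",";
            if (!hasOld) {
                merged.add(right);
            } else {
                String left = i < oldLines.size() ? oldLines.get(i) : emptyOld;
                merged.add(left + "," + right);
            }
        }
        Files.write(file, merged);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("trials", ".csv"); // placeholder location
        appendColumns(out, new double[][] {{1.0, 2.0}, {3.0, 4.0}}, 1);
        appendColumns(out, new double[][] {{5.0, 6.0}}, 2);
        Files.readAllLines(out).forEach(System.out::println);
    }
}
```

This works, but every call rereads and rewrites the entire file, which is why I would not recommend it for thousands of trials.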

  1. Gather all of the data you want to write to a file ahead of time. Perhaps even create a small Trial class that holds information about a single trial (this will also give your code some clearer semantic meaning, which can help eliminate confusion and mistakes in the long run). Then, when it comes time to write the entire file, you are free to iterate through your collection of trials as you see fit and write the data out in any order you wish.

  2. Redefine your concepts here. Modify your code to store all pairs for a given trial in a single row (you can even store n itself in the first column, and other interesting info, e.g. the timestamp of the record, etc. in other columns). A CSV file stores data in tabular format. Typically, in a table, each row is a record, and each column is a field. I am just guessing, but in your application, it seems that a "trial" is an independent record of some sort, and each trial has a set of properties (fields) associated with it. With this in mind, it is actually quite appropriate to have a very wide table with each trial as a row and each datum of a trial as a column. This is why, for example, graph generators in spreadsheet software take columns as data fields by default.

  3. The first row of a table is typically reserved for a header. It is unusual for you to store "Trial x" headers at intervals throughout the table. With this in mind, perhaps you should consider a separate CSV file for every trial. Generate the file name based on n.

Options 2 and 3 may not always be appropriate; for example, you may be constrained by some third-party application that expects the data to be in a certain format. However, if you have the flexibility, it sounds like a wide table with one row per trial, or separate files per trial, is a more natural representation of your data (and also, of course, much easier to write). If your goal with the CSV is simply to arbitrarily format your output for easy human viewing (as opposed to data storage / analysis) in a spreadsheet program, you may also not find these options appropriate.
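As a rough illustration of options 2 and 3, here is a sketch with one helper for each. The names TrialLayoutSketch, toWideRow and writeTrialFile, the "trial_%d.csv" file name pattern and the "x,y" header are all assumptions on my part; pick whatever fits your data.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

class TrialLayoutSketch {

    // Option 2: one row per trial. The first cell is n, followed by
    // every pair of the trial flattened in order.
    static String toWideRow(int n, double[][] data) {
        StringBuilder sb = new StringBuilder().append(n);
        for (double[] pair : data) {
            sb.append(String.format(Locale.ROOT, ",%f,%f", pair[0], pair[1]));
        }
        return sb.toString();
    }

    // Option 3: one file per trial, with the file name derived from n
    // and a single header row at the top.
    static Path writeTrialFile(Path dir, int n, double[][] data) throws IOException {
        Path file = dir.resolve(String.format(Locale.ROOT, "trial_%d.csv", n));
        try (PrintWriter pw = new PrintWriter(Files.newBufferedWriter(file))) {
            pw.println("x,y"); // hypothetical column names
            for (double[] pair : data) {
                pw.printf(Locale.ROOT, "%f,%f%n", pair[0], pair[1]);
            }
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        double[][] data = {{1.0, 2.0}, {3.0, 4.0}};
        System.out.println(toWideRow(2, data));
        Path dir = Files.createTempDirectory("trials"); // placeholder location
        System.out.println("Wrote " + writeTrialFile(dir, 2, data));
    }
}
```

With option 2, each wide row can simply be appended to the file as the trial finishes, so nothing ever has to be inserted into the middle of the file.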

Option 1, if that is the way you go, will require you to have all your data available ahead of time (which may or may not be possible, depending on the nature of your application, memory requirements, etc.), so that you can "interleave" your trials in each row of output.
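If you do go with option 1, the interleaving could be sketched like this. The Trial class and the SideBySideSketch, toLines and write names are hypothetical, and padding shorter trials with empty cells is my assumption about what you would want when trials have different lengths.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SideBySideSketch {

    // Hypothetical holder for one trial: its number and its (x, y) pairs.
    static final class Trial {
        final int n;
        final double[][] data;

        Trial(int n, double[][] data) {
            this.n = n;
            this.data = data;
        }
    }

    // Builds the CSV lines with each trial occupying its own two columns.
    static List<String> toLines(List<Trial> trials) {
        int maxRows = 0;
        for (Trial t : trials) {
            maxRows = Math.max(maxRows, t.data.length);
        }

        List<String> lines = new ArrayList<>();
        StringBuilder header = new StringBuilder();
        for (int k = 0; k < trials.size(); k++) {
            if (k > 0) header.append(',');
            header.append("Trial Number :,").append(trials.get(k).n);
        }
        lines.add(header.toString());

        for (int i = 0; i < maxRows; i++) {
            StringBuilder row = new StringBuilder();
            for (int k = 0; k < trials.size(); k++) {
                if (k > 0) row.append(',');
                double[][] d = trials.get(k).data;
                if (i < d.length) {
                    row.append(String.format(Locale.ROOT, "%f,%f", d[i][0], d[i][1]));
                } else {
                    row.append(','); // two empty cells for a shorter trial
                }
            }
            lines.add(row.toString());
        }
        return lines;
    }

    // Writes all trials in one pass, so the file is opened exactly once.
    static void write(Path file, List<Trial> trials) throws IOException {
        Files.write(file, toLines(trials));
    }

    public static void main(String[] args) {
        List<Trial> trials = Arrays.asList(
            new Trial(1, new double[][] {{1.0, 2.0}, {3.0, 4.0}}),
            new Trial(2, new double[][] {{5.0, 6.0}}));
        toLines(trials).forEach(System.out::println);
    }
}
```

The trade-off is memory: every trial has to stay in the list until the file is written.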

Upvotes: 1
