fypfyp
fypfyp

Reputation: 206

Putting several tables/columns in to CSV file

I'm currently doing a project where I collect a lot of information about processes and CPU usage, battery usage etc in to tables.

For example, "Log table" contains the columns 'Process', 'Pid', 'Battery' and another table, "CPU Table", contains for example 'Process', 'CPU Time', 'CPU Usage'. The program writes to all these tables at the same time so each reading has it's own line across all the tables.

I have created a program that takes all this information and creates a separate CSV file for all the processes. Within these files, I want to have all the information from all four tables about that particular process.

The problem I am having is that for some reason, when I attempt to bring the information in, one of the tables duplicates it's records for every record from the other table.

problem

As you can see, the information from "Log table" writes out correctly, but the information from "CPU table" (the 7 furthest right columns) repeat the first record for every entry of "Log", and then does the same for the second, and for every record of Log.

Procs[i] contains all unique process names.

Currently, the sql statement I am using is and the basic code surrounding is:

String query = "SELECT * FROM log, cpuinfo WHERE log.Process = " + "'" + procs[i] + "'" + " AND cpuinfo.Process = " + "'" + procs[i] + "'";
                System.out.println(query);
                rs = sment.executeQuery(query);

                writer = new CSVWriter(new FileWriter(procs[i] + ".csv"), ',');
                //writer = new CSVWriter(procs[i] + ".csv");
                //System.out.println("Writing.....");



                //rs.beforeFirst();

                writer.writeAll(rs, true);
                //writer.

                writer.flush();

Can anyone see why the program is doing this and writing the CSV in this way? Any help would be hugely appreciated, been stuck on this program for weeks now to no avail and Google isn't helping!

Thanks.

If you need any information or the code to look at please do ask!

Upvotes: 4

Views: 1003

Answers (2)

Michael McGowan
Michael McGowan

Reputation: 6608

You are implicitly performing an inner join of the two tables log and cpuinfo here, and you need to decide how to properly associate one record from log with one record from cpuinfo. Your query as written effectively creates a result record for every record from log joined up with every record from cpuinfo, and then your WHERE clause is restricting that result set to only those records where the processes match up.

This doesn't seem to be what you want; it seems to me that you probably want to associate records where the process matches up AND the timestamps match as well. Naively this might look like the following:

SELECT * FROM log, cpuinfo WHERE log.Process = ?  AND cpuinfo.Process = ? 
  AND log.Timestamp = cpuinfo.Timestamp

However, I believe your solution is actually a bit more complicated than naively comparing timestamps. It looks like there probably is a way to match timestamps from one table to another, but it appears the associated timestamps might not be identical. For instance, the earliest timestamp in log looks like 26:04.2 as opposed to 26:04.3 in cpuinfo. You would need to decide, based upon application logic, the way in which it makes sense to associated two timestamps with one another. This would then affect your WHERE clause accordingly.

Upvotes: 2

Anugoonj
Anugoonj

Reputation: 575

The following should work and remove the duplicate records.

Select * FROM log, cpuinfo WHERE log.Process = cpuinfo.Process and      log.Process = ?

Upvotes: 0

Related Questions