user1454926
user1454926

Reputation: 199

Overhead of data transfer from a MySQL database using PreparedStatement

This is not a question about query optimization. Rather, a sanity check about what to expect of data transfer rates from MySQL 5.5.27 (Amazon RDS).

When running a particularly heavy query, MySQL Workbench is showing data transfer rate of about 1MB/s and the query runs for about 420 seconds. This adds up to about 420M bytes of data being transferred.

If this data is saved into a simple text file, the size of the file ends up being less than 7M bytes. I certainly expected to see some overhead due to metadata of the ResultSet, JDBC driver mechanisms, etc. But 420M vs. 7M seems like an extraordinary terrible ratio to me. Or, is this normal?

Any feedback is much appreciated. Much thanks!

PS. More details:
-the JDBC Driver is mysql-connector-java-5.1.13
-the data is transferred between Amazon RDS and an EC2 instance
-Java 1.6 PreparedStatement is used to execute the query

Upvotes: 1

Views: 2017

Answers (1)

Gord Thompson
Gord Thompson

Reputation: 123484

Wireshark is a wonderful free and open-source (GPL) network analysis tool that can be used to great effect in cases like this. I ran the following test to see how much traffic a "typical" JDBC connection to a "normal" MySQL server might generate.

I created a table named jdbctest in MySQL (5.5.29-0ubuntu0.12.04.2) on my test server.

CREATE TABLE `jdbctest` (
  `id` int(11) DEFAULT NULL,
  `textcol` varchar(6) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I populated it with 100,000 rows of the form

    id  textcol
------  -------
     1  ABCDEF
     2  ABCDEF
     3  ABCDEF
...
100000  ABCDEF

At 4 bytes per id value and 6 bytes per textcol value, retrieving all 100,000 rows should represent somewhere on the order of 1 MB of data.

I fired up Wireshark, started a trace, and ran the following Java code which uses mysql-connector-java-5.1.26:

import java.sql.*;

public class mysqlTestMain {
    static Connection dbConnection = null;

    public static void main(String[] args) {
        try {
            String myConnectionString = "";
            myConnectionString =
                    "jdbc:mysql://192.168.1.3:3306/mytestdb";
            dbConnection = DriverManager.getConnection(myConnectionString, "root", "whatever");
            PreparedStatement stmt = dbConnection.prepareStatement("SELECT * FROM jdbctest");
            ResultSet rs = stmt.executeQuery();
            int i = 0;
            int j = 0;
            String s = "";
            while (rs.next()) {
                i++;
                j = rs.getInt("id");
                s = rs.getString("textcol");
            }
            System.out.println(String.format("Finished reading %d rows.", i));
            rs.close();
            stmt.close();
            dbConnection.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }
}

The console output confirmed that I had retrieved all 100,000 rows.

Looking at the summary of the Wireshark trace, I found:

Packets captured: 1811
Avg. packet size: 992.708 bytes
Bytes: 1797795

The breakdown by direction was

                   packets    bytes
                   -------    -----
from me to server      636    36519
from server to me     1175  1761276

So it appears that to retrieve my ~1 MB of data I received 1.72 MB of total network traffic from the MySQL server. That ~72% overhead on the download (or ~76% including traffic in both directions) is certainly nowhere near the ~5900% overhead suggested by your (rate * time) calculation.

I strongly suspect that the ~1 MB/s rate being reported by MySQL Workbench is not the overall average transfer rate over the entire time. The best way to determine the overhead in your particular circumstance would be to use a tool like Wireshark and measure it yourself.

Upvotes: 1

Related Questions