Reputation: 3

Best way to run multiple queries per second on database, performance wise?

I am currently using Java to insert and update data multiple times per second. Never having used databases with Java, I am not sure what is required, and how to get the best performance.

I currently have a method for each type of query I need to do (for example, update a row in a database). I also have a method to create the database connection. Below is my simplified code.

    public static void addOneForUserInChannel(String channel, String username) throws SQLException {
    Connection dbConnection = null;
    PreparedStatement ps = null;

    String updateSQL = "UPDATE " + channel + "_count SET messages = messages + 1 WHERE username = ?";

    try {
        dbConnection = getDBConnection();           

        ps = dbConnection.prepareStatement(updateSQL);
        ps.setString(1, username);
        ps.executeUpdate();
    } catch(SQLException e) {
        System.out.println(e.getMessage());
    } finally {
        if(ps != null) {
            ps.close();
        }

        if(dbConnection != null) {
            dbConnection.close();
        }
    }
}

And my DB connection

    private static Connection getDBConnection() {
    Connection dbConnection = null;

    try {
        Class.forName(DB_DRIVER);
    } catch (ClassNotFoundException e) {
        System.out.println(e.getMessage());
    }

    try {
        dbConnection = DriverManager.getConnection(DB_CONNECTION, DB_USER,DB_PASSWORD);
        return dbConnection;
    } catch (SQLException e) {
        System.out.println(e.getMessage());
    }

    return dbConnection;
}

This seems to be working fine for now, with about 1-2 queries per second, but I am worried that once I expand and it is running many more, I might have some issues. My questions:

Is there a way to have a persistent database connection throughout the entire run time of the process? If so, should I do this?
Are there any other optimizations that I should do to help with performance?

Thanks

Upvotes: 0

Answers (5)

dbschwartz

Reputation: 783

In addition to connection pooling, another consideration here is the frequency of UPDATEs that you have. An UPDATE in most database engines is really a DELETE followed by an INSERT, and because you are incrementing counters, that could be very expensive. It creates a lot of potential contention via locks, a lot of Disk I/O as well.

If you ensure that your UPDATEs are by a key (preferably a PRIMARY KEY), that will help, but it still does not limit the amount of work the database engine needs to perform that much.

Given a high frequency, a good way to do this is to perform your incrementing of values in-memory or in a lighter weight data engine, and then periodically write to disk. If you use in-memory for accumulating values, the problem will be that in a crash you will lose data, so if your data is critical this is not a great option.

One option that could be very good is to use for Java is MapDB (http://www.mapdb.org). This engine could be used for accumulating the values locally, then read from it and periodically write the accumulated totals to the database, lowering the number of actual UPDATEs. MapDB is very fast, basically a persistent, disk-based Map (it has in-memory options too, but the on-disk store is best for something like this if you need accuracy).

This can be a very fast solution for extremely high volume, keeping your database faster too (UPDATEs make the database less optimized as it has to update indexes very often).

Another option is Redis for in-memory accumulators, also very fast, but this will not run in-process. Redis does have an operation log capability so you will not lose data, but writing to Redis over the network is probably not as fast as the in-process MapDB option.

Upvotes: 0

Braj

Reputation: 46871

The Oracle Database JDBC Developer's Guide and Reference might help you to improve the performance of the queries. Basically you can try with Batching.

Read the chapter Performance Extensions that describes the Oracle performance extensions to the Java Database Connectivity (JDBC) standard.

This chapter covers the following topics:

Update Batching
Additional Oracle Performance Extensions

Some Points:

Don't load the driver class every time when you need a connection.

I have already posted a nice ConnectionUtil class to manage all the connections in a single class for whole application.
Use PreparedStatement instead of Statement that you are already using.

Upvotes: 0

mitalia

Reputation: 503

I don't know what the context is for the rest of your application, but connections should be long-lived. In web applications, typically connections come from a connection pool. If you're building a standalone command line app, then you should consider establishing the connection as a singleton and then using it throughout your program. Again, context matters, so any detail you can give on the bigger picture here would help.

Once you are using a single connection for all your DB traffic, you might consider executing multiple updates as part of a batch. If there is a tight loop somewhere in your code that is doing this, that is definitely the way to go for higher performance. In my experience, it's not uncommon to see 10 - 100x performance increases with batching. See "ExectueBatch" here: http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html

By executing a batch, you can re-use your prepared statement instead of re-preparing it every time. However, note that in its current form, you have a potential SQL injection vulnerability in that you are doing string interpolation based on channel name. You might consider a different schema structure so that you can pass channel name as a value instead of a table name. If you do this, the query will be more secure and have the side-effect of being easily re-used after being prepared.

Upvotes: 2

James

Reputation: 597

I assume you are generally familiar with Java, just not JDBC. If so, I think you should read a JDBC tutorial or a book that covers JDBC, before you do serious Java-RDBMS work.

General knowledge aside, the biggest performance killer in your sample code is:

You are creating a new Connection on every call. A Connection is a very expensive object to create. It should be reused as much as possible.

Resolution: use a connection pool (e.g. "dbcp")

Upvotes: 0

John Farrelly

Reputation: 7459

What you want to do is use a connection pool for your database connections. Database connection pooling means that database connections can stay open across your queries/updates, meaning that you don't incur the overhead of connecting to the database each time. Apache DBCP is a popular one. For example:

package com.journaldev.jdbc.datasource;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

import javax.sql.DataSource;

import org.apache.commons.dbcp.BasicDataSource;

public class DBCPDataSourceFactory {

    public static DataSource getDataSource(String dbType){
        Properties props = new Properties();
        FileInputStream fis = null;
        BasicDataSource ds = new BasicDataSource();

        try {
            fis = new FileInputStream("db.properties");
            props.load(fis);
        }catch(IOException e){
            e.printStackTrace();
            return null;
        }
        if("mysql".equals(dbType)){
            ds.setDriverClassName(props.getProperty("MYSQL_DB_DRIVER_CLASS"));
            ds.setUrl(props.getProperty("MYSQL_DB_URL"));
            ds.setUsername(props.getProperty("MYSQL_DB_USERNAME"));
            ds.setPassword(props.getProperty("MYSQL_DB_PASSWORD"));
        }else if("oracle".equals(dbType)){
            ds.setDriverClassName(props.getProperty("ORACLE_DB_DRIVER_CLASS"));
            ds.setUrl(props.getProperty("ORACLE_DB_URL"));
            ds.setUsername(props.getProperty("ORACLE_DB_USERNAME"));
            ds.setPassword(props.getProperty("ORACLE_DB_PASSWORD"));
        }else{
            return null;
        }

        return ds;
    }
}


package com.journaldev.jdbc.datasource;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import javax.sql.DataSource;

public class ApacheCommonsDBCPTest {

    public static void main(String[] args) {
        testDBCPDataSource("mysql");
        System.out.println("**********");
        testDBCPDataSource("oracle");
    }

    private static void testDBCPDataSource(String dbType) {
        DataSource ds = DBCPDataSourceFactory.getDataSource(dbType);

        Connection con = null;
        Statement stmt = null;
        ResultSet rs = null;
        try {
            con = ds.getConnection();
            stmt = con.createStatement();
            rs = stmt.executeQuery("select empid, name from Employee");
            while(rs.next()){
                System.out.println("Employee ID="+rs.getInt("empid")+", Name="+rs.getString("name"));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }finally{
                try {
                    if(rs != null) rs.close();
                    if(stmt != null) stmt.close();
                    if(con != null) con.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
        }
    }

}

Tutorial here: http://www.journaldev.com/2509/jdbc-datasource-example-oracle-mysql-and-apache-dbcp-tutorial

Upvotes: 0

Best way to run multiple queries per second on database, performance wise?

Answers (5)

Some Points:

Related Questions