glyphx

Reputation: 195

Java memory limits?

I am working on a Java program that parses files into lists and then inserts the data into a DB. This runs on a server with a ton of memory. Are there Java limitations I need to be aware of?

Like such that I shouldn't parse, for instance, a GB of data into a list before inserting it into the DB?

Upvotes: 0

Views: 281

Answers (5)

Peter Lawrey

Reputation: 533530

The limits you might need to be aware of are:

  • a List cannot hold 2^31 entries or more (its size is an int).
  • a JVM scales well up to about 32 GB, but not much beyond that, because the cost of GC grows with the heap size (unless you have Azul's Zing).

A ton of memory is 256-512 GB these days, and I would suggest using off-heap memory if you need more than 32 GB in one JVM (or Zing).
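One common way to keep data off the heap is a direct ByteBuffer; this is a minimal sketch of the idea (not Peter's own libraries), with an illustrative buffer size. Direct buffers live in native memory, so the GC never has to scan or copy their contents:

```java
import java.nio.ByteBuffer;

// Minimal sketch: storing data outside the Java heap with a direct
// ByteBuffer. Direct buffers are allocated in native memory, so they
// don't count toward the heap and the GC never moves their contents.
public class OffHeapSketch {

    // Allocate 'slots' longs off-heap (8 bytes each).
    static ByteBuffer allocateLongs(int slots) {
        return ByteBuffer.allocateDirect(slots * Long.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateLongs(1_000); // ~8 KB of native memory
        buf.putLong(0, 42L);                   // write at byte offset 0
        long value = buf.getLong(0);           // read it back
        System.out.println(value);             // prints 42
    }
}
```

Note that a single ByteBuffer is int-indexed, so it tops out around 2 GB; for more you would use several buffers, memory-mapped files, or an off-heap library.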

Upvotes: 0

Nathan Hughes

Reputation: 96395

You have more limitations than just Java to worry about.

There's network bandwidth usage, hogging your database server's CPU, filling up the database transaction log, JDBC performance for mass inserts, and slowness while the database updates its indexes or generates artificial keys.

If your inputs get too huge, you need to split them into chunks and commit the chunks separately. How big is too big depends on your database.
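A sketch of the chunk-and-commit idea, assuming a chunk size of 1,000 (a guess; tune it for your database). The table and column names in the commented JDBC portion are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of chunked inserts: split the parsed rows into fixed-size
// chunks and commit each chunk in its own transaction.
public class ChunkedInsert {

    // Split 'rows' into consecutive sublists of at most 'size' elements.
    static <T> List<List<T>> chunks(List<T> rows, int size) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += size) {
            out.add(rows.subList(i, Math.min(i + size, rows.size())));
        }
        return out;
    }

    /* With JDBC it might look like this (table/column names are
       hypothetical):

       conn.setAutoCommit(false);
       try (PreparedStatement ps =
               conn.prepareStatement("INSERT INTO parsed_rows (val) VALUES (?)")) {
           for (List<String> chunk : chunks(allRows, 1_000)) {
               for (String row : chunk) {
                   ps.setString(1, row);
                   ps.addBatch();
               }
               ps.executeBatch();
               conn.commit();   // one commit per chunk keeps transactions small
           }
       }
    */
}
```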

The way your artificial keys get allocated can slow the process down; you may need to reserve batches of key values ahead of time, such as by using a hi/lo generator.

Kicking off a bunch of threads and hammering the database server with them would just cause contention and make the database server work harder, as it has to sort out the transactions and make sure they don't interfere with each other.

Consider writing to some kind of delimited file, then run a bulk-insert utility to load its contents into the database. That way the database actually cooperates, it can suspend updating indexes and checking constraints, and sequences and transactions aren't an issue. It is orders of magnitude faster than JDBC.
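A sketch of the delimited-file approach. The delimiter and field order are assumptions; match whatever your database's bulk loader expects (PostgreSQL's COPY, MySQL's LOAD DATA INFILE, SQL Server's bcp, and so on):

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

// Sketch: write rows as tab-delimited lines, then hand the file to the
// database's bulk loader instead of issuing JDBC inserts.
public class DelimitedExport {

    static void writeRows(Writer out, List<String[]> rows) throws IOException {
        for (String[] row : rows) {
            out.write(String.join("\t", row));
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter demo = new StringWriter(); // a real run would use a FileWriter
        writeRows(demo, List.of(new String[] {"1", "alice"},
                                new String[] {"2", "bob"}));
        System.out.print(demo);
        // Then load it, e.g. in psql:  \copy people FROM 'people.tsv'
        // (the 'people' table name is hypothetical)
    }
}
```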

Upvotes: 1

phatmanace

Reputation: 5021

Nathan's answer is decent - so I'll only add a few bits here...

If you are not doing anything terribly sophisticated in your program, then it might be good practice to write in streaming fashion - in simple terms, read in the input a line at a time and directly output it to a file, finally calling the database's specific bulk-upload tool (most of them have one).

Reading all the lines into memory and then calling insert() in a loop would be pretty inefficient.
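The streaming approach might look like this sketch: one line in, one line out, with nothing accumulated in memory. The `transform` method is a placeholder for your real per-line parsing:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Sketch of streaming: read one line, transform it, write it out,
// and never hold the whole file in memory.
public class StreamingParse {

    static String transform(String line) {
        return line.trim(); // placeholder for real per-line parsing
    }

    static void stream(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            out.write(transform(line));
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter(); // a real run: FileReader/FileWriter
        stream(new StringReader("  a  \n b \n"), sink);
        System.out.print(sink); // prints "a" then "b" on separate lines
    }
}
```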

You don't give us many clues about why you are reading in this data all in one go - is there a reason for needing to do this?

Upvotes: 1

jessebs

Reputation: 527

Not directly, but you may want to tweak the JVM arguments a bit.

"What are the Xms and Xmx parameters when starting JVMs?" might be a useful reference.

Upvotes: 0

kosa

Reputation: 66637

It depends on how much memory you have allocated for the JVM.

How much memory you can allocate to the JVM depends in turn on whether you are running the Client VM or the Server VM.

Check -Xmx and -Xms settings.
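For example, you might start the JVM like this (the heap sizes and the jar name are purely illustrative; size the heap for your workload):

```shell
# -Xms sets the initial heap size, -Xmx the maximum heap size.
java -Xms2g -Xmx8g -jar parser.jar
```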

Upvotes: 0
