Reputation: 51
Now I'm trying to enhance performance of my web application, I use spring JPA 2.3.0- Hibernate 5.4.15.Final, Postgres 12 and manage transaction by @Transaction. The web app is deployed on aws beanstalk, run multiple instances at the same time, but the database instance is not scalable. And I use bigSerial type for ID of tables.
For instance, I have a STUDENTS table, ID is bigSerial and some other columns. I got a problems when using
@GeneratedValue(strategy = GenerationType.IDENTITY)
, Hibernate couldn't batch insert when saving a list of entities. And I try to use
@GeneratedValue(strategy = GenerationType.AUTO, generator = "students_id_seq")
@SequenceGenerator(name = "students_id_seq", sequenceName = "students_id_seq")
hibernate.id.new_generator_mappings=false
hibernate.jdbc.batch_size=10
hibernate.order_inserts=true
hibernate.order_updates=true
hibernate.batch_versioned_data=true
It seem Hibernate could batch insert, but the problem is Hibernate execute select nextval ('students_id_seq')
multiple times. If an entity list has 30 items, Hibernate execute select nextval
30 times, and 3 times for batch insert query.
Some statistics:
If using GenerationType.IDENTITY
insert into ...
: execute onceinsert into ...
: execute n timesIf using GenerationType.SEQUENCE/ GenerationType.AUTO
select nextval ('students_id_seq')
: execute onceinsert into ...
: execute onceselect nextval ('students_id_seq')
: execute n timesinsert into ...
: execute n/batch_size timesIn conclusion, If using GenerationType.AUTO
or GenerationType.SEQUENCE
with allocationSize = 1
:
My question is, is there anyway to batch insert but not execute to many select nextval
query ? Something likes GenerationType.IDENTITY
, not execute select nextval
, just batch insert and IDs will be handled by sequence in the database.
When I test with GenerationType.SEQUENCE
and allocationSize=1
(GenerationType.AUTO)
, the application executes too much select nextval
query, I think It is even worse than the IDENTITY strategy.
And for some reasons, I don't want to use allocationSize
, it may lead to duplicate primary key error when run insert query manual or when migrate data or some other cases.
After some research, I found a way to get a value list of a sequence:
select nextval ('students_id_seq') from generate_series(1,10);
We can replace 10 by entityList.size() or number of entities doesn't have ID in the entityList when batch insert, just get enough to use, don't create too much gap between IDs, but I'm not sure whether or not Hibernate supported, if yes, please share me the documentation to reference.
Thank you
https://discourse.hibernate.org/t/batch-insert-execute-too-much-select-nextval-sequence/4232
Upvotes: 5
Views: 6139
Reputation: 619
I think you have to make the "allocationSize" setting in your Java entity the same as the sequence increment setting in the DB and the same as the batch size setting.
In that case, hibernate will call the sequence's next value once per batch (in the case above, once per every 40 records), generate the next 40 id values (without the need to call the DB), populate the entity with these new ID(s) then save the batch in one shot
Upvotes: 0
Reputation: 619
I would say that I have some experience of that point. I was doing insertion for more than 128,000 records. And my target was to enhance the time-consuming to do that. I would try to summarize the case as below :
The code did not use any persist() or save() methods. These records were saved when the @Transactionl method exits
I was using hibernate batch inserts below are the settings in the config map
spring.jpa.properties.hibernate.jdbc.batch_size: "40" spring.jpa.properties.hibernate.order_inserts: "true" spring.jpa.properties.hibernate.order_updates: "true" spring.main.allow-bean-definition-overriding: "true"
I modified the allocation size in my entity Id configuration as below:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator ="mappingentry_sql_generator")
@SequenceGenerator(name = "mappingentry_sql_generator", sequenceName ="mappingentry_id_seq", allocationSize = 40)
private Long id;
Notes: I set the "allocationSize" in the sequence generator to be equal to the "batch_size" value in the setting
After doing these changes the time was reduced from 55 seconds to 20 seconds which was a great impact
The only thing that I did not understand is that when I checked the value of the id column that was generated by the sequence I did not find any value gap. Every ID value exceeded the previous by 1, not 40. That is what I am currently trying to understand
Upvotes: 0
Reputation: 81950
What you are looking for is the HiLo algorithm for id generation.
For every id generated from a sequence it generates a number of ids on the client without accessing the database.
You configure it on your entity as this:
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "hilo_sequence_generator")
@GenericGenerator(
name = "hilo_sequence_generator",
strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
parameters = {
@Parameter(name = "sequence_name", value = "hilo_seqeunce"),
@Parameter(name = "initial_value", value = "1"),
@Parameter(name = "increment_size", value = "3"),
@Parameter(name = "optimizer", value = "hilo")
})
@Id
private Long id;
Upvotes: 2