oligofren
oligofren

Reputation: 22963

Hibernate: adding an object if not already present

Here's the case: I am creating a batch script that runs daily, parsing logfiles and exporting the data to a database. The format of this file is basically

std_prop1;std_prop2;std_prop3;[opt_prop1;[opt_prop2;[opt_prop3;[..]]]

The standard properties map to a table with a column for each property, where each line in the logfile basically maps to a corresponding row. It might look like LOGDATA(id,timestamp,systemId,methodName,callLenght). Since we should be able to log as many optional properties as we like, we cannot map them to the same table, since that would mean adding a row the table every time a new property was introduced. Not to think of the number of NULL references ...

So the additional properties go in another table, say EXTRA_PROPS(logdata_foreign_key,propname,value). In reality, most of the optional properties are the same (e.g. os version, app container, etc), making it somewhat wasteful to log for instance 4 rows in EXTRA_PROPS for each row in LOGDATA (in the case that one on average had 4 extra properties). So what I would like my batch job to do is

for each additionalProperty in logRow:
  see if additionalProperty already exist
  if exists:
    create a reference to it in a reference table
  if not:
    add the property to the extra properties table
    create a reference to it in a reference table

I would then probably have three slightly different tables:

LOGDATA(id,timestamp,systemId,methodName,callLenght)
EXTRA_PROPS(id,propname,value)
LOGDATA_HAS_EXTRA_PROPS(logid,extra_prop_id)

I am not 100% this is a better way of doing it, I would still create N rows in the LOGDATA_HAS_EXTRA_PROPS table for N properties, but at least I would not add any new rows to EXTRA_PROPS.

Even if this might not be the best way (what is?), I am still wondering about the tecnhical side: How would I implement this using Hibernate? It does not have to be superfast, but it would need to chew through 100K+ rows.

Upvotes: 0

Views: 146

Answers (1)

Alex Gitelman
Alex Gitelman

Reputation: 24732

Firstly, I would not recommend using Hibernate for this type of logic. Hibernate is a great product but doing this kind of high load data operations may not be it's strongest point.

From data modeling standpoint, it appears to me that (propname,value) is actually a primary key in EXTRA_PROPS. Basically, you want to express the logic that, for example, hostname + foo.bar.com combination will only appear once in the table. Am I right? That would be PK. So you will need to use that in LOGDATA_HAS_EXTRA_PROPS. Using name alone will not be sufficient for reference.

In Hibernate (if you choose to use it), that can be expressed via composite key using @EmbeddedId or Embeddable on object mapped to EXTRA_PROPS. And then you can have many to many relationship that uses LOGDATA_HAS_EXTRA_PROPS as association table.

Upvotes: 1

Related Questions