Reputation: 18541
I have a relational database model. These are the basics of my data-config.xml:
<entity name="MyMainEntity" pk="pID" query="select ... from [dbo].[TableA] inner join TableB on ...">
<entity name="Entity1" pk="Id1" query="SELECT [Text] Tag from [Table2] where ResourceId = '${MyMainEntity.pId}'"></entity>
<entity name="Entity1" pk="Id2" query="SELECT [Text] Tag from [Table2] where ResourceId2 = '${MyMainEntity.pId}'"></entity>
<entity name="LibraryItem" pk="ResourceId"
query="select SKU
FROM [TableB]
INNER JOIN ...
ON ...
INNER JOIN ...
ON ...
WHERE ... AND ...'">
</entity>
</entity>
Now, this takes a lot of time.
The first query returns 10,000 rows, and then the inner entities are fetched for each of those rows (around 10 rows each).
With a DB profiler I can see the three inner entity queries running over and over (three SELECT statements, then the same three again, and so on).
This is really inefficient, and the import can run for over 40 hours.
What are my options to make it run faster?
Thanks.
Upvotes: 0
Views: 236
Reputation: 15789
Without changing the schema of the DB, the first thing to try is caching. If the inner entities cache well, the gains will be substantial.
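As a rough sketch of what caching could look like for one of the inner entities from the question: the sub-entity switches to `CachedSqlEntityProcessor`, its query drops the per-row filter so it runs once, and the `where` attribute tells DIH how to look up cached rows for each parent row. The table and column names are copied from the question; adding `ResourceId` to the select list is an assumption (the cache needs the key column to be returned), and the exact attribute set depends on your Solr version, so check it against the documentation for the release you run.

```xml
<entity name="MyMainEntity" pk="pID" query="select ... from [dbo].[TableA] inner join TableB on ...">
  <!-- Sketch only: the query below is executed once and its rows are kept in memory.
       For each parent row, DIH looks up the cached rows whose ResourceId matches
       the parent's pId instead of sending another SELECT to the database. -->
  <entity name="Entity1"
          processor="CachedSqlEntityProcessor"
          query="SELECT ResourceId, [Text] Tag FROM [Table2]"
          where="ResourceId=MyMainEntity.pId"/>
</entity>
```

Newer DIH versions can express the same lookup with the `cacheKey` and `cacheLookup` attributes instead of `where`. Two things to watch for: the whole result of the sub-entity query is held in memory, and the lookup only matches if the key column and the parent column have the same name casing and the same data type.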
The wiki may not be up to date, so you should check the JIRA issues, namely SOLR-2382, and maybe have a look at SOLR-2948 too.
A second path could be multithreading DIH, but that is trickier. At one point this was available as an option, but it was later removed because it was buggy. I think there is now a JIRA issue about reimplementing it, so try looking it up, but I recommend caching first.
Upvotes: 1