Reputation: 567
Once more.. I have 2 tables 'blog' and 'comment'. A blog can contains n comments (blog --1:n-- comment). Up to date I use following select to insert the data into solr index:
<entity name="blog" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS
blogMessage, c.message AS commentMessage, c.secondColumn
FROM blog b LEFT JOIN comment c ON b.id = c.source_id
AND c.source_type = 'blog'">
<field column="blogMessage" name="blogMessage" clob="true" />
<field column="commentMessage" name="commentMessage" clob="true" />
</entity>
The index result looks like:
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<str name="commentMessage">message of comment</str>
<str name="scondColumn">Im the second column from comment</str>
</doc>
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<str name="commentMessage">message of comment - Im the second comment</str>
<str name="scondColumn">Im the second column from comment</str>
</doc>
I would say this is stupid because I got too many index data with the same blog just the comments are different. Is it possible to set 'comments' as 'subentity' like following:
<entity name="blog" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS
blogMessage
FROM blog b">
<field column="blogMessage" name="blogMessage" clob="true" />
<entity name="comment" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT c.message as commentMessage, c.secondColumn
FROM comment c
WHERE c.source_id = ${blog.id}">
<field column="commentMessage" name="commentMessage" clob="true" />
</entity>
</entity>
Is that possible? How would the result looks like (cant test it until monday)?
Upvotes: 4
Views: 1572
Reputation: 261
You're almost there, if you want to fill a multi-valued comment field for each blog document, you want the CachedSqlEntityProcessor
Mine tend to look a lot like this (I'm leaving out the clob transformer bits, but you obviously need them)
<entity name="blog"
query="SELECT b.id,
b.market,
b.title AS blogTitle,
b.message AS
blogMessage
FROM blog b">
<entity name="blog_comment"
query="SELECT c.message as commentMessage,
c.secondColumn,
c.blog_id
FROM comment c"
processor="CachedSqlEntityProcessor"
where="blog_id=blog.id"/>
<entity>
The docs should look like:
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<arr name="commentMessage">
<str>message of comment</str>
<str>message of comment - Im the second comment</str>
</arr>
<arr name="secondColumn">
<str>Im the second column from comment</str>
<str>Im the second column from comment</str>
</arr>
</doc>
You may want to do multiple nested entities, one that queries for each column, if you want to avoid repeating values.
Upvotes: 1