Mondo Duke
Mondo Duke

Reputation: 184

Nested SQL call

EDIT:

As requested, our table schema is,

posts:

comments:

replies


I have the tables posts, comments, and replies in a SQL database. (Obviously, a post can have comments, and a comment can have replies)

I want to return a post based on its id, postid.

So I would like a database function has the inputs and outputs,

input:

postid

output:

post = {
    postid
    post_text
    
    comments: [comment, ...]
}

Where the comment and reply are nested in the post,

comment = {
    commentid, 
    text
    
    replies: [reply, ...]

}
reply = {
    replyid
    reply_text
}

I have tried using joins, but the returned data is highly redundant, and it seems stupid. For instance, fetching the data from two different replies will give,

postid post_text commentid comment_text replyid reply_text
1 POST_TEXT 78 COMMENT_TEXT 14 REPLY1_TEXT
1 POST_TEXT 78 COMMENT_TEXT 15 REPLY2_TEXT

It seems instead I want to make 3 separate queries, in sequence (first to the table posts, then to comments, then to replies)

How do I do this?


Upvotes: 1

Views: 201

Answers (3)

Atmo
Atmo

Reputation: 3971

Your idea to store things in JSON is a good one if you have something to parse it down the line.

As an alternative to the previous answers that involve JSON, you can also get a normal SQL result set (table definition and sample data are below the query):

WITH MyFilter(postid) AS (
VALUES (1),(2) /* rest of your filter */
)
SELECT 'Post' AS PublicationType, postid, NULL AS CommentID, NULL As ReplyToID, post_text
FROM Posts
WHERE postID IN (SELECT postid from MyFilter)
UNION ALL
SELECT CASE ReplyToID WHEN NULL THEN 'Comment' ELSE 'Reply' END, postid, commentid, replyToID, comment_text
FROM Comments
WHERE postid IN (SELECT postid from MyFilter)
ORDER BY postid, CommentID NULLS FIRST, ReplyToID NULLS FIRST

Note: the PublicationType column was added for the sake of clarity. You can alternatively inspect CommentID and ReplyToId and see what is null to determine the type of publication.

This should leave you with very little, if any, redundant data to transfer back to the SQL client.


This approach with UNION ALL will work with 3 tables too (you only have to add 1 UNION ALL) but in your case, I would rather go with a 2-table schema:

CREATE TABLE posts (
postid SERIAL primary key,
post_text text NOT NULL
);
CREATE TABLE comments (
commentid SERIAL primary key,
ReplyToID INTEGER NULL REFERENCES Comments(CommentID) /* ON DELETE CASCADE? */,
postid INTEGER NOT NULL references posts(postid) /* ON DELETE CASCADE? */,
comment_text Text NOT NULL
);


INSERT INTO posts(post_text) VALUES ('Post 1'),('Post 2'),('Post 3');
INSERT INTO Comments(postid, comment_text) VALUES (1, 'Comment 1.1'), (1, 'Comment 1.2'), (2, 'Comment 2.1');
INSERT INTO Comments(replytoId, postid, comment_text) VALUES (1, 1, 'Reply to comment 1.1'), (3, 2, 'Reply to comment 2.1');

This makes 1 fewer table and allows to have level 2 replies (replies to replies), or more, rather than just replies to comments. A recursive query (there are plenty of samples of that on SO) can make it so a reply can always be linked back to the original comment if you want.


Edit: I noticed your comment just a bit late. Of course, no matter what solution you take, there is no need to execute a request to get the replies to each and every comment.

Even with 3 tables, even without JSON, the query to get all the replies for all the comments at once is:

SELECT *
FROM replies
WHERE commentid IN (
    SELECT commentid
    FROM comments
    WHERE postid IN (
        /* List your post ids here or nest another SELECT postid FROM posts WHERE ... */
    )
)

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 95053

The redundant data stems from a cross join of a post's comments and replies. I.e. for each post you join each comment with each reply. Comment 78 does neither relate to reply 14 nor to reply 15, but merely to the same post.

The typical approach to select the data would hence be three queries:

select * from posts;
select * from comments;
select * from replies;

You can also reduce this to two queries and join the posts table to the comments query, the replies query, or both. This again, will lead to selecting redundant data, but may ease data handling in your app.

If you want to avoid joins, but must avoid database round trips, you can glue query results together:

select *
from
(
  select postid as id, postid, 'post' as type, post_text as text from posts
    union all
  select commentid as id, postid, 'comment' as type, comment_text as text from comments
    union all
  select replyid as id, postid, 'reply' as type, reply_text as text from replies
) glued
order by postid, type, id;

At last you can create JSON in your DBMS. Again, don't cross join comments with replies, but join the aggregated comments object and the aggregated replies object to the post.

select p.postid, p.post_text, c.comments, r.replies
from posts p
left join
(
  select
    postid,
    jsonb_object_agg(jsonb_build_object('commentid', commentid, 
                                        'comment_text', comment_text)
                    ) as comments
  from comments
  group by postid
) c on c.postid = p.postid
left join
(
  select
    postid,
    jsonb_object_agg(jsonb_build_object('replyid', replyid, 
                                        'reply_text', reply_text)
                    ) as replies
  from replies
  group by postid
) r on r.postid = p.postid;

Upvotes: 1

Laurenz Albe
Laurenz Albe

Reputation: 247235

The “highly redundant” join result is normally the best way, because it is the natural thing in a relational database. Relational databases aim at avoiding redundancy in data storage, but not in query output. Avoiding that redundancy comes at an extra cost: you have to aggregate the data on the server side, and the client probably has to unpack the nested JSON data again.

Here is some sample code that demonstrates how you could aggregate the results:

SELECT postid, post_text,
       jsonb_agg(
          jsonb_build_object(
             'commentid', commentid,
             'comment_text', comment_text,
             'replies', replies
          )
       ) AS comments
FROM (SELECT postid, post_text, commentid, comment_text,
             jsonb_agg(
                jsonb_build_object(
                   'replyid', replyid,
                   'reply_text', reply_text
                )
             ) AS replies
      FROM /* your join */
      GROUP BY postid, post_text, commentid, comment_text) AS q
GROUP BY postid, post_text;

Upvotes: 1

Related Questions