Reputation: 485
I have three tables: "users", "posts", and "likes" almost formatted as: For example the three table entries are:
users (two users): 1. uid: 12,
2. uid: 15.
, and
posts (three posts): 1. pid: 3, publisherId = 12, likers = 2,
2. pid: 6, publisherId = 12, likers = 0,
3. pid: 7, publisherId = 12, likers = 1.
, and
likes (three likes): 1. lid: 1, postId = 3, likerId = 12,
2. lid: 2, postId = 7, likerId = 15,
3. lid: 3, postId = 3, likerId = 15.
What I need is: To get all the posts in a multi dimensional array with an array for the unique publisher (user) and another array for the likers (users also). The output I am looking for is something like:
Array: (
post:(
pid = 3,
publisher = Array (uid = 12),
likers = Array (uid=12, uid=15)
),
post:( ....
)
).
I am already getting that with the following time consuming (I believe):
$sql = "SELECT dev_posts.* FROM posts";
if (!$result = mysql_query($sql)) die("Query failed.");
$response = array();
while($result_array = mysql_fetch_object($result)) {
$entries = array();
foreach($result_array as $key => $value) {
if ($key == "byUserId") {
$publisherID = $result_array->byUserId;
$anotherSql = "SELECT * FROM users WHERE users.uid = $publisherID";
if ($anotherResult = mysql_query($anotherSql)) {
$anothers = array();
while($anotherResult_array = mysql_fetch_object($anotherResult)) {
$another = array();
foreach($anotherResult_array as $anotherKey => $anotherValue) {
$another[$anotherKey] = $anotherValue;
}
$anothers[] = $another;
}
$entries[$key] = $anothers;
}
}
else if ($key == "likes") {
if ($value > 0){
$PID = $result_array->pid;
$anotherSql = "SELECT likes.*, users.* FROM likes LEFT JOIN users ON likes.likeUserId = users.uid WHERE $PID = likes.likePostId";
if ($anotherResult = mysql_query($anotherSql)) {
$anothers = array();
while($anotherResult_array = mysql_fetch_object($anotherResult)) {
$another = array();
foreach($anotherResult_array as $anotherKey => $anotherValue) {
$another[$anotherKey] = $anotherValue;
}
$anothers[] = $another;
}
$entries[$key] = $anothers;
}
}
else {
$entries[$key] = array();
}
}
else {
$entries[$key] = $value;
}
}
$posts[] = $entries;
}
Any suggestions are appreciated. I am still looking for join and left join solutions!
Upvotes: 1
Views: 89
Reputation: 8374
it really depends on what you're looking for:
SELECT user.*, post.*
FROM post
LEFT JOIN user ON (post.publisherid=user.id)
since it's only one publisher per post, this should give the user's data for each and every post.
SELECT post.*, GROUP_CONCAT(likes.likerid) as likerids
FROM post
LEFT JOIN likes ON (likes.postid=post.pid)
GROUP BY post.pid
this will give you rows:
["pid" => 3, "publisherid" => 12, "likerids" => "15,17,19"]
and all you have to do in php then is:
$likerids = explode(',', $row['likerids']);
of course, you can combine both queries into one. However, the second query only works well, if you only need the ids of likers. If you want the user data as well, it might be good, (depending on your actual use case), to collect the likerids first and fetch their user data later
SELECT *
FROM user
WHERE user.uid IN (15,17,19)
Also, you should REALLY REALLY REALLY use prepared statements to protect against sql injections. (this is not bold by accident! this is important) If you don't know what sql injections are, read it up. If anyone finds a query that's vulnerable to user provided input and sql injections, all your users' data can (and most likely will) leak into the darkness that is the internet.
Also, please use pdo or mysqli libraries for your database queries. the mysql library is deprecated and is gone in 7.[something] I believe.
There are a bunch of problems associated with fetching both sides of an m:n relation. I mean, essentially it's easy, just fetch it:
SELECT post.*, user.*
FROM post
LEFT JOIN likes ON (post.pid=likes.postid)
LEFT JOIN user ON (likes.likerid=user.uid)
ORDER BY post.pid
however, this will produce these rows:
pid1, publisherid1, userid1, username1
pid1, publisherid1, userid2, username2
...
pid2, publisherid2, userid1, username1
...
as you will notice, the post itself appears multiple times, once for each liker. This is a problem, which cannot be avoided by standard sql alone, because of the fundamentals of sql (being row-based).
This is essentially the data you want, but I suppose in a more aggregated form. This form also contains lots and lots of redundant data, especially assuming the post data is way bigger than the user data. To gather the data, you would have to check the pid for every row, if it's the same pid as in the row before, you somehow merge the records. ... But I would strongly advise against this approach.
I would also advise against using GROUP_CONCAT
for every single field of user, although it might work. The problem is, that GROUP_CONCAT
needs a delimiter, which YOU need to be different from any character in the username field (or any other field, you want to retrieve). This might or might not be a problem, but it's dirty nonetheless. In any case, you then would have to explode every of those aggregated fields in php, rebuild the users' data to build your wanted structure.
Another solution might be, to create a new field, that holds aggregated userdata as json or something, and with the intelligent use of GROUP_CONCAT
and CONCAT
one could create a hierarchical string for each row, that could be json itself. But this goes beyond this post. (Also I condone such use of databases that aren't made nor designed for this). There is however a JSON data type, that could be interesting ...
Ultimately, in those cases, you let the database server do the work that IMHO should be done by the client.
I would do this:
first we're going to fetch the posts we want, we also add a count for likes and the publisher's user data are included as well (if you add a WHERE
with data, that comes from outside the server like a browser, use prepared statements! also read up on SQL, if you don't understand all or parts of this query!) - I would assume, this is all the data you would show to a user at first. (With the power of caching, showing likers for distinct posts could be quite efficient.)
$pdo = new PDO('#yourdatabasestring#'); // rtfm!
$postresult = $pdo->query(
'SELECT p.*, '.
' pub.uid, pub.username, '.
' COUNT(likers.uid) as likecount '.
'FROM post p '.
'LEFT JOIN user as pub ON (pub.uid=post.publisherid) '.
'LEFT JOIN likes ON (post.pid=likes.postid) '.
'LEFT JOIN user as likers ON (likers.uid=likes.likerid) '
'GROUP BY p.pid '.
'LIMIT 50' // learn about offsets!!!
);
now, put all results into an array
$pids = []; // this will contain post ids for which we want to fetch likes
$posts = [];
while($post = $postresult->fetch()) {
$pids[] = $post['pid'];
$post['likers'] = []; // prepare for later
$posts[$post['pid']] = $post;
}
At this point, this array only contains the data, that was requested in the first query (post, user data of publisher). Next, we query for the likes, we use the temporarily stored post ids.*
$likers = $pdo->query(
'SELECT likes.postid, user.* '.
'FROM likes '.
'LEFT JOIN user ON (likes.likerid=user.uid) '.
'WHERE likes.postid IN ('.implode(',', $pids).')'
);
and fetching them and assigning them to the right post.
while($like = $likers->fetch()) {
$posts[$like['postid']]['likers'][] = $like;
}
now ... this solution should actually work for almost every sql database. GROUP_CONCAT
doesn't provide any benefit here. Two queries are actually quite alright here. If you have a very large set of posts that you want to fetch at once, this might absolutely not be the right approach. For fairly small data sets (some hundred posts or so), this should be very much okay.
*) the WHERE
clause could be replaced by WHERE postid IN ([first query with only poist.pid in select])
. For certain use cases, this could be preferable.
However, for the usual web case, I can't imagine anyone wanting to see more than 50 posts at once with already displayed userdata, likers' data and stuff. don't try to show everything at once. fetch what's necessary, try to cluster information (as I did with the $pids
) to reduce the number of queries. Doing a few well-designed and short-running queries in general beats doing many queries (as in your original code), but also is more appropriate than running one huge query, where most data will (on average) be irrelevant.
Upvotes: 1