Reputation: 369
I'm working through exporting some data from a local WordPress (bbPress) install and am experiencing duplicate data in my GROUP_CONCAT columns (which does not exist in the database).
Here's the query:
SELECT
a.`ID`, a.`post_date`, a.`post_content`, a.`post_title`,
a.`post_status`, a.`post_name`, a.`post_type`, a.`post_parent`,
GROUP_CONCAT(d.`meta_key` SEPARATOR '{|}') AS `post_meta_keys`,
GROUP_CONCAT(d.`meta_value` SEPARATOR '{|}') AS `post_meta_values`,
b.`user_login`, b.`user_pass`, b.`user_nicename`, b.`user_email`,
b.`user_registered`, b.`display_name`,
GROUP_CONCAT(c.`meta_key` SEPARATOR '{|}') AS `user_meta_keys`,
GROUP_CONCAT(c.`meta_value` SEPARATOR '{|}') AS `user_meta_values`
FROM
`wp_posts` a
INNER JOIN
`wp_users` b ON a.`post_author` = b.`ID`
INNER JOIN
`wp_usermeta` c ON a.`post_author` = c.`user_id`
INNER JOIN
`wp_postmeta` d ON a.`ID` = d.`post_id`
WHERE
`post_type` = 'forum' OR
`post_type` = 'topic' OR
`post_type` = 'reply'
GROUP BY
a.`ID`
I'm trying to get one big view that has the Post, its meta information, and its author's information in each row. All is good except the GROUP_CONCAT columns have multiple duplicates. For example: from the first row of the result set - the column user_meta_keys
has the values:
[0] => nickname
[1] => first_name
[2] => last_name
[3] => description
[4] => rich_editing
[5] => comment_shortcuts
[6] => admin_color
[7] => use_ssl
[8] => show_admin_bar_front
[9] => locale
[10] => wp_capabilities
[11] => wp_user_level
[12] => dismissed_wp_pointers
[13] => show_welcome_panel
[14] => session_tokens
[15] => wp_dashboard_quick_press_last_post_id
[16] => community-events-location
[17] => managenav-menuscolumnshidden
[18] => metaboxhidden_nav-menus
[19] => nav_menu_recently_edited
[20] => users_per_page
[21] => wp__bbp_topic_count
[22] => wp__bbp_reply_count
[23] => nickname
[24] => first_name
[25] => last_name
[26] => description
[27] => rich_editing
[28] => comment_shortcuts
[29] => admin_color
[30] => use_ssl
[31] => show_admin_bar_front
[32] => locale
[33] => wp_capabilities
[34] => wp_user_level
[35] => dismissed_wp_pointers
[36] => show_welcome_panel
[37] => session_tokens
[38] => wp_dashboard_quick_press_last_post_id
[39] => community-events-location
[40] => managenav-menuscolumnshidden
[41] => metaboxhidden_nav-menus
[42] => nav_menu_recently_edited
[43] => users_per_page
[44] => wp__bbp_topic_count
[45] => wp__bbp_reply_count
[46] => nickname
[47] => first_name
[48] => last_name
[49] => description
[50] => rich_editing
[51] => comment_shortcuts
[52] => admin_color
[53] => use_ssl
[54] => show_admin_bar_
Nickname is duplicated 3 times and many of the other fields are too. I've done some searching and it looks like others have experienced this too. Some solve it by using "DISTINCT" on the grouped column - but that won't work for me because some of the values are actually duplicated.
So my questions are:
1) Can someone explain to me why the GROUP_CONCAT column has so many duplicates and if the query is written incorrectly - I would at least figure the counts to match between the 'keys' columns and the 'values' columns - but they don't either.
2) Can the query be adjusted to accomplish what I'm looking for?
Thank you for your time!
-- EDIT #1 -- Well the one mistake is I forgot to set the GROUP_CONCAT limit much higher in MySQL - so now that I'm upping that I think I'm getting closer.
-- EDIT #2 -- It looks like the GROUPS are duplicated for as many rows as there are in each alternate table? meaning ... there are 8 rows in the postmeta table that match the post id so there are 8 duplicate groups in the usermeta results ... alternatively there are 23 rows in the usermeta table that match the user id so there are 23 duplicate groups in the postmeta results ... I'm still currently looking for a way to correct this.
-- EDIT #3 --
Here is an added MCVE:
-- Original Query: http://sqlfiddle.com/#!9/386b98/2
-- Updated Query: http://sqlfiddle.com/#!9/386b98/3
Upvotes: 1
Views: 988
Reputation: 369
The duplicate problem occurs because I was ignorant of the way JOINS are processed in SQL - (which I am still learning) - this stack overflow answer really helped me (by Martin Smith) sql joins as venn diagram
It looks like when it is joining the data across the multiple rows of the user_meta
table, each post_meta
table row is concatenated that many times and vice versa. So my current solution is to remove the INNER JOIN from those two tables and simply add them into the SELECT part of the statement - like this:
SELECT
a.`ID`, a.`post_date`, a.`post_content`, a.`post_title`,
a.`post_status`, a.`post_name`, a.`post_type`, a.`post_parent`,
(SELECT GROUP_CONCAT(d.`meta_key` SEPARATOR '{|}') FROM `wp_postmeta` d
WHERE d.`post_id` = a.`ID`) AS `post_meta_keys`,
(SELECT GROUP_CONCAT(d.`meta_value` SEPARATOR '{|}') FROM `wp_postmeta` d
WHERE d.`post_id` = a.`ID`) AS `post_meta_values`,
b.`user_login`, b.`user_pass`, b.`user_nicename`, b.`user_email`,
b.`user_registered`, b.`display_name`,
(SELECT GROUP_CONCAT(c.`meta_key` SEPARATOR '{|}') FROM `wp_usermeta` c
WHERE c.`user_id` = a.`post_author`) AS `user_meta_keys`,
(SELECT GROUP_CONCAT(c.`meta_value` SEPARATOR '{|}') FROM `wp_usermeta` c
WHERE c.`user_id` = a.`post_author`) AS `user_meta_values`
FROM
`wp_posts` a
INNER JOIN
`wp_users` b ON a.`post_author` = b.`ID`
WHERE
`post_type` = 'forum' OR
`post_type` = 'topic' OR
`post_type` = 'reply'
GROUP BY a.`ID`
I didn't know you could do that - this produces a much quicker / cleaner result.
-- Working Example: http://sqlfiddle.com/#!9/386b98/3
Upvotes: 2