Reputation: 1259
I have a database with two tables:
adjectives - ID primary key; NAME varchar; adjectives_reviews - ID primary key; ADJECTIVE_ID foreign key; REVIEW_ID foreign key; COUNT int
I want to get a list of distinct ADJECTIVE_ID from adjectives_reviews with a sum of the values in the 'COUNT' column across all rows with the same ADJECTIVE_ID.
The adjectives table has 46,000 rows and adjectives_reviews has ~2,849,708 .
I am using the following HTSQL query:
/adjectives{name,sum(adjectives_reviews.count)}
which gets translated to SQL:
SELECT `adjectives`.`name`,
COALESCE(`adjectives_reviews`.`sum`, 0)
FROM `adjectives`
LEFT OUTER JOIN (SELECT CAST(SUM(`adjectives_reviews`.`count`) AS SIGNED INTEGER) AS `sum`,
`adjectives_reviews`.`adjective_id`
FROM `adjectives_reviews`
GROUP BY 2) AS `adjectives_reviews`
ON (`adjectives`.`id` = `adjectives_reviews`.`adjective_id`)
ORDER BY `adjectives`.`id` ASC
This query produces the result I want but it is too slow. It is so slow in fact that the HTTP server times out and I cannot get the output CSV file which I need. I am looking for a way to optimize this query or alternatively a way to get the output as CSV without using the HTTP interface of HTSQL.
Upvotes: 0
Views: 180
Reputation: 1259
I was able to get the result out as CSV by using the native mysql client and adding 3 lines to the query:
SELECT `adjectives`.`name`,
COALESCE(`adjectives_reviews`.`sum`, 0)
FROM `adjectives`
LEFT OUTER JOIN (SELECT CAST(SUM(`adjectives_reviews`.`count`) AS SIGNED INTEGER) AS `sum`,
`adjectives_reviews`.`adjective_id`
FROM `adjectives_reviews`
GROUP BY 2) AS `adjectives_reviews`
ON (`adjectives`.`id` = `adjectives_reviews`.`adjective_id`)
ORDER BY `adjectives`.`id` ASC
INTO OUTFILE '/tmp/adjectives.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
It ran reasonably fast.
Upvotes: 0