Lazy
Lazy

Reputation: 1837

MySQL select rows based on two tables

I'm trying to make a SELECT query based on two other table data. _sites_ array has 5 objects, _cats_ array has 12 objects for now. So, in order to get news based on these objects, I have to make 12 * 5 sql query repeatedly. (For each category & site)

How can I reduce query count?

getSites(function (_sites_) {
    getCategories(function (_cats_) {

        var array = []

        async.each(_sites_, function (site, next) {
            async.each(_cats_, function (cat, next2) {

                var sql = "SELECT * "+
                          "FROM news "+
                          "WHERE category = ? AND site = ? "+
                          "ORDER BY id DESC LIMIT 30"

                db.query(sql, [cat.data, site.data], function (result) {
                    array = array.concat(temp)
                    next2()
                })

            }, function () {
                next()
            })

        }, function () {
            // finished
            console.log(array)
        })

    })
})  


function getCats (pass) { //get used categories
    db.query("SELECT * FROM cat WHERE cat_name in (SELECT news_url.rss_cat FROM news_url)"+
             "ORDER by cat_pos ASC", [], function (result) {
                var array = []

                if(result) {
                    for (var j = 0; j < result.length; j++) {
                        array.push({type: 1, data:result[j].cat_name})
                    };
                }

                pass(array)
    })
}


function getSites (pass) {
    db.query("SELECT * FROM news_url", [], function (result) {

        var array = []

        if(result) {
            for (var j = 0; j < result.length; j++) {
                array.push({type: 0, data:result[j].rss_site})
            };
        }

        pass(array)
    })
}

cats table: (categories)

|id|cat_name|cat_pos|
|1 |Fun     | 1     |
|1 |Bomm    | 2     |

news_url table: (rss site names and urls)

|id|rss_site|rss_url   |rss_cat
|1 |Cnn     |http://...|Fun 
|2 |Fox     |http://...|Bomm
|3 |Cnn     |http://...|Bomm
|4 |Routers |http://...|Fun

news table:

|id|news_site | news_cat | news_content
|1 | Cnn      | Bomm     | Some random news content from Cnn
|2 | Cnn      | Fun      | Some random news content from Cnn
|3 | Fox      | Fun      | Some random news content from Fox

Expected result is a separate N-row limit for each combination of category and site. If it's possible using a few queries.

Upvotes: 1

Views: 94

Answers (3)

John Bollinger
John Bollinger

Reputation: 180171

If you don't want to query separately for each category for each site, then why are you doing that? If you just omit the WHERE clause from the query you are using now, then in one go it will give you all the news for all categories and sites, subject to the 30-row LIMIT you have placed.

It will be trickier if you want a separate 30-row limit for each combination of category and site, as your current approach provides, but I'm inclined to suspect that the LIMIT isn't doing quite what you want it to do now, or at least that it would not be doing so if you had more data.

Edited to add:

If, as you indicate in comments, you really do want a 30-row limit for each combination of category and site then, for mysql, I see only two alternatives:

  • form one giant query by connecting all the individual queries you now perform via the UNION ALL operator. Submit only that query. This is likely your best bet, but you could also
  • use a simple query with no WHERE clause, as I first suggested, and apply the row limits on the node.js side. This approach might be facilitated by ordering additionally by site and category. Although this results in a simpler query, it may involve many rows being transferred and then discarded.

Upvotes: 1

AdamMc331
AdamMc331

Reputation: 16691

It sounds to me like you are looking for a Cartesian Product. You get this when you select from two different tables without defining a relationship between them. So, for a simple example:

SELECT rss_site, cat_name
FROM categories, news_url;

Will return every possible (rss_site, cat_name) combination. I would note, however, that since you have a few repeated rss_site values (cnn in your sample data) you may want to add DISTINCT in your select clause to only get distinct pairs:

SELECT DISTINCT rss_site, cat_name
FROM categories, news_url;

Here is an SQL Fiddle example.

Upvotes: 1

citywall
citywall

Reputation: 235

Your SQL should probably be something like this:

var sql = "SELECT * "+ "FROM news "+ "WHERE category in (?) AND site in (?) "+ "ORDER BY id DESC LIMIT 30"

given that you have two arrays and not just two values.

Upvotes: 0

Related Questions