Fabian Domurad

Reputation: 273

Random elements inside JOIN

I have this query:

INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
SELECT cat.CatalogId, dep.Id, @department_type, false
FROM Directory.Catalog cat
    JOIN (SELECT * FROM (
        SELECT * FROM Taxonomy.Department LIMIT 10
    ) as dep_tmp ORDER BY RAND() LIMIT 3) AS dep
WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = @department_type) 
    AND cat.UrlStatus = @url_status_green 
    AND (cat.StatusId = @status_published 
        OR cat.StatusId = @status_review_required);

The problem: for each catalog, it should take the first 10 rows from Department, randomly choose 3 of them, and insert 3 rows into CatalogTaxonomy, each containing the catalog id and a taxonomy id. Instead, it randomly chooses 3 departments once and inserts those same 3 for every catalog.

The current result looks like this:

1   000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
1   001d4060-2924-4c75-b304-d780454f261b
1   001bc4b8-c1bc-498d-9aee-3825a40587d5
2   000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
2   001d4060-2924-4c75-b304-d780454f261b
2   001bc4b8-c1bc-498d-9aee-3825a40587d5
3   000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
3   001d4060-2924-4c75-b304-d780454f261b
3   001bc4b8-c1bc-498d-9aee-3825a40587d5

As you can see, only 3 departments are chosen, and they are repeated for every catalog.

Upvotes: 0

Views: 35

Answers (1)

forpas

Reputation: 164099

If you think that the query:

SELECT * FROM (
  SELECT * FROM Taxonomy.Department LIMIT 10
) as dep_tmp 
ORDER BY RAND() LIMIT 3

which you join to Directory.Catalog, returns 3 different departments for each catalog, that is not the case.
The subquery is executed only once and returns 3 random departments, and those same 3 are joined to every row of Directory.Catalog.
What you can do instead is CROSS JOIN the 10 departments to Directory.Catalog first and then randomly pick 3 of them per catalog with ROW_NUMBER() (CTEs and window functions require MySQL 8.0 or later).
Try this:

INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
WITH cte AS (
  SELECT cat.CatalogId, dep.Id AS TaxonomyId, @department_type AS TaxonomyTypeId, false AS IsApprovalRelevant
  FROM Directory.Catalog AS cat
  CROSS JOIN (SELECT * FROM Taxonomy.Department LIMIT 10) AS dep
  WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = @department_type)
    AND cat.UrlStatus = @url_status_green
    AND (cat.StatusId = @status_published OR cat.StatusId = @status_review_required)
)
SELECT t.CatalogId, t.TaxonomyId, t.TaxonomyTypeId, t.IsApprovalRelevant 
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY CatalogId ORDER BY RAND()) rn
  FROM cte
) t
WHERE t.rn <= 3
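
To confirm that each catalog received its own random selection, a check along the following lines might help. It is only a sketch: it reuses the table, columns, and @department_type variable from the question, and the DepartmentRows and DistinctDepartments aliases are made up for illustration. Catalogs that already had department rows before the insert were skipped by the NOT IN filter, so their counts may differ from 3.

```sql
-- Sanity check after the INSERT: count department rows per catalog.
-- Newly processed catalogs should show 3 rows with 3 distinct TaxonomyId values
-- (fewer only if Taxonomy.Department has fewer than 3 rows).
SELECT CatalogId,
       COUNT(*)                   AS DepartmentRows,       -- hypothetical alias
       COUNT(DISTINCT TaxonomyId) AS DistinctDepartments   -- hypothetical alias
FROM Directory.CatalogTaxonomy
WHERE TaxonomyTypeId = @department_type
GROUP BY CatalogId
ORDER BY CatalogId;
```

Comparing the TaxonomyId values across a few catalogs should then show different sets rather than the same 3 repeated.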

Note that this:

SELECT * FROM Taxonomy.Department LIMIT 10

does not guarantee that you get the first 10 elements from Department, because a table has no inherent order. Without an ORDER BY clause, the database may return any 10 rows.
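
If a specific set of 10 rows is needed, the subquery can be made deterministic by sorting it. A minimal sketch, assuming that "first" means the 10 lowest values of the Id column (the column the query already reads as dep.Id); any other column that defines the intended order would work the same way:

```sql
-- Assumption: "first 10" means the 10 smallest Id values.
-- Replace Id with whichever column defines the intended order.
SELECT *
FROM Taxonomy.Department
ORDER BY Id
LIMIT 10;
```

This version can be dropped in wherever the unordered LIMIT 10 subquery is used as the derived table.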

Upvotes: 1
