Pivoting data from different tables

Question

Here's my question:
I have a database structured like this:

|ID_Brand| Description
------------------------------
|1       | CoolBrandName
|2       | AlsoCoolBrandName 
|...     | ...

Linked to this table I created a File table like this

|ID_file | ID_BRAND | DESCRIPTION | FILECONTENT | ID_CATEGORY
-------------------------------------------------------------
|1       | 1        | File1       |0x0FF0A0B2..| 1 
|2       | 1        | File2       |0x2F1A000C..| 2
|3       | 2        | File3       |0X5FB43002..| 1
|4       | 1        | File4       |0x93EEFD13..| 1
|...     |...       | ...         |...         | ...

Eventually the Category table, linked to the above table with the ID_CATEGORY association.

|ID_CATEGORY | DESCRIPTION
--------------------------
|1           | Category1
|2           | Category2
|3           | Category3
|4           | Category4
|...         | ...

I need to show the data on these tables like a pivot table, grouped on the category. For example with the data above i need to output something like this:

|BrandName         | Category1 | Category2 | Category3 | Category4
------------------------------------------------------------------
|CoolBrandName     | File1     | File2     | *NULL*    | *NULL* 
|AlsoCoolBrandName | File3     | *NULL*    | *NULL*    | *NULL*
|CoolBrandName     | File4     | *NULL*    | *NULL*    | *NULL*

The category table has a fixed number of rows.

I tried starting with the examples on http://msdn.microsoft.com/en-us/library/ms177410.aspx but with no luck.

I need to achieve this via SQL (I'm using sql server 2008 r2) or via Linq either.

Can anyone help me going through this?

I appreciate any suggestion.

Thanks in advance V.

Andriy M · Accepted Answer

PIVOT queries are akin to GROUP BY queries except that grouping is implicit in the former. Data are grouped by all columns but one, and that one becomes the aggregated one. I elaborate more on this in my other answer.

In your case, there are already two obvious columns the result set should be grouped by. It's brand name and category name. But that is not enough, because some files, as per your example, can belong to the same brand/category group and you still want to feature every individual file in your output. So, apparently, there must be another criterion to group by.

To me, the most obvious choice for the third criterion would be a ranking position of some sort. And you can see from the resulting query below that I chose to rank the files based on their alphabetical order.

So, here's a solution that worked for me. First, my testing environment – DDL and sample data:

DECLARE @Brand TABLE (
  ID_Brand int IDENTITY,
  Description varchar(50)
);
DECLARE @Category TABLE (
  ID_Category int IDENTITY,
  Description varchar(50)
);
DECLARE @File TABLE (
  ID_File int IDENTITY,
  ID_Brand int,
  FileContent varbinary(max) DEFAULT (CAST(NEWID() AS varbinary)),
  Description varchar(50),
  ID_Category int
);

INSERT INTO @Brand (Description) VALUES
  ('CoolBrandName'),
  ('AlsoCoolBrandName');
INSERT INTO @Category (Description) VALUES
  ('Category1'),
  ('Category2'),
  ('Category3'),
  ('Category4');
INSERT INTO @File (ID_Brand, ID_Category, Description) VALUES
  (1, 1, 'File1'),
  (1, 2, 'File2'),
  (2, 1, 'File3'),
  (1, 1, 'File4');

And this is the query to obtain the required output:

WITH ranked AS (
  SELECT
    *,
    rnk = ROW_NUMBER() OVER (PARTITION BY ID_Brand, ID_Category ORDER BY Description)
  FROM @File
),
joined AS (
  SELECT
    BrandName    = b.Description,
    CategoryName = c.Description,
    FileName     = f.Description,
    FileRank     = f.rnk
  FROM ranked f
    INNER JOIN @Brand b    ON f.ID_Brand    = b.ID_Brand
    INNER JOIN @Category c ON f.ID_Category = c.ID_Category
)
SELECT
  BrandName,
  Category1,
  Category2,
  Category3,
  Category4
FROM joined
PIVOT (
  MAX(FileName) FOR CategoryName IN (Category1, Category2, Category3, Category4)
) p
ORDER BY Category1

And the output itself is below:

BrandName          Category1  Category2  Category3  Category4
-----------------  ---------  ---------  ---------  ---------
CoolBrandName      File1      File2      NULL       NULL
AlsoCoolBrandName  File3      NULL       NULL       NULL
CoolBrandName      File4      NULL       NULL       NULL

You can see that the third criterion, the ranking, is not present in the output. It still takes part in the grouping, because it is present in the row set we are applying the PIVOT clause on, the joined one.

And one final note. It is obligatory that the pivoted columns' values be aggregated in a PIVOT query. However, in your case, the data is not supposed to be aggregated, logically, because every single file should be displayed. In such cases one typically uses MAX() or some other aggregate function that is guaranteed not to distort the value of the column the function is being applied on. You only need to make sure every possible group contains no more than one value (which we did, by introducing the ranking column).

Pivoting data from different tables

Answers (1)

Related Questions