codingknob
codingknob

Reputation: 11670

Complications building result using group by given data set

Please see SQL Fiddle here for usable example: http://sqlfiddle.com/#!3/6d04f/1

I have 4 tables, TableA, TableB, TableC and TableD.

TableA data and structure:

+--------------------+
| ID | Name          |
+--------------------+
| 987 | Mississauga  |
| 454 | Hoboken      |
| 343 | Berkeley     |
+--------------------+

TableB data and structure:

+------------------------------------------------------------+
| City1ID | City1Name     | City2ID | City2Name     | Factor |
+------------------------------------------------------------+
| 343     | Berkeley      | 19      | Oakland       | 0.5    |
| 987     | Mississauga   | 23      | Toronto       | 1.0    |
| 66      | Redmond       | 13      | Seattle       | 1.0    |
| 343     | Berkeley      | 14      | San Jose      | 0.5    |
| 454     | Hoboken       | 55      | New York City | 0.9    |
| 454     | Hoboken       | 44      | Philadelphia  | 0.1    
| 19      | Oakland       | 9       | San Francisco | 1.0    |
...
+------------------------------------------------------------+

TableC: is a table that has hourly data for cities for each day of the year. Not all cities are in TableC. TableA represents a small subset of the cities for which a mapping exists in TableB. In my case, TableA has 439 cities (rows), and TableB has 7000 rows of mappings.

The structure of TableC is as follows:

+------------------------------------------------------------------------+
|Date | CID | Name | Blah1 | Blah2 | Hour1 | Hour2 | Hour3 | ... | Hour24 |
+------------------------------------------------------------------------+

Here is an example of one day's worth of data for TableC:

2006-08-01 00:00:00 9 San Francisco Blah1 Blah2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2006-08-01 00:00:00 23 Toronto Blah1 Blah2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2006-08-01 00:00:00 13 Seattle Blah1 Blah2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
2006-08-01 00:00:00 14 San Jose Blah1 Blah2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
2006-08-01 00:00:00 55 New York City Blah1 Blah2 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
2006-08-01 00:00:00 44 Philadelphia Blah1 Blah2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

For illustration, the city of Oakland is NOT present in TableC. However, hourly observation data for Berkeley is derived by summing ( 0.5 x data for San Jose AND 0.5 x data for Oakland).

TableD structure:

+----------------------------------+
| Date | ID | Name |  Hour | Value |
+----------------------------------+

TableD is the end result, which should be populated with hourly observation data for each day for each of the cities in TableA.

In my example HOUR 1 should look like the following, with Mississauga = 2.0 (1 x Toronto = 1 x 2 = 2), Hoboken = 5.10 (0.9 x New York City + 0.1 x Philadelphia = 0.9 x 5 + 0.1 x 6 = 5.10), and Berkeley = 2.50 (0.5 x San Jose + 0.5 x Oakland = 0.5 x 4 + 0.5 x San Francisco = 0.5 x 4 + 0.5 x 1 = 2.50):

+--------------------------------------------------------+
| Date                | ID  |     Name    | Hour | Value |
+--------------------------------------------------------+
| 2006-08-01 00:00:00 | 987 | Mississauga |  1   | 2.00  | 
| 2006-08-01 00:00:00 | 454 | Hoboken     |  1   | 5.10  |
| 2006-08-01 00:00:00 | 343 | Berkeley    |  1   | 2.50  | 
+--------------------------------------------------------+

Now, the tricky part is Oakland is not present in TableC and instead it is mapped to San Francisco so any city that is mapped to Oakland needs to refer to the city that it is mapped to. In my situation, there are many situations where this occurs so I want to make sure I handle this properly.

I got as far as unpacking/normalizing TableC and joining it with Table B but not sure how to proceed. I know I need to use Group By but not sure how to do so with the caveat that I have described.

select * from
(
    select Date, CID, Name, replace(Details, 'Hour', '') as Hour, ObservationValue
    from TableC
    as Result
    unpivot
    (
      ObservationValue for Details in ([Hour1], [Hour2], [Hour3], [Hour4], [Hour5], [Hour6],
      [Hour7],[Hour8],[Hour9],[Hour10],[Hour11],[Hour12],[Hour13],[Hour14],
      [Hour15],[Hour16],[Hour17],[Hour18],[Hour19],[Hour20],[Hour21],[Hour22],
      [Hour23],[Hour24])
)
as UnPvt
) as resultX
join TableB as b on (b.City2ID = resultX.CID)

If you can't access the SQL Fiddle link below is the code to create the Schema:

CREATE TABLE TableA
    ([Id] int, [name] varchar(20));

INSERT INTO TableA
    ([Id], [name])
VALUES
    (987, 'Mississauga'),
    (454, 'Hoboken'),
    (343, 'Berkeley');

CREATE TABLE TableB
    ([City1ID] int, [City1Name] varchar(20), 
     [City2ID] int, [City2Name] varchar(20), [Factor] varchar(20))

INSERT INTO TableB
     ([City1ID], [City1Name],[City2ID], [City2Name], [Factor])
VALUES
     (343, 'Berkeley', 19, 'Oakland', 0.5),
    (987, 'Mississauga', 23, 'Toronto', 1.0),
    (66, 'Redmond', 13, 'Seattle', 1.0),
    (343, 'Berkeley', 14, 'San Jose', 0.5),
    (454, 'Hoboken', 55, 'New York City', 0.9),
    (454, 'Hoboken', 44, 'Philadephia', 0.1),
    (19, 'Oakland', 9, 'San Francisco', 1.0);

CREATE TABLE TableC
    ([date] datetime, [CId] int, [name] varchar(50), [blah1] varchar(10), 
     [blah2] varchar(10), [hour1] int, [hour2] int, [hour3] int, [hour4] int,
    [hour5] int, [hour6] int, [hour7] int, [hour8] int, [hour9] int,
    [hour10] int,[hour11] int, [hour12] int, [hour13] int, [hour14] int,
    [hour15] int, [hour16] int, [hour17] int, [hour18] int, [hour19] int,
    [hour20] int, [hour21] int, [hour22] int, [hour23] int, [hour24] int);

INSERT INTO TableC
    ([date], [CId], [name], [blah1], 
     [blah2], [hour1], [hour2], [hour3], [hour4],
    [hour5], [hour6], [hour7], [hour8], [hour9],
    [hour10],[hour11], [hour12], [hour13], [hour14],
    [hour15], [hour16], [hour17], [hour18], [hour19],
    [hour20], [hour21], [hour22], [hour23], [hour24])

VALUES
('2006-08-01 00:00:00', 9, 'San Francisco', 'Blah1', 'Blah2', 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
('2006-08-01 00:00:00', 23,'Toronto', 'Blah1', 'Blah2', 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
('2006-08-01 00:00:00', 13,'Seattle', 'Blah1', 'Blah2', 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
('2006-08-01 00:00:00', 14,'San Jose', 'Blah1', 'Blah2', 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4),
('2006-08-01 00:00:00', 55,'New York City', 'Blah1', 'Blah2', 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
('2006-08-01 00:00:00', 44,'Philadelphia', 'Blah1', 'Blah2', 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6);

Upvotes: 0

Views: 144

Answers (1)

Sankara
Sankara

Reputation: 1479

this will do with your data and structure

   select       date,cityid, cityname,hr, sum(hour1) as total
from  
            (
                select  c.Date as date,
                        isnull(a.id,parent.city1id) as cityid,
                        isnull(a.name,parent.city1name) as cityname,
                        isnull(parent.factor,1) *            --- parent factor
                        b.factor *  --- actual factor
                        isnull(c.hour1,0) as hour1 ,
                        1 as hr

                from    tablea a 
                right outer join tableb b on b.city1id = a.[id]
                left  outer join tableb parent on parent.city2id=b.city1id
                left  outer join tablec c on c.cid = isnull(b.city2id,b.city1id)
            )   final 
where       date is not null and cityid is not null
group by    cityid,cityname,date,hr

Upvotes: 1

Related Questions