pkExec
pkExec

Reputation: 2076

Create a view based on column metadata

Let's assume two tables: TableA holds various data measurements from a variety of stations. TableB holds metadata, about the columns used in TableA.

TableA has:

stationID int not null, pk
entryDate datetime not null, pk
waterTemp float null,
waterLevel float null ...etc

TableB has:

id int not null, pk, autoincrement
colname varchar(50),
unit varchar(50) ....etc

So for example, one line of data from tableA reads:

1 | 2013-01-01 00:00 | 2.4 | 3.5

two lines from tableB read:

1| waterTemp | celcius
2| waterLevel | meters

This is a simplified example. In truth, tableA might hold close to 20 different data columns, and table b has close to 10 metadata columns.

I am trying to design a view which will output the results like this:

StationID |      entryDate   | water temperature |  water level |
    1     | 2013-01-01 00:00 |     2.4 celcius   |   3.5 meters |

So two questions:

  1. Other than specifying subselects from TableB (..."where colname='XXX'") for each column, which seems horribly insufficient (not to mention...manual :P ), is there a way to get the result I mentioned earlier with automatic match on colname?
  2. I have a hunch that this might be bad design on the database. Is it so? If yes, what would be a more optimal design? (Bear in mind the complexity of the data structure I mentioned earlier)

Upvotes: 0

Views: 656

Answers (4)

Anup Shah
Anup Shah

Reputation: 1254

dynamic SQL with PIVOT is the answer. though it is dirty in terms of debugging or say for some new developer to understand the code but it will give you the result you expected.

check the below query.

in this we need to prepare two things dynamically. one is list columns in the result set and second is list of values will appear in PIVOT query. notice in the result i do not have NULL values for Column3, Column5 and Column6.

    SET NOCOUNT ON
    IF OBJECT_ID('TableA','u') IS NOT NULL
        DROP TABLE TableA
    GO
    CREATE TABLE TableA
    (
        stationID int not null IDENTITY (1,1)
        ,entryDate datetime not null
        ,waterTemp float null
        ,waterLevel float NULL
        ,Column3    INT NULL
        ,Column4    BIGINT NULL
        ,Column5    FLOAT NULL
        ,Column6    FLOAT NULL
    )
    GO

    IF OBJECT_ID('TableB','u') IS NOT NULL
        DROP TABLE TableB
    GO
    CREATE TABLE TableB
    (
        id int not null IDENTITY(1,1)
        ,colname varchar(50) NOT NULL
        ,unit varchar(50) NOT NULL
    )
    INSERT INTO TableA( entryDate ,waterTemp ,waterLevel,Column4)
    SELECT '2013-01-01',2.4,3.5,101
    INSERT INTO TableB( colname, unit )
    SELECT 'WaterTemp','celcius'
    UNION ALL SELECT 'waterLevel','meters'
    UNION ALL SELECT 'Column3','unit3'
    UNION ALL SELECT 'Column4','unit4'
    UNION ALL SELECT 'Column5','unit5'
    UNION ALL SELECT 'Column6','unit6'

    DECLARE @pvtInColumnList NVARCHAR(4000)=''
            ,@SelectColumnist NVARCHAR(4000)=''
            , @SQL nvarchar(MAX)=''


    ----getting the list of Columnnames will be used in PIVOT query list
    SELECT @pvtInColumnList = CASE WHEN @pvtInColumnList=N'' THEN N'' ELSE @pvtInColumnList + N',' END
                                + N'['+ colname + N']'
    FROM TableB
    --PRINT @pvtInColumnList


    ----lt and rt are table aliases  used in subsequent join.
    SELECT @SelectColumnist= CASE WHEN @SelectColumnist = N'' THEN N'' ELSE @SelectColumnist + N',' END
                            + N'CAST(lt.'+sc.name + N' AS Nvarchar(MAX)) + SPACE(2) + rt.' + sc.name + N' AS ' + sc.name
    FROM sys.objects so
    JOIN sys.columns sc
    ON so.object_id=sc.object_id AND so.name='TableA' AND so.type='u'
    JOIN TableB tbl
    ON tbl.colname=sc.name
    JOIN sys.types st
    ON st.system_type_id=sc.system_type_id
    ORDER BY sc.name

    IF @SelectColumnist <> '' SET @SelectColumnist = N','+@SelectColumnist
    --PRINT @SelectColumnist

    ----preparing the final SQL to be executed
    SELECT @SQL = N'
                    SELECT 
                    --this is a fixed column list
                    lt.stationID
                    ,lt.entryDate
                    '
                    --dynamic column list
                    + @SelectColumnist +N'
                    FROM TableA lt,
                    (
                        SELECT * FROM
                        (
                            SELECT colname,unit
                            FROM TableB
                        )p
                        PIVOT
                        ( MAX(p.unit) FOR p.colname IN ( '+ @pvtInColumnList +N' ) )q
                    )rt
                '
    PRINT @SQL
    EXECUTE sp_executesql @SQL

here is the result

enter image description here

ANSWER to your Second Question. the design above is not even giving performance nor flexibility. if user wants to add new Metadata (Column and Unit) that can not be done w/o changing table definition of TableA. if we are OK with writing Dynamic SQL to give user Flexibility we can redesign the TableA as below. there is nothing to change in TableB. I would convert it in to Key-value pair table. notice that StationID is not any more IDENTITY. instead for given StationID there will be N number of row where N is the number of column supplying the Values for that StationID. with this design, tomorrow if user adds new Column and Unit in TableB it will add just new Row in TableA. no table definition change required.

    SET NOCOUNT ON
    IF OBJECT_ID('TableA_New','u') IS NOT NULL
        DROP TABLE TableA_New
    GO
    CREATE TABLE TableA_New
    (
        rowID           INT NOT NULL IDENTITY (1,1)
        ,stationID      int not null
        ,entryDate      datetime not null
        ,ColumnID       INT
        ,Columnvalue    NVARCHAR(MAX)
    )
    GO

    IF OBJECT_ID('TableB_New','u') IS NOT NULL
        DROP TABLE TableB_New
    GO
    CREATE TABLE TableB_New
    (
        id int not null IDENTITY(1,1)
        ,colname varchar(50) NOT NULL
        ,unit varchar(50) NOT NULL
    )
    GO

    INSERT INTO TableB_New(colname,unit)
    SELECT 'WaterTemp','celcius'
    UNION ALL SELECT 'waterLevel','meters'
    UNION ALL SELECT 'Column3','unit3'
    UNION ALL SELECT 'Column4','unit4'
    UNION ALL SELECT 'Column5','unit5'
    UNION ALL SELECT 'Column6','unit6'

    INSERT INTO TableA_New (stationID,entrydate,ColumnID,Columnvalue)
            SELECT 1,'2013-01-01',1,2.4
    UNION ALL SELECT 1,'2013-01-01',2,3.5
    UNION ALL SELECT 1,'2013-01-01',4,101
    UNION ALL SELECT 2,'2012-01-01',1,3.6
    UNION ALL SELECT 2,'2012-01-01',2,9.9
    UNION ALL SELECT 2,'2012-01-01',4,104

    SELECT * FROM TableA_New
    SELECT * FROM TableB_New


    SELECT * 
    FROM
    (
        SELECT lt.stationID,lt.entryDate,rt.Colname,lt.Columnvalue + SPACE(3) + rt.Unit AS ColValue
        FROM TableA_New lt
        JOIN TableB_new rt
            ON lt.ColumnID=rt.ID
    )t1
    PIVOT
    (MAX(ColValue) FOR Colname IN ([WaterTemp],[waterLevel],[Column1],[Column2],[Column4],[Column5],[Column6]))pvt

see the result below.

enter image description here

Upvotes: 1

Andrew
Andrew

Reputation: 8758

I think you would have to flip it to a row per metric. Looking at your design above:

1 | 2013-01-01 00:00 | 2.4 | 3.5

How do I know what row in table b that applies to?

I would try something like this: Table B:

Metric_Key  |  Metric
1          |  WaterLevel in Meters
2          |  Temp in Celcius

...

Table A:

StationID   | entrydate        | Metric_Key   | Value
1            2013-01-01 00:00      1           2.4

Upvotes: 0

voglster
voglster

Reputation: 833

You might want to look into the MS-SQL function called PIVOT/UNPIVOT

http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

you can take column names and have them in rows or vice versa using this command.

Once you have the column name in the column itself you can join that column from tableA to tableB. Then unpivot to get your data back the way you want it. (caveat I may be swapping the use of pivot and unpivot :))

Word to the wise though, if you are working with large tables, pivot is not the fastest of operations.

Upvotes: 0

Daniel Hilgarth
Daniel Hilgarth

Reputation: 174417

I would design this database like the following:

A table MEASUREMENT_DATAPOINT that contains the measured data points. It would have the columns ID, measurement_id, value, unit, name.
One entry would be 1, 1, 2.4, 'celcius', 'water temperature'. A table MEASUREMENTS that contains the data of the measurement itself. Columns: ID, station_ID, entry_date.

Upvotes: 0

Related Questions