user823
user823

Reputation: 265

power query (powerbi) data structure: creating a crosssection-temporal data structure (panel data)

I am working in powerbi and I have a problem in the power query editor. My source table contains cross-sectional data with a primary key.

primary_key  data_feature_A
    1             A
    2             B
    3             A
    4             C

I would like to efficiently transform my data into the following structure:

primary_key_1   time_year  data_feature_A
    1             2019         A
    1             2020         A
    1             2021         A
    1             2022         A
    2             2019         B
    2             2020         B
    2             2021         B
    2             2022         B
    3             2019         A
    3             2020         A
    3             2021         A
    3             2022         A
    4             2019         C
    4             2020         C
    4             2021         C
    4             2022         C

I would like to run such a procedure on a very big data set (original data). Each row at the end of the procedure should have as its identifiers both a cross-sectional identifier and a time marker (year).

Upvotes: 1

Views: 83

Answers (2)

Marcus
Marcus

Reputation: 4015

Here is a PowerQuery that expands all the rows with years from 2019 to current year + 1:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjICspzALGMgyxnMMgGyXJRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [primary_key = _t, data_feature_a = _t]),
    #"Added Custom" = Table.AddColumn(Source, "time_year", each List.Numbers(2019, Date.Year(DateTime.LocalNow() ) + 2 - 2019)),
    #"Expanded time_year" = Table.ExpandListColumn(#"Added Custom", "time_year")
in
    #"Expanded time_year"

Upvotes: 2

user823
user823

Reputation: 265

I appended my table multiple times to itself through in power query using the SQL command:

SELECT * FROM table
UNION ALL
SELECT * FROM table
UNION ALL
SELECT * FROM table
UNION ALL
SELECT * FROM table

In the next step, I added an index column (starting with 0). Then in a final step, I added a custom column with the formula

2019+Number.Round([Index]/(List.Max(#"table"[Index])+1)/4,0)

This yields the time_year column.

Upvotes: 0

Related Questions