Reputation: 17
I have two tables, Table A contains an attribute (ID1) and Time (date/time format). Table B contains an attribute (ID2) and two time columns Time_start and Time_end (both date/time) format.
I need to create a column in Table A that fetches ID2 from Table B where ID1=ID2 and Time is between Time_start and Time_end.
I have tried the following but due to the quantity of data (5 million rows in TableA, 100 thousand in TableB) I run out of RAM:
Column = CALCULATE(FIRSTNONBLANK(TableB[ID2],1),FILTER(TableB,TableB[ID2]='TableA'[ID1]),FILTER(TableB,TableB[Time_start]<='TableA'[Time]),FILTER(TableB,TableB[Time_end]>='TableA'[Time]))
I have already applied a date filter in Power Query and removed columns I don't need. Is there a more efficient way of doing this in DAX or Power Query, unfortunately joining the tables in prior to date load in SQL is not an option.
Upvotes: 0
Views: 1838
Reputation: 2052
With the sizes you are dealing with, you should do this in Power Query, not DAX. Merge the 2 tables, then expand the tables to get Time_start and Time_end. Then add ID2 with a formula like:
try if [Time_start] <= [Time] and [Time] <= [Time_end] then [ID1] else null otherwise null)
The try
and otherwise
will handle nulls in Time_start and Time_end, which would otherwise cause an error.
Here are my sample queries:
// Table A
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTLUN9Q3MlIwtDIyVorViVZywiLmjEXMFV0sFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID1 = _t, Time = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID1", type text}, {"Time", type datetime}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID1"}, #"Table B", {"ID2"}, "Table B", JoinKind.LeftOuter),
#"Expanded Table B" = Table.ExpandTableColumn(#"Merged Queries", "Table B", {"Time_start", "Time_end"}, {"Time_start", "Time_end"}),
#"Added Custom" = Table.AddColumn(#"Expanded Table B", "ID2", each try if [Time_start] <= [Time] and [Time] <= [Time_end] then [ID1] else null otherwise null),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Time_start", "Time_end"})
in
#"Removed Columns"
// Table B
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTI00jc21DcyVDC0MjAA8cFcIwg3VidayQksiBCD84xgKpyBYkYoKozQVLhgmIGsPjYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID2 = _t, Time_start = _t, Time_end = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID2", type text}, {"Time_start", type datetime}, {"Time_end", type datetime}})
in
#"Changed Type"
Upvotes: 1