Reputation: 25
An update: I was trying to do this with Power Query import into Power bi. I didn't realize how easy it would have been to do with DAX. Thanks for the answers that were specific to Power Query Like I originally specified.
I am trying to add a column to an import in power query to import to power bi that shows the next time an order happens, to let me see if a re-order was made / known at that point in time.
Sample data
Orders Table
Item | OrderDate |
---|---|
Red | 1/1/2024 |
Blue | 1/1/2024 |
Yellow | 1/1/2024 |
Red | 2/1/2024 |
Yellow | 2/1/2024 |
Red | 3/1/2024 |
Yellow | 3/1/2024 |
The result table would show the next date an order happens, for each order, with the join being on the "Item" and the date being the first record greater than the current row date.
Result
Result | ||
---|---|---|
Item | OrderDate | Next Order Date |
Red | 1/1/2024 | 2/1/2024 |
Blue | 1/1/2024 | null |
Yellow | 1/1/2024 | 2/1/2024 |
Red | 2/1/2024 | 3/1/2024 |
Yellow | 2/1/2024 | 3/1/2024 |
Red | 3/1/2024 | null |
Yellow | 3/1/2024 | null |
I have accomplished this via merged queries using the transform button and then conditional columns to narrow down the result (creating a table with a full outer join), but given the size of the dataset this join creates far too many rows in the import that are then just removed from the dataset.
I would like to use a column formula for selectrows, which I think would involve table.selectrows (table.group), but also needs to join on the item and the next order date being greater than the order data in each row.
I see lots of information on selectrows, or group by, but not on an efficent way to pass this back to the mysql database.
if I were writing the mysql statement it would be
select
a.item
,a.orderdate
,min(nextorder.orderdate) as NextOrderDate
from orders a
left join orders as NextOrder on
a.item=NextOrder.item
and
NextOrder.orderdate>a.orderdate
group by
a.item
,a.orderdate`
Upvotes: 0
Views: 82
Reputation: 101
DAX measure is usually an elegant way to address such questions,
Next =
SELECTCOLUMNS(
OFFSET( 1, ORDERBY( Orders[OrderDate] ), PARTITIONBY( Orders[Item] ) ),
Orders[OrderDate]
)
Upvotes: 0
Reputation: 25
Thank you horseyride. i was able to make your solution work.
As i continued googling, I also came across the "earlier" DAX function, which allows me to add a column easily. It is not done in the import like I asked, but does what i wanted by keeping the amount of data imported to the original single table set.
EARLIER(, )
the formula I used is this:
NextOrderDate =
CALCULATE(
MIN(Orders[OrderDate]),
FILTER(
Orders,
Orders[Item] = EARLIER(Orders[Item]) &&
Orders[OrderDate] > EARLIER(Orders[OrderDate])
)
)
Upvotes: 0
Reputation: 60174
You may find the following PQ code executes more rapidly:
let
//Replace next line with your actual data source
Source = Table,
next = Table.AddColumn(Source, "Next Order Date",
(r)=>Table.SelectRows(Source, each
[Item]=r[Item]
and [OrderDate]>r[OrderDate])[OrderDate]{0}?,
type nullable date)
in
next
Upvotes: 0
Reputation: 21318
In powerquery you could try
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type), //to preserve sort order; not necessary
#"Grouped Rows" = Table.Group(#"Added Index", {"Item"}, {{"data", each
let zz=_,
shiftedList = List.RemoveFirstN(Table.Column(zz,"OrderDate"),1)
in Table.FromColumns(Table.ToColumns(zz) & {shiftedList},Table.ColumnNames(zz) & {"Previous Date"}) , type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"OrderDate", "Index", "Previous Date"}, {"OrderDate", "Index", "Previous Date"}),
#"Sorted Rows" = Table.Sort(#"Expanded data",{{"Index", Order.Ascending}}), //to preserve sort order; not necessary
#"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Index"}) //to preserve sort order; not necessary
in #"Removed Columns" //to preserve sort order; not necessary
Upvotes: 0