Reputation: 177
I'm trying to add a calculated column here for certain groups of values as they are identified as duplicate values, but it is more trickier than the unique value identifier.
So my data table looks like this: AS-IS
Material MvtType Date Order No. Quantity RowID
10000001 101 01/01/19 123452 1.0 1
10000001 101 03/01/19 123456 5.0 5
10000001 102 04/01/19 123456 -5.0 7
10000001 101 07/01/19 123456 7.0 8
10000001 101 01/02/19 123453 10.0 9
10000002 101 10/01/19 222221 100.0 44
10000002 101 11/01/19 222221 150.0 43
10000002 102 11/01/19 222221 -100.0 41
10000002 102 13/01/19 222221 -150.0 42
10000002 101 03/02/19 222221 200.0 50
10000002 101 06/02/19 222221 300.0 51
There are two types of Movement type: 101=receiving a Material to stock and 102= (returning due to error). Then usually very soon a material is 101 = received to stock with corrected values (usually quantity or correct order number). Therefore, previous two items are duplicates and have to be ignored and I'm trying to add a calculated column that will show the rows that are "duplicated" and needs to be deleted.
I've tried to add column that only shows all mvt type = 102 (smth like: If([MvtType]=102, "delete", "ok")
), since anyway they should be ignored in the end, but struggling to write an expression that will include the last row that has the same Material code, Date that is equal or earlier than 102 row, same Order no. and same quantity (different sign [+]), but with different movement type, i.e. 101 movement type.
TO-BE:
Material MvtType Date Order No. Quantity RowID Calculated Column
10000001 101 01/01/19 123452 1.0 1 OK
10000001 101 03/01/19 123456 5.0 5 delete
10000001 102 04/01/19 123456 -5.0 7 delete
10000001 101 07/01/19 123456 7.0 8 OK
10000001 101 01/02/19 123453 10.0 9 OK
10000002 101 10/01/19 222221 100.0 44 delete
10000002 101 11/01/19 222221 150.0 43 delete
10000002 102 11/01/19 222221 -100.0 41 delete
10000002 102 13/01/19 222221 -150.0 42 delete
10000002 101 03/02/19 222221 200.0 50 OK
10000002 101 06/02/19 222221 300.0 51 OK
P.S.: RowID is the order of rows I got from source data and it can be altered I believe if custom sorting is applied as the data is in Excel. The order of data in example data is changed based on sorting order by Material→Date→Quantity in Spotfire, hence the different row order in table than source data.
Any ideas how to add calculated column?
Upvotes: 2
Views: 1167
Reputation: 1
If you go into your data canvas and add transformation "filter data"
If(Rank(Baserowid(),"asc",[myGroup1])=1,TRUE, FALSE)
it will filter all your duplicates out
Upvotes: 0
Reputation: 114
You can do this with a data function. This blog post -- https://datashoptalk.com/data-function-properties-part5/ -- explains the basics of TERR data functions. To do so, it uses a data function that identifies duplicate records. Scroll down to where it says...
"With that said, here is a very simple script that I’ll reference throughout the post. This single line of R code that will check for duplicates in a data set using two identifiers that define the granularity of the table. The output is a column called duplicate. duplicate <- duplicated(data.frame(key1, key2))"
So, it's one line of code that identifies duplicate records based on two columns of data. You could easily modify for as many as you needed.
Upvotes: 0
Reputation: 758
You can identify duplicate rows with an expression and use as many or as little columns to check. Here is a great Article
If(Rank(Baserowid(),"asc",[myGroup1])=1,"First row of group","Duplicate")
If(Rank(Baserowid(),"asc",[myGroup1],[myGroup2],[myGroup3])=1,"First row of group","Duplicate")
Upvotes: 0
Reputation: 1452
not sure this logic is watertight, but at the price of first calculating a column called AbsQuantity as Abs([Quantity]) then you could define your flag as
If(Count([RowID]) OVER (Intersect([Material],[Order No.],[AbsQuantity]))=2,"delete","OK")
assuming that if there are two rows with the same Material, Order No. and absolute value of Quantity, they are meant to cancel each other. The reason for calculating the AbsQuantity column is that OVER does not support expressions.
Upvotes: 1