Calculations within a spotfire column

Question

My firm decided on spotfire and then appointed me spotfire guru (cos I was there) and now I am figuring it out. We work with really large data sets (the one i'm asking about a couple of million rows). Anyway, the column is Z data and the difference between each sequential cell will show how far the machine it represents has moved. Eg:

Name    Time Stamp         X      Y     Lat     Long      Z         Delta Z
Name    28.3.2018 10:59 0,02438 0,02888 60,49   26,96   0,037794693 
Name    28.3.2018 10:59 0,02671 0,03768 60,49   26,96   0,046186649 0,00839
Name    28.3.2018 10:59 0,02409 0,0294  60,49   26,96   0,038009053 0,00818
Name    28.3.2018 11:00 0,02676 0,03768 60,49   26,96   0,046215582 0,00821
Name    28.3.2018 11:00 0,02393 0,02941 60,49   26,96   0,037915604 0,00830
Name    28.3.2018 11:00 0,02669 0,03761 60,49   26,96   0,046117981 0,00820
Name    28.3.2018 11:00 0,02341 0,02966 60,49   26,96   0,037785496 0,00833
Name    28.3.2018 11:00 0,02673 0,03758 60,49   26,96   0,046116692 0,00833
Name    28.3.2018 11:00 0,02329 0,0297  60,49   26,96   0,037742736 0,00837
Name    28.3.2018 11:00 0,02205 0,0306  60,49   26,96   0,037716873 0,00003

So, above is a dump of a few rows from the output that I have taken from excel. I run a python script on the source data (which is JSON) and it outputs the above except for the last two columns which i need to calculate. I can use spotfire to make the Z column (that's simple Pythagoras as X and Y are along and up from the reference point) but what I need is the change in Z (delta Z) through the day. In excel its easy as the formula is "=ABS(G3-G2)" and then paste it along the whole column it becomes "=ABS(G4-G3)", "=ABS(G5-G4)" and so on. I can't make it in excel as the file is too big.

The formula doesn't take the very first Z as a fixed, anchor point, it uses each one along. The data then lets me see how far the machine has moved in a certain period.

It is this that I can't solve in spotfire. All help appreciated.

niko · Accepted Answer

UPDATE: thanks for including the timestamp column and clarifications. I still needed to use the [Row] column I created because I wanted to make sure things go in the correct order, and the timestamp isn't granular enough to ensure that. if you have a timestamp with seconds or milliseconds in your actual dataset, I suggest to use that over a [Row].

that said, I don't see too big a difference from the original results, and I think my answer below still almost completely works. the biggest difference is that the blank row for [Delta Z] is at the top of the data set instead of the bottom. I've accounted for this by changing the expression to:

Abs([Z] - First([Z]) OVER Previous([Row]))

here's the resulting table. [Delta Z] is the results column you posted above and [DZ_1] is my new column:

DeltaZ    Z              Row  DZ_1
          0.037794693    1    
0.00839   0.046186649    2    0.008391956
0.00818   0.038009053    3    0.008177596
0.00821   0.046215582    4    0.008206529
0.0083    0.037915604    5    0.008299978
0.0082    0.046117981    6    0.008202377
0.00833   0.037785496    7    0.008332485
0.00833   0.046116692    8    0.008331196
0.00837   0.037742736    9    0.008373956
0.00003   0.037716873    10   2.5863000000001E-05

as an aside, you can adjust the number of decimals shown to whatever you like by going to Edit»Column Properties, selecting the column in question, choosing the Formatting tab, and finally setting the Decimals dropdown as desired.

first, welcome to StackOverflow. please in the future be prepared to provide a complete Minimally Complete, Verifiable Example. in terms of Spotfire, that means a sample dataset (in text) that I can copy-paste into Spotfire, including a column showing your expected results. you can create this in Excel or Notepad. please understand that I'm taking time out of my day to help you with your problem, and request that you are compelled to make it as simple as possible for me to do so.

second, welcome to Spotfire! I learned the same way as you. I strongly recommend asking your employer to pay for the TIBCO Spotfire online courses as they will provide a great base of understanding for using the tool.

with that out of the way, I've made the following assumptions about your dataset since you've not fully answered my questions about your dataset. if my assumptions are incorrect, please answer my questions about your dataset.

there is no column indicating some kind of order, such as a timestamp or row number
you do not expect any result in the final row of the dataset

to satisfy your requirements, first I needed to create a column that removes assumption #1 above. I've called this column [Row] and its expression is simply:

RowId()

this will output the literal row number for that row (as opposed to the BaseRowId() function, which shows the visual row number, after any marking and filters are applied).

I created this because in order to compare rows against one another, Spotfire requires some kind of indicator as to which row comes before the next one.

then I created a second column, [Delta Z] with the following expression:

Abs([Z] - First([Z]) OVER Next([Row]))

in other words, "for each row, take the current value of [Z] for that row and subtract it from the first value of [Z] found over all of the following rows (i.e., the next row)."

this produces the following:

Z          Row     Delta Z
0.24157    1       0.03424
0.27581    2       0.03195
0.24386    3       0.000149999999999983
0.24371    4

you can hide [Row] in any table visualization through the Properties dialog for that visualization, but you cannot delete it completely.

Calculations within a spotfire column

Answers (1)

Related Questions