Blaze

Reputation: 11

Beginner to Spotfire, How to utilise the Python Data Function?

Spotfire beginner here. I have been trying to use a Python Data Function to add a calculated column to my table. The Python code works when run from the IDE, but when run inside the Python Data Function it throws an error. I read that a column passed from Spotfire to Python is mapped to a pandas Series, but I'm not sure how to use this information to modify the code so that it runs inside Spotfire. I would appreciate some guidance or a pointer in the right direction.

Using Spotfire Analyst 10.10. Input: A Column from my table of datatype Integer. Output: Should be a Column of datatype String.

I tried to iterate over the values of the column with the code below.

Code:
    def affordability(value):
        if value>1000.00:
            print ("Price is more than 1000")
        elif 500.00<=value<=999.99:
            print ("Price is in Mid range")
        elif 0.00<=value<=499.99:
            print ("Price is affordable")       
    #End of the user defined funcion
    
    inter=[]
    doub_profit=profit*2          #profit is the input variable

    for i in doub_profit:
        situation=affordability(i)
        inter.append(situation)
    #End of for loop

    end_res=inter               #end_res is the output variable

Error Message:

Error executing Python script:

    spotfire.sbdf.SBDFError: cannot determine type for list; all values are missing
    
    Traceback (most recent call last):
      File "data_function.py", line 366, in _write_outputs
        output.write(self.globals, self.debug)
      File "data_function.py", line 149, in write
        sbdf.export_data(globals_dict[self.name], self.file, default_column_name=self.name)
      File "sbdf.py", line 163, in export_data
        columns, column_names, column_types, table_metadata, column_metadata = _export_columnize_data(obj,
      File "sbdf.py", line 247, in _export_columnize_data
        column_types = {default_column_name: _ValueTypeId.infer_from_type(list(obj), "list")}
      File "sbdf.py", line 986, in infer_from_type
        raise SBDFError("cannot determine type for %s; all values are missing" % value_description)
Standard Output: 
The Output I want is present here.

Debug log:
debug: start evaluate
debug: reading 1 input variables
debug: assigning column 'profit' from file tmpdir\dfpythondf_artifact_input_sgoua4gzlbjtmp.sbdf
debug: read 9426 rows 1 columns
debug: table metadata: 
 {}

debug: column metadata: 
 {'Profit': {}}

debug: done reading 1 input variables
debug: executing script
debug: --- script ---
def affordability(value):
    if value>1000.00:
        print ("Price is more than 1000")
    elif 500.00<=value<=999.99:
        print ("Price is in Mid range")
    elif 0.00<=value<=499.99:
        print ("Price is affordable")       
#End of the user defined funcion

inter=[]
doub_profit=profit*2
for i in doub_profit:
    situation=affordability(i)
    inter.append(situation)
#End of for loop

end_res=inter

debug: --- script ---
debug: analytic_type is 'script'
debug: done executing script
debug: writing 1 output variables
debug: returning 'end_res' as file tmpdir\dfpythondf_artifact_output_d5cbcjvjbsktmp.sbdf
debug: done writing 1 output variables
    
       at Spotfire.Dxp.Data.DataFunctions.Executors.LocalPythonFunctionClient.<RunFunction>d__8.MoveNext()
       at Spotfire.Dxp.Data.DataFunctions.Executors.PythonScriptExecutor.<ExecuteFunction>d__11.MoveNext()
       at Spotfire.Dxp.Data.DataFunctions.DataFunctionExecutorService.<ExecuteFunction>d__8.MoveNext()

Note: This is my first post on Stack Overflow. Any pointers to reference material with code samples on this topic (Spotfire Data Functions) would be really appreciated.

Upvotes: 1

Views: 5887

Answers (1)

Colin

Reputation: 45

This error happens when the value, list, or pandas data frame being returned contains no values (every entry is missing). That means Spotfire can't determine the type of the data being returned (integer, string, etc.) in order to create the necessary column or table as an output. The two things I would try are:

  1. Check end_res values by adding a print(end_res) to the end of your script. This will then appear in the error message, so you can see if it is populated or not. Adding print statements is a quick way to do some checks if you are getting errors.

  2. I normally would convert all lists to a pandas data frame before returning. I don't know if it being a list is part of the problem here but I would try something like:

    import pandas as pd
    end_res = pd.DataFrame(inter, columns=['Affordability'])
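
Both checks point at the same thing in the script from the question: `affordability` prints its message but never returns it, so every call yields `None` and `inter` ends up as a list of missing values. That is also why the expected text shows up under standard output while the output column has no type. A minimal sketch of one possible fix follows. The sample `profit` Series stands in for the Spotfire input, and the `"Out of range"` fallback label is an assumption added so that values outside the original ranges (negatives, or exactly 1000) do not come back as missing.

```python
import pandas as pd


def affordability(value):
    """Return the label instead of printing it, so the caller gets a string."""
    if value > 1000.00:
        return "Price is more than 1000"
    elif 500.00 <= value <= 999.99:
        return "Price is in Mid range"
    elif 0.00 <= value <= 499.99:
        return "Price is affordable"
    # Assumption: fallback label so no row is left as None.
    return "Out of range"


# Stand-in for the Spotfire input column. Inside the data function, drop
# this line: `profit` is already supplied as a pandas Series.
profit = pd.Series([100, 300, 600, -5])

doub_profit = profit * 2
inter = [affordability(v) for v in doub_profit]

# Output variable: a one-column data frame of strings.
end_res = pd.DataFrame(inter, columns=["Affordability"])
```

With every element now a string, the type of the output column can be inferred. Whether the output parameter is registered as a column or a table in the data function definition is left as in your own setup.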

For your example above, you may actually just be able to use a CASE statement in Spotfire without needing to call Python. If you add a calculated column to your data table via the Data menu and then use a CASE expression, you can easily replicate the behaviour of your function above. Here is a good quick guide on calculated columns:

https://www.youtube.com/watch?v=0GpB1F4XEFI
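
If the logic stays in Python rather than moving to a calculated column, the same first-match-wins branching as a CASE expression can be sketched without an explicit loop using `numpy.select`. This is only an illustration: the sample `profit` Series stands in for the Spotfire input, and the `"Out of range"` default is an assumed label for values that match none of the conditions.

```python
import numpy as np
import pandas as pd

# Stand-in for the Spotfire input column (assumption for illustration).
profit = pd.Series([100, 300, 600, -5])
doub_profit = profit * 2

# Conditions are checked in order, like the WHEN branches of a CASE.
conditions = [
    doub_profit > 1000,
    doub_profit.between(500, 999.99),  # inclusive on both ends
    doub_profit.between(0, 499.99),
]
labels = [
    "Price is more than 1000",
    "Price is in Mid range",
    "Price is affordable",
]

# Rows matching no condition get the assumed default label.
end_res = pd.DataFrame(
    {"Affordability": np.select(conditions, labels, default="Out of range")}
)
```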

There are also quite a few good Dr Spotfire sessions on Python on YouTube, such as https://www.youtube.com/watch?v=MXhO3yG6dUg and https://www.youtube.com/watch?v=EDOIZACQmhw, which may help.

I also wrote some examples on doing tasks such as correlation analysis and variable importance here: https://community.tibco.com/wiki/performing-correlation-analysis-using-python-data-functions-tibco-spotfire

Worth following Dr Spotfire on YouTube. There is also the Spotfire Enablement Hub here: https://community.tibco.com/wiki/spotfire-enablement-hub

Upvotes: 2
