Simon O'Doherty

Reputation: 9359

Is it possible to create the data object in Python for SPSS?

I have a Python script that reads an XML file into an array (in a CSV format I created). I'd like to use that data directly instead of saving it to a file first.

Is this possible? It would be like creating a Var.File node that, instead of pointing to a file, takes the data I have already pulled in.

E.g. data[0] = "1,A,B,C" # single line of all documents.

Upvotes: 4

Views: 425

Answers (1)

Andy W

Reputation: 5089

In a nutshell, you can paste your Python program in between BEGIN PROGRAM and END PROGRAM blocks directly within an SPSS syntax file. Then you can define an SPSS dataset and append cases to that dataset with the Python code block.

What is potentially nice about this is that it can be done line by line, so in theory it can process quite large files. Even with tiny files it should be faster than writing and then reading a CSV file. The example below is taken from a blog post I wrote on the subject:

BEGIN PROGRAM Python.
import spss

MyData = [(1,2,'A'),(4,5,'B'),(7,8,'C')] #make a list of tuples for the data

spss.StartDataStep()                   #start the data step
MyDatasetObj = spss.Dataset(name=None) #define the data object
MyDatasetObj.varlist.append('X1',0)    #add in 3 variables (0 = numeric)
MyDatasetObj.varlist.append('X2',0)
MyDatasetObj.varlist.append('X3',1)    #1 = string of width 1
for i in MyData:                       #add cases in a loop
    MyDatasetObj.cases.append(i)
spss.EndDataStep()
END PROGRAM.
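
If your data is still in the CSV-string form from the question (e.g. data[0] = "1,A,B,C"), each string has to be split into one value per variable before it can be appended as a case. Below is a minimal sketch of that step in plain Python. The helper name rows_from_csv_lines and the second sample row are made up for illustration, and treating every numeric-looking field as a float is an assumption about how you want the columns typed.

```python
import csv

def rows_from_csv_lines(lines):
    """Split CSV-formatted strings into tuples, one tuple per line.

    Hypothetical helper: fields that parse as numbers become floats,
    everything else is kept as a string.
    """
    rows = []
    for fields in csv.reader(lines):   # csv.reader accepts any iterable of strings
        row = []
        for field in fields:
            try:
                row.append(float(field))
            except ValueError:
                row.append(field)
        rows.append(tuple(row))
    return rows

# Sample data in the shape described in the question (second row is invented)
data = ["1,A,B,C", "2,D,E,F"]
MyData = rows_from_csv_lines(data)
```

The resulting MyData can then go through the same cases.append loop as above. For these rows you would declare four variables instead of three, and each string variable needs a width at least as long as its longest value.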

Upvotes: 3
