bogdanCsn
bogdanCsn

Reputation: 1325

Export pandas dataframe to SAS sas7bdat format

The flow I have in mind in this:
1. Export a sas7bdat from SAS
2. Import that file in python with pd.read_sas and do some stuff on in
3. Export the pandas dataframe to sas7bdat (or some other SAS binary fileformat). I thought that pd.to_sas would exist, but it doesn't
4. Open the new file in SAS and do further stuff on it

Is there a solution to point 3 above? As I see it, my only options are csv or some SQL database.
This is not really a programming question. hope it won't be an issue.

Upvotes: 12

Views: 31851

Answers (2)

Joe
Joe

Reputation: 63424

Python is capable of writing to SAS .xpt format (see for example the xport library), which is SAS's open file format. SAS7BDAT is a closed file format, and not intended to be read/written to by other languages; some have reverse engineered enough of it to read at least, but from what I've seen no good SAS7BDAT writer exists (R has haven, for example, which is the best one I've seen, but it still has issues and things it can't do).

More common than XPT files, though, which can be slow to work with, is to write a CSV and then write a SAS input script in your python/etc. program. That allows you to use variable labels, value labels, types, etc., as you wish very easily; and writing a SAS input script is very easy to do. Many other software packages do this for their preferred method to produce SAS files. This has an additional advantage that it is easily cross-platform - doesn't matter if your SAS program is on a mainframe, UNIX, Windows, etc.; it's all the same.

Edit: If you do have SAS licensed locally, either via a server or local install, another option for exporting Python data to SAS is SASPy, which is a SAS-maintained open source project that allows Python to directly connect to SAS instances and directly send data. (Under the hood, I believe the data is actually transmitted as a CSV most of the time, and then read in using SAS code.) The SAS ODBC driver is also an option, but for Python SASPy will be the easiest option most likely.

Upvotes: 17

David A. West
David A. West

Reputation: 37

"SAS7BDAT is a closed file format, and not intended to be read/written to by other languages; some have reverse engineered enough of it to read at least, but from what I've seen no good SAS7BDAT writer exists."

Although the SAS7BDAT is a proprietary format, it is not closed. It can be read and written by third-party products using SAS' own ODBC drivers. https://support.sas.com/en/software/sas-odbc-drivers.html. Since Python can use ODBC (pyodbc), just use the SAS ODBC Driver to write the SAS7BDAT file format.

IBM SPSS Statistics and IBM SPSS Modeler can also read and write the SAS7BDAT format as well as the earlier pre-version 7 formats and the SAS Transport File format (the .xpt) files noted above. These products do not require ODBC to do this and this capability is included in SPSS Statistics Base via the SAVE Translate command. It is included in SPSS Modeler Professional via the SAS Source node for reading and the SAS Export node for writing.

Upvotes: 2

Related Questions