Galled
Galled

Reputation: 4206

What is the right way to generate a text file from SQL Server in order to be imported into SAS?

I have a table, in a SQL Server 2008 database, that I want to export into SAS, this table has columns with many datatypes ( integer, varchar, float ) in a normal case this process would be simplier, but I have a funny scenario:

I have two machines: the first one is connected to SQL Server 2008 but it doesn't have the SAS software installed, the second one have the SAS software installed but it's not connected to SQL, therefore I can't use the PROC SQL command, so I try:

  1. Exporting into a CSV with comma separations: If I have a field like the user ID that is generally a number (SAS recognizes the field as numeric) and SAS detects a record with an alphanumeric value, I get an error message.

    I have an example of the column that cause (I guess) the error:

user_ID
1251231253    ok
5645645642    ok
1111232135    ok
CC19491855    fail

NOTE: Invalid data for user_id in line 468 12-21.
RULE: ----+----1----+----2----+----3----+----4..... (and so on)
468 70988585,5,CC19491855 , , .... (blah, blah, data, data...)

cod_001=70988585 tip_001=5 user_id=CC19491855 .... (blah, blah, data, data...)
ERROR=1 N=467

  1. Exporting into a CSV with comma separations and columns with quotes: All is ok, I have no error messages but all the variables are exported as alphanumerical in SAS.

I'm newbie in SAS so if there are some way to format the variables after the export process would be ok.

Upvotes: 2

Views: 440

Answers (2)

Robert Penridge
Robert Penridge

Reputation: 8513

What are you currently using to import the CSV into SAS? Your best bet is to use the File->Import Data Wizard. When you do this it will generate some SAS code in the log window that will look something like this:

data WORK.xx    ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'D:\sasdev\xxx\output\xxx.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
   informat datetimestamp anydtdtm40. ;
   informat phoneHome best32. ;
   informat phoneCell best32. ;
   informat phoneBusiness best32. ;
   format datetimestamp datetime. ;
   format phoneHome best12. ;
   format phoneCell best12. ;
   format phoneBusiness best12. ;
input
            datetimestamp
            phoneHome
            phoneCell
            phoneBusiness
;
if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
run;

If the import wizard did not give you the results you need you can copy the code it generated and modify it as necessary. In my above example all of the phonenumbers were imported as numeric. Below I've cleaned up the code a little and changed the phoneBusiness variable to a character variable with a length of 10.

data WORK.xx    ;
  infile 'D:\sasdev\xxx\output\xxx.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

  informat datetimestamp anydtdtm40. ;
  informat phoneHome best32. ;
  informat phoneCell best32. ;
  informat phoneBusiness $10. ;

  format datetimestamp datetime. ;
  format phoneHome best12. ;
  format phoneCell best12. ;
  format phoneBusiness $10. ;

  input datetimestamp
        phoneHome
        phoneCell
        phoneBusiness $
        ;
run;

EDIT/UPDATE:

Some notes on the 'data types'.

There are really only 2 types of data in SAS. Character and Numeric. Dates are stored as numeric data but usually formatted to appear in human readable representations. The types referred to in the below comment are known as formats/informats.

Formats control how data is 'displayed' or 'output'. Informats control how data is parsed when being used as input (such as reading from a file). The informat you choose should represent how the data is stored. The format you choose really just depends on how you want it displayed.

You can find the documentation here:

http://support.sas.com/documentation/cdl/en/leforinforref/63324/HTML/default/viewer.htm#titlepage.htm

A quick summary of the above code - best12. is a numeric format that will use a width of up to 12 characters. The best32. will use a width of up to 32 characters. The $10. is a character format (identified by the $ symbol) with a width of 10 characters.

Upvotes: 1

BellevueBob
BellevueBob

Reputation: 9618

Extract your table from SQL Server into a delimited file; I'd suggest using tabs or pipes ("|") instead of commas. Do not put quotes around the values; format any DATE or TIME columns with a standard ISO date form. If possible, include a row at the top containing column names.

Once done, you can use PROC IMPORT to read the file into a SAS data set. PROC IMPORT is just a code generator which will inspect the input file and create an appropriate INPUT statement for processing.

If you get specific errors (as you say when you try option 1), ask again or update this question with the exact error message. Problems like this are very easy to solve.

Upvotes: 0

Related Questions