ryan
ryan

Reputation: 41

How to read an improperly formatted column from Excel using SAS

Before posting I tried Google but not a lot of help.

I have an Excel file with this value in the Opened_DT column: 1448300037,15. Excel only displays a bunch of ###### characters instead of the date.

I have been given this file and do not know what actual output should the date format be. It should state the date an account has been opened with an institution. I need to import this Excel column into a SAS data set, but this string drives me crazy :( I have tried bunch of conversion utilities, and addons but no luck.

Can someone show me how to read this column using SAS?

Here is an example of my Excel workbook

Upvotes: 4

Views: 3725

Answers (4)

barry houdini
barry houdini

Reputation: 46341

I assume that for your region decimal separators are commas so 1448300037,15 is a valid number......

If that represents the number of seconds from 1st Jan 1960 then in Excel you can convert to a valid date/time by adding the start date (1st Jan 1960) to your number expressed as days (which you can get by dividing by the number of seconds in a day), so with data in A2 use this formula

=DATE(1960,1,1)+A2/86400

and format as a date/time, e.g. dd.mm.yyyy hh:mm

Upvotes: 1

BellevueBob
BellevueBob

Reputation: 9618

It looks to me that the column in your Excel worksheet was created using some technique that translated decimal points to commas. The example value cited (1448300037,15) is equivalent to a SAS timestamp of 22NOV2005:17:33:57.15.

If that value is legitimate and you want to turn this into a SAS data set, I suggest formatting the column as TEXT in Excel, saving it as a new workbook, and reading that new workbook into SAS with PROC IMPORT. Then post-process the result to re-read that variable as a SAS datetime variable. To illustrate:

proc import out=have
            datafile="path-to-workbook\workbook.xlsx"
            replace
            dbms=excel;
     sheet="sheetname";
run;

data want;
   set have(rename=(Open_DT=prevOpen_DT));
   format Open_DT datetime21.2;
   Open_DT = input(prevOpen_DT,numx13.2);
   drop prevOpen_DT;
run;

The numx SAS informat used in the input statement reads the character variable and interprets the comma as the decimal point.

Upvotes: 1

Peter L.
Peter L.

Reputation: 7304

From your sample it looks that initial data is wrong: this is what Excel 2007 shows me:

Screenshot from Excel

Indeed, actual value is too large to be Date - for today's date serial number is 41335. Please make sure your application calculates correctly Opening_DT values.

As per Excel specifications and limits:

Latest date allowed for calculation         December 31, 9999
Largest amount of time that can be entered  9999:59:59

Serial for December 31, 9999 is 2958465.

Upvotes: 3

SteveP
SteveP

Reputation: 19093

Make the column width wider. Excel sometimes shows hashes if the data can't fit.

Upvotes: 0

Related Questions