Bob

Reputation: 121

SAS - Size Char and Numeric

I am trying to estimate the size of all the tables in a SAS folder. Does anyone know the size of a character field with length = 1 and a numeric field with length = 1? Once that's figured out, I am planning to multiply the length of a column by the number of columns, and then by the number of records, to estimate the size of a table.

The statements above might be unclear, so I will use an example to clarify. Say a character field with length = 1 takes 1 byte and a numeric field with length = 1 takes 1 byte; then a table with 100 records and 2 columns would be calculated as 200 bytes (1*2*100).

Thanks.

Upvotes: 2

Views: 1202

Answers (1)

Joe

Reputation: 63424

Character fields are one byte per unit of length; e.g., length x $1 is one byte. However, they are constant width for every row, unless some compression is used, so if it is length x $8 but x='Hi', it still takes 8 bytes (technically, x='Hi      ', padded with blanks). A format is often used to define a default length for character variables, but it is possible to have a storage length different from the formatted length (although that's usually a mistake).

Numeric fields are by default 8 bytes wide, regardless of formatted width (i.e., format x BEST12. still takes 8 bytes to store, just like format x 2. would). You can change that, via length, to a smaller amount, although you lose precision; it can be as little as 3 bytes on most platforms (2 on z/OS). It can never take more than 8 bytes in standard SAS (I think in DS2 you can now have bigger numerics?).
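To see these storage lengths for yourself, something like the following should work. It is only a sketch: WORK.DEMO and its variable names are made up for illustration, and the exact "Observation Length" reported may be rounded up for alignment depending on platform.

```sas
/* WORK.DEMO and the variable names are hypothetical, for illustration only */
data work.demo;
    length c1 $1 c8 $8 n8 8 n3 3;
    c1 = 'A';       /* 1 byte */
    c8 = 'Hi';      /* still 8 bytes: padded with blanks */
    n8 = 1234.5;    /* default 8-byte numeric */
    n3 = 100;       /* 3-byte numeric: fewer digits of precision */
run;

/* The "Len" column shows each variable's storage length;  */
/* "Observation Length" shows the bytes stored per row.    */
proc contents data=work.demo;
run;
```

The lengths listed by PROC CONTENTS are the storage lengths, not the formatted widths, so attaching a different format to n8 would not change them.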

On the subject of estimating table size: if you have a table already created, you can determine its record length from PROC CONTENTS or dictionary.tables. The "Observation Length" (obslen) is the number of bytes used to store each observation (row); "Bufsize" is the buffer size, which determines the size of each page of data. Rows are stored entirely on pages, not across pages, so you need to determine how many rows fit in a page, which is roughly Bufsize/ObsLen, rounded down. Dividing the number of observations by that figure (rounding up) gives the number of data pages, and pages times Bufsize gives the approximate size in bytes.

Some additional overhead is needed to store the metadata, typically one page extra, but this gets you fairly close.
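As a rough sketch of that calculation for uncompressed data sets, you could query dictionary.tables directly. MYLIB is a placeholder libref, the "+ 1" is the assumed one page of metadata overhead mentioned above, and the result ignores per-page overhead, indexes, and compression, so treat it as an approximation.

```sas
/* MYLIB is a hypothetical libref; librefs are stored in uppercase */
proc sql;
    select memname,
           nobs,
           obslen,
           bufsize,
           /* whole rows that fit on one page */
           floor(bufsize / obslen) as rows_per_page,
           /* data pages plus one assumed page of metadata overhead */
           ceil(nobs / floor(bufsize / obslen)) + 1 as est_pages,
           (ceil(nobs / floor(bufsize / obslen)) + 1) * bufsize
               as est_bytes format=comma20.
    from dictionary.tables
    where libname = 'MYLIB'
      and memtype = 'DATA'
      and obslen > 0
      and bufsize >= obslen;   /* guard against dividing by zero */
quit;
```

Comparing est_bytes against the actual file size on disk for a few tables is a quick way to check how close the estimate is in your environment.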

Several macros have been created for this purpose; user667489 links to one in the comments, from the SUGI 27 paper "A New Method to Estimate the Size of a SAS® Data Set". Michael Raithel also wrote a macro, Size_The_Data.sas, that is linked from the SAS documentation.

Upvotes: 2
