Is there an easy way to see number of observations for each column in SAS?

I would like to achieve a similar outcome as pandas' info() in python. I want to see the number of elements for each column (missing / non-missing), like with proc contents, just for all of the variables. I'm sure there is a simple way, I just can't find it:) Could anyone help me with this?

Example for info( ):

df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      object
 2   float_col  5 non-null      float64

Upvotes: 0

Views: 1238

Answers (1)

Tom
Tom

Reputation: 51566

I don't know what INFO() function produces (you should include an example in your question) but it sounds like you want the number of distinct values (LEVELS) for the variables. PROC FREQ can do that.

proc freq nlevels data=sashelp.cars;
   tables _all_ / noprint;
run;

Result

The FREQ Procedure

                     Number of Variable Levels

                                               Missing    Nonmissing
Variable       Label                Levels      Levels        Levels
--------------------------------------------------------------------
Make                                    38           0            38
Model                                  425           0           425
Type                                     6           0             6
Origin                                   3           0             3
DriveTrain                               3           0             3
MSRP                                   410           0           410
Invoice                                425           0           425
EngineSize     Engine Size (L)          43           0            43
Cylinders                                8           1             7
Horsepower                             110           0           110
MPG_City       MPG (City)               28           0            28
MPG_Highway    MPG (Highway)            33           0            33
Weight         Weight (LBS)            348           0           348
Wheelbase      Wheelbase (IN)           40           0            40
Length         Length (IN)              67           0            67

If you want to count missing/non-missing values then make a pair of special formats and use proc freq. SAS only has two types of variables, numbers and fixed length character strings. So make two formats, one for each type.

There is a worked example on SAS Communities : https://communities.sas.com/t5/SAS-Programming/Counting-missing-and-non-missing-obs/m-p/740272

*set input/output dataset names;
%let INPUT_DSN = sashelp.cars;
%let OUTPUT_DSN = want;

* create formats for missing;
proc format;
  value $missfmt ' ' = "Missing"  other = "Not_Missing";
  value nmissfmt low-high ="Not_Missing" other="Missing";
run;

* turn off output and capture the one way freq table TEMP dataset ; 
ods select none;
ods table onewayfreqs=temp;

proc freq data=&INPUT_DSN.;
  table _all_ / missing;
  format _numeric_ nmissfmt. _character_ $missfmt.;
run;

* turn outputs back on ;
ods select all;

* Collapse to one observation per variable ;
data &OUTPUT_DSN ;
  length name $32 missing not_missing total 8 ;
  set temp;
  by table notsorted ;
  if first.table then call missing(of missing not_missing);
  name = substr(table,7);
  if vvaluex(name)='Missing' then missing=frequency;
  else not_missing=frequency;
  retain missing not_missing;
  if last.table then do;
     missing=sum(0,missing);
     not_missing=sum(0,not_missing);
     total=sum(missing,not_missing);
     percent = divide(missing,total);
     output;
  end;
  keep name missing not_missing total percent;
run;

Resulting data:

enter image description here

Upvotes: 2

Related Questions