Reputation: 125
I'm trying to find a way to identify the subset of numeric variables that represent dates, times or datetimes from an arbitrary dataset that contains many variables of both numeric or character type. The goal is to have a macro or template that I could run against any given dataset. I believe that the SAS formats for these variables should correctly represent what the variables are, but I'm trying to see if I can avoid having to create some code that parses the text of the SAS format for any of the applicable format names.
I was originally thinking it should be possible to do something like the following to get the SAS formats for a dataset and then do various things based on the columns that come back as date/time/datetime.
PROC SQL;
CREATE TABLE lib_X.Dataset_A AS
SELECT tbl_CL.LIBNAME
,tbl_CL.MEMNAME AS TABLE_NAME
,tbl_CL.NAME AS COLUMN_NAME
,tbl_CL.FORMAT AS SAS_Format
,tbl_CL.TYPE AS SAS_Type
,tbl_CL.varnum
,tbl_CL.LENGTH
FROM DICTIONARY.COLUMNS AS tbl_CL
WHERE tbl_CL.MEMNAME = %UPCASE("&SAS_DatasetNm")
AND tbl_CL.LIBNAME = %UPCASE("&SAS_LibNm");
QUIT;
Followed by:
DATA lib_X.Dataset_A;
SET lib_X.Dataset_A;
IF FCN_FORMAT_TYPE(SAS_Format) = "DATETIME"
THEN ...;
RUN;
Where the FCN_FORMAT_TYPE
function is some function that examines the SAS format and returns which of the 4 categories of SAS Format (character, numeric, date/time, or ISO 8601) the variable is, but there doesn't seem to be an existing function that does this that I can find. Perhaps there's another approach that would work?
Currently running on SAS 9.4 M2 on a Linux server. The code will primarily be used via batch file, but possibly via Enterprise Guide as well (EG 7.1 or 6.1)
Upvotes: 2
Views: 770
Reputation: 12691
We wrote an fcmp function that does exactly this, taking ALL of the DATE / DATETIME / TIME formats from the SAS documentation.
The source is on github so you can easily add any missing formats: https://github.com/sasjs/core/blob/main/fcmp/mcf_getfmttype.sas
Docs: https://core.sasjs.io/mcf__getfmttype_8sas.html
Example macro invocation:
%put fmt_type=%sysfunc(mcf_getfmttype(TIME9.));
Upvotes: 0
Reputation: 125
Dominic's answer gave me an idea about how to solve it, using the same general approach, but putting it in a PROC FCMP function instead of a macro.
PROC FCMP OUTLIB=LIB_X.FCMP_FUNCS.Format_Category_ID;
FUNCTION Format_Category(VAR_Format $, VAR_Type $) $ 16
GROUP = 'Format Category ID'
LABEL = 'Pass SAS variable type and format as CHAR and it will return what sort of variable it is.
Breaks down date/time/dt formats into independent categories.';
LENGTH FORMAT_CATEGORY $16;
SELECT;
WHEN (MISSING(VAR_Format)) FORMAT_CATEGORY = 'MISSING_FORMAT';
WHEN (VAR_Type = 'CHAR') FORMAT_CATEGORY = 'CHARACTER_FORMAT';
WHEN (PRXMATCH("/S370FZDT/", VAR_Format)) FORMAT_CATEGORY = 'NUMERIC_FORMAT';
WHEN (PRXMATCH("/8601/", VAR_Format)) FORMAT_CATEGORY = 'ISO_8601_FORMAT';
WHEN (PRXMATCH("/(DT|DATEAMPM|DATETIME)/", VAR_Format)) FORMAT_CATEGORY = 'DATETIME_FORMAT';
WHEN (PRXMATCH("/(HHMM|HOUR|MMSS|TIME|TOD)/", VAR_Format)) FORMAT_CATEGORY = 'TIME_FORMAT';
WHEN (PRXMATCH("/(DAT|DAY|YY|DOW|JUL|MON|QTR|WEEK|YEAR)/", VAR_Format)) FORMAT_CATEGORY = 'DATE_FORMAT';
WHEN (VAR_Type = 'NUM') FORMAT_CATEGORY = 'NUMERIC_FORMAT';
OTHERWISE FORMAT_CATEGORY = 'UNKNOWN';
END;
RETURN(FORMAT_CATEGORY);
ENDSUB;
It's definitely not bullet-proof, but it worked for my purposes and will hopefully be useful to others. The biggest caveat I'd add is that handling user-defined formats will be touch and go since it's simply parsing the SAS format which is passed as a character variable, so if the name overlaps with the regular expressions I'm using, all bets are off.
Upvotes: 1
Reputation: 10411
No out-of-the-box solution, but here's a workable one.
For simplicity, let's work with dates only, excluding time and datetime formats. The solution can easily be generalized to include those.
As a first step, you work out the filters that will select the plausible date formats you can expect. Here's an example of a query that will match about 100 SAS date formats.
proc sql;
create table fmts as
select fmtname
from sashelp.vformat
where fmttype = "F"
and fmtname not like '$%'
and prxmatch("/(DATE|YY|YEAR)/",fmtname)
and not prxmatch("/(DT|TIME)/",fmtname)
order by fmtname;
quit;
Once you've fine-tuned your WHERE filter, you can use it in a macro routine like the following, where the 3rd argument vmname
is the name of the global macro variable you wish to store the date column names in.
%macro getDateCols(libname, dataset, vmname);
%global &vmname;
proc sql noprint;
select name
into :&vmname separated by ", " /* Use ", " or " " depending on your needs */
from dictionary.columns
where libname = upcase("&libname")
and memname = upcase("&dataset")
and format not like '$%'
and prxmatch("/(DATE|YY|YEAR|MONTH|DAY|WEEK)/", format)
and not prxmatch("/(DT|TIME)/", format);
quit;
%mend getDateCols;
After calling the macro, you should get the column name HireDate stored in &dsdates
:
%getDateCols(sasuser, empdata, dsdates);
%put &dsdates;
Then you can use the macro variable in proc sql, for example...
proc sql;
select &dsdates
from sasuser.empdata;
quit;
Upvotes: 1