Reputation: 7
I have a character column that holds dates as text in dd/mm/yyyy form.
When applying a filter (WHERE clause), I need these character values to be treated as dates in the WHERE statement, without changing the existing column or creating a new one.
How can I make this happen? Any help would be deeply appreciated.
Thank you.
Upvotes: 0
Views: 3740
Reputation: 3315
Storing dates as character values is not a great idea. It can lead to many data accuracy issues, and you may not notice them for a long time: for example, someone could enter an invalid date string and nothing would flag it. It is always better to store dates as date values rather than as character values.
Filtering dates with LIKE gets a little complex. You can try the code below, which uses the INPUT function in the WHERE clause:
data have;
input id datecolumn $10.;
datalines;
1 20/10/2018
1 25/10/2018
2 30/10/2018
2 01/11/2018
;
proc sql;
create table want as
select * from have
where input(datecolumn, ddmmyy10.) between '20Oct2018'd and '30Oct2018'd ;
The same filter written with LIKE looks like this:
proc sql;
create table want as
select * from have
/*include all dates which start with 2 */
where datecolumn like '2%' and datecolumn like '%10/2018'
or datecolumn = '30/10/2018';
Edit1:
It looks like you have a data quality issue; a sample dataset with an invalid date is shown below. Try this. Once again, storing dates as character values is not a good approach and can lead to many issues in the future.
data have;
input id datecolumn $10.;
datalines;
1 20/10/2018
1 25/10/2018
2 30/10/2018
2 01/11/2018
3 01/99/2018
;
proc sql;
create table want(drop=newdate) as
select *, case when input(datecolumn, ddmmyy10.) ne .
then input(datecolumn, ddmmyy10.)
else . end as newdate from have
where calculated newdate between '20Oct2018'd and '30Oct2018'd
;
Or you can put the CASE expression directly in the WHERE clause, without creating and dropping a new column, as shown below.
proc sql;
create table want as
select * from have
where
case when input(datecolumn, ddmmyy10.) ne .
then input(datecolumn, ddmmyy10.) between '20Oct2018'd and '30Oct2018'd
end;
Upvotes: 0
Reputation: 27508
The SAS INPUT function with a ? informat modifier will convert a string (source value) to a result and will not write an invalid-data message to the log if the source value does not conform to the informat.
INPUT can be used in a WHERE statement or clause. The INPUT result can also be an operand of the BETWEEN operator.
* some of these free form values are not valid date representations;
data have;
length freeform_date_string $10;
do x = 0 to 1e4-1;
freeform_date_string =
substr(put(x,z4.),1,2) || '/' ||
substr(put(x,z4.),3,2) || '/' ||
'2018'
;
output;
end;
run;
* where statement;
data want;
set have;
where input(freeform_date_string,? ddmmyy10.);
run;
* where clause;
proc sql;
create table want2 as
select * from have
where
input(freeform_date_string,? ddmmyy10.) is not null
;
* where clause with input used with between operator operands;
proc sql;
create table want3 as
select * from have
where
input(freeform_date_string,? ddmmyy10.)
between
'15-JAN-2018'D
and
'15-MAR-2018'D
;
quit;
Upvotes: 2
Reputation: 1269773
In proc sql, you can come close with like:
select (case when datecol like '__/__/____'
then . . .
else . . .
end)
This is only an approximation: _ is a wildcard that matches any single character, not just digits. On the other hand, this is standard SQL, so it will work in any database.
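As a rough sketch of how the shape check could be combined with an actual date comparison in SAS, the LIKE pattern can act as a first filter and INPUT with the ?? modifier can do the conversion. The table name have_like, the column datecolumn, and the sample rows are assumptions made up for illustration, and the INPUT part is SAS-specific rather than standard SQL.

proc sql code (illustrative only):

```sas
/* Hypothetical sample data: valid dates, an impossible month, and a wrong layout */
data have_like;
input id datecolumn :$10.;
datalines;
1 20/10/2018
1 25/10/2018
2 30/10/2018
2 01/11/2018
3 01/99/2018
4 2018-10-25
;
run;

proc sql;
  create table want_like as
  select *
  from have_like
  /* LIKE only checks the dd/mm/yyyy layout; _ matches any single character */
  where datecolumn like '__/__/____'
    /* ?? returns a missing value quietly for strings that are not real dates, */
    /* and a missing value never falls between two dates                       */
    and input(datecolumn, ?? ddmmyy10.) between '20OCT2018'd and '30OCT2018'd;
quit;
```

Under these assumptions, the rows for 20/10/2018, 25/10/2018 and 30/10/2018 should be kept: 01/99/2018 passes the LIKE check but converts to missing, and 2018-10-25 fails the LIKE check.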
Upvotes: 2