Parsing Time from text using SAS

Question

I have a maximum of 200 characters length of text for thousands of records which has date and time in it. I am trying to parse out the time

Here is the example of the text that I have

 Your account your account your account on  Jan 10, 2020 at 8.30 AM ET your account
 Your account your account your account on  Jan 3, 2020 6.30PM ET your account
 Your account your account your account on  Jan 11, 2020 at 6.30PM ET your account

Desired output

   8.30 AM
   6.30PM
   6.30PM

In all the content ET is common and I am using index and substr function to parse out the time.

   Time=substr(Text,index(Text,' on ')+19,6);

For the second line, I am also getting extra characters since there is no at and no space between time and PM

Is there any efficient way to parse the time?

Thanks

Richard · Accepted Answer

SAS can locate a text excerpt using a Perl regular expression that has a capture buffer.

data want(keep=parsed_timestring);
length parsed_timestring $8;
input;

  /* Pattern:
   *   On a word boundary     \b
   *   Capture start          (
   *     1 or 2 digits        \d{1,2}
   *     A period             \.
   *     1 or 2 digits        \d{1,2}
   *     0 or 1 spaces        \s?
   *     letter A or P        (A|P)
   *     letter M             M
   *   Capture end            )
   */

  prx = prxparse('/\b(\d{1,2}\.\d{1,2}\s?(A|P)M)/x');

  if prxmatch (prx, _infile_) then 
    parsed_timestring = prxposn(prx,1,_infile_);

datalines;
Your account your account your account on  Jan 10, 2020 at 8.30 AM ET your account
 Your account your account your account on  Jan 3, 2020 6.30PM ET your account
 Your account your account your account on  Jan 11, 2020 at 6.30PM ET your account
 Your account your account your account on  Jan 11, 2020 at 6666.30PM ET your account
;

proc print;
run;

In the last row parsed_timestring is blank because 6666.30PM starts with more than two digits on a word boundary and thus would not match the pattern and thus the line would not have a proper time string.

Parsing Time from text using SAS

Answers (1)

Related Questions