Reputation: 243
Using SAS 9.3 I want to extract the portion of each string between the "." (dot) and '"' (double quote) that follows the dot. For example the result of line one below should be f2015_cnt_cont_line
<characteristic abc="[2015].f2015_cnt_cont_line" xxxxxxxx="8129" />
<characteristic abc="[2015].f2015_dbt_cont_line" xxxxxxxx="8134" />
<characteristic abc="[2015].f2015_ctl_tot_acct_bal" xxxxxxxx="8133" />
<characteristic abc="[2015].f2015_cnt_comb_line" xxxxxxxx="8118" />
<characteristic abc="[2015].f2015_dbt_comb_line" xxxxxxxx="8138" />
Does anyone have an example I could use?
Thanks Dan
Upvotes: 3
Views: 11694
Reputation: 1
That works for me
SUBSTR(
t1.field,
index(t1.field,'.')+1,
(index(t1.field,'"')-index(t1.field,'.')-1)
)
Upvotes: 0
Reputation: 11765
A regular expression that matches your pattern is \.(.*?)\"
. This means: find a dot (a special character; then any character (the ? makes it not "greedy", so it captures as few characters as possible); and then a quotation mark.
Using this example from the SAS documentation, something like this should work:
data test;
set _your_data_set;
retain re;
if _N_ = 1 then re = prxparse('/\.(.*?)\"/');
if prxmatch(re, var) then result = prxposn(re, 1, var);
run;
(This assumes your data is in a variable called var
.)
Upvotes: 4
Reputation: 1696
Instead of using the PRX functions, something like the following might suffice:
text=scan(scan(line,2,"."),1,'"');
This assumes the text is stored in a variable called line.
Upvotes: 3
Reputation: 9618
Here's one way:
inner = SCAN(SUBSTR(line,INDEX(line,'.')+1),1,'"');
The inner SUBSTR function skips to the column after the first dot; the outer SCAN function returns the first word delimited by a double quote.
Upvotes: 2