Christian

Reputation: 53

Regex issue with string variables

I've been working on this for too long now, and I can't seem to come up with a regular expression that solves it. I know I could use another language to iterate through the characters, but I wanted to do it in Stata rather than go into R or Python. Here goes:

The strings that I'm trying to parse generally have a setup as follows:

Name (Type / $## Million / #### )

where sometimes the final closing parenthesis is missing; when it is, the last character of the contents is the end of the string. I want to be able to match the contents of the parentheses, but the problem is that sometimes Name contains a parenthetical, like

Bank (other) (... / ... / ...)

Also, sometimes Type has a parenthetical as well, like

Name (Loan (other) / ... / ...)

The basic idea is that I'm looking for the contents of the set of parentheses that contains two forward slashes separated by other characters. Any idea how to do this?

The best I've come up with so far is:

\(([^\)]*\/[^\)]*\/.*\)?)$

But it runs into a problem when there is a parenthesis inside the set that I want to grab. Any help would be greatly appreciated. Here are a few sample lines. Each line should be treated as a new string.

IFC (Equity / $12 Million / 1993
IFC (Equity / $28 Million / 1995)
IFC (Loan / $30 Million / 1995
IFC (Syndication / $40 Million / 1995)
BOAD (Loan / $7 Million / 1995
IFC (Equity / $5 Million / 1997)
IFC (Loan / $13 Million / 1997
MIGA (Guarantees Only) (Guarantee / $30 Million / 1995)
IFC (Equity / $2 Million / 1997
IFC (Syndication / $3 Million / 1997
IFC (Equity / $1 Million / 1998
IFC (Syndication / $12 Million / 1998
IFC (Quasi-equity / $7 Million / 1998
IFC (Risk Management (including Political Risk Insurance) / $1 Million / 1994)

Upvotes: 0

Views: 43

Answers (1)

Nick Cox

Reputation: 37368

I have let your expression of frustration stand: usually it would be edited out as irrelevant to the technical problem, but here looking for an overly complicated solution is part of the problem. I have often seen people fixated on a search for a Grail-like regular expression when applying basic string commands and functions would crack their problem.

Here is a way in. Some further editing of the strings seems likely, for which split (again) and subinstr() are the tools of choice.

clear 
input str80 mydata 
"IFC (Equity / $12 Million / 1993"
"IFC (Equity / $28 Million / 1995)"
"IFC (Loan / $30 Million / 1995"
"IFC (Syndication / $40 Million / 1995)"
"BOAD (Loan / $7 Million / 1995"
"IFC (Equity / $5 Million / 1997)"
"IFC (Loan / $13 Million / 1997"
"MIGA (Guarantees Only) (Guarantee / $30 Million / 1995)"
"IFC (Equity / $2 Million / 1997"
"IFC (Syndication / $3 Million / 1997"
"IFC (Equity / $1 Million / 1998"
"IFC (Syndication / $12 Million / 1998"
"IFC (Quasi-equity / $7 Million / 1998"
"IFC (Risk Management (including Political Risk Insurance) / $1 Million / 1994)"
end 

split mydata, parse(/) 
rename (mydata?) (what howmuch when)  
destring when, ignore(")") replace 

list what howmuch when

     +------------------------------------------------------------+
  1. |                                                       what |
     |                                               IFC (Equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $12 Million            |          1993           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  2. |                                                       what |
     |                                               IFC (Equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $28 Million            |          1995           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  3. |                                                       what |
     |                                                 IFC (Loan  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $30 Million            |          1995           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  4. |                                                       what |
     |                                          IFC (Syndication  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $40 Million            |          1995           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  5. |                                                       what |
     |                                                BOAD (Loan  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $7 Million            |          1995           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  6. |                                                       what |
     |                                               IFC (Equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $5 Million            |          1997           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  7. |                                                       what |
     |                                                 IFC (Loan  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $13 Million            |          1997           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  8. |                                                       what |
     |                         MIGA (Guarantees Only) (Guarantee  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $30 Million            |          1995           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
  9. |                                                       what |
     |                                               IFC (Equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $2 Million            |          1997           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
 10. |                                                       what |
     |                                          IFC (Syndication  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $3 Million            |          1997           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
 11. |                                                       what |
     |                                               IFC (Equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $1 Million            |          1998           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
 12. |                                                       what |
     |                                          IFC (Syndication  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |           $12 Million            |          1998           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
 13. |                                                       what |
     |                                         IFC (Quasi-equity  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $7 Million            |          1998           |
     +------------------------------------------------------------+

     +------------------------------------------------------------+
 14. |                                                       what |
     | IFC (Risk Management (including Political Risk Insurance)  |
     |------------------------------------------------------------|
     |                howmuch           |          when           |
     |            $1 Million            |          1994           |
     +------------------------------------------------------------+
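
A minimal sketch of what that further editing might look like, assuming the variables what, howmuch and when created above. The new variable names (amount, len, depth, start, name, type) are illustrative only. The amount is cleaned with subinstr(). For what, the sketch uses a character loop rather than a second split: it counts parentheses and takes the last "(" met at depth zero as the opener of the bracket that holds the slashes. It assumes that any parenthetical inside Name or Type is balanced and contains no slash, and that every string has an opening parenthesis.

```stata
* Amount in millions of dollars. char(36) is the dollar sign, written
* this way so that it cannot be mistaken for a global macro reference.
generate double amount = real(trim(subinstr(subinstr(howmuch, char(36), "", .), "Million", "", .)))

* Find the opening parenthesis of the bracket that holds the slashes:
* the last "(" encountered while no other parenthesis is still open.
generate long len = strlen(what)
summarize len, meanonly
generate int depth = 0
generate int start = 0

forvalues i = 1/`r(max)' {
    * record the position before updating the depth counter
    quietly replace start = `i' if substr(what, `i', 1) == "(" & depth == 0
    quietly replace depth = depth + 1 if substr(what, `i', 1) == "("
    quietly replace depth = depth - 1 if substr(what, `i', 1) == ")"
}

* Everything before that parenthesis is the name; everything after is the type.
generate name = trim(substr(what, 1, start - 1))
generate type = trim(substr(what, start + 1, .))

drop len depth start
list name type amount when
```

With the sample data, observation 8 should then have name "MIGA (Guarantees Only)" and type "Guarantee", while observation 14 should have name "IFC" and type "Risk Management (including Political Risk Insurance)".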

Upvotes: 2
