Jordan Johnson

Reputation: 253

Using Regex to Match Pattern

I am trying to use a regex to retrieve Title:Code pairs. This is the pattern I have so far:

(.*?\(CPT-.*?\)|.*?\(ICD-.*?\))

Data:

SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)

I would like to capture:

What is the proper regex to use?

Upvotes: 2

Views: 705

Answers (3)

p.s.w.g

Reputation: 149058

What about a pattern like this:

.*?\((CPT|ICD)-[A-Z0-9.]+\)

This matches zero or more of any character, non-greedily, followed by a (, then either CPT or ICD, then a hyphen, then one or more uppercase Latin letters, decimal digits, or periods, and finally a ).

Note that I picked [A-Z0-9.]+ because, to my understanding, all current ICD-9 codes, ICD-10 codes, and CPT codes conform to that pattern.

The C# code might look a bit like this:

var result = Regex.Matches(input, @".*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);
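
Since the question asks for Title:Code pairs, one possible refinement is to name the two parts so they can be read separately. This is only a minimal sketch, not part of the pattern above: the class CodePairs, the method Extract, and the group names title and code are hypothetical, and it assumes a title never contains a parenthesized CPT or ICD code of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CodePairs
{
    // Same pattern as above, with named groups added for the title and the code.
    // The group names "title" and "code" are illustrative choices.
    private static readonly Regex Pattern = new Regex(
        @"(?<title>.*?)\((?<code>(?:CPT|ICD)-[A-Z0-9.]+)\)");

    // Returns (Title, Code) pairs in order of appearance.
    // Titles are trimmed so that separating whitespace is not included.
    public static List<(string Title, string Code)> Extract(string input)
    {
        return Pattern.Matches(input)
                      .Cast<Match>()
                      .Select(m => (Title: m.Groups["title"].Value.Trim(),
                                    Code: m.Groups["code"].Value))
                      .ToList();
    }
}
```

For the sample data in the question, CodePairs.Extract(input) should yield three pairs, the first being the title "SENSORINEURAL HEARING LOSS BILATERAL (MILD)" with the code "ICD-389.18".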

If you want to avoid any surrounding whitespace, you can simply trim the result strings (m => m.Value.Trim()), or ensure that the matched prefix starts with a non-whitespace character by putting a \S in front, like this:

var result = Regex.Matches(input, @"\S.*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);

Or use a negative lookahead, which consumes no characters, if you need to handle inputs like (ICD-100)(ICD-200):

var result = Regex.Matches(input, @"(?!\s).*?\((CPT|ICD)-[A-Z0-9.]+\)")
                  .Cast<Match>()
                  .Select(m => m.Value);

You can see a working demonstration here.
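
To make the difference between the \S variant and the lookahead variant concrete, here is a small sketch that runs both against adjacent codes. The class AdjacentCodes and the method MatchAll are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AdjacentCodes
{
    // The two patterns from the answer above, unchanged.
    public const string WithNonSpace  = @"\S.*?\((CPT|ICD)-[A-Z0-9.]+\)";
    public const string WithLookahead = @"(?!\s).*?\((CPT|ICD)-[A-Z0-9.]+\)";

    // Collects the full text of every match of the given pattern.
    public static string[] MatchAll(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

With the input "(ICD-100)(ICD-200)", the \S pattern consumes the first "(" as its leading non-whitespace character, so the only match it can find is the whole string. The lookahead pattern consumes nothing up front and returns "(ICD-100)" and "(ICD-200)" as two separate matches.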

Upvotes: 4

gpmurthy

Reputation: 2427

Consider the following Regex...

.*?\d\)

Good Luck!

Upvotes: 0

Casimir et Hippolyte

Reputation: 89629

You can use the Regex.Split() method:

string input = "SENSORINEURAL HEARING LOSS BILATERAL (MILD) (ICD-389.18) RIGHT WRIST GANGLION CYST (ICD-727.41) S/P INJECTION OF DEPO MEDROL INTO LEFT SHOULDER JOINT (CPT-20600)";
string pattern = @"(?<=\))\s*(?=[^\s(])";
string[] result = Regex.Split(input, pattern);
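
The split pattern above looks for a position right after a ")" where, past any whitespace, the next character is neither whitespace nor "(", which is where a new title starts. As a rough sketch, it can be wrapped in a helper; the class EntrySplitter and the method SplitEntries are hypothetical names. It assumes that a closing parenthesis inside a title is always followed by another "(", as in "(MILD) (ICD-389.18)"; a title such as "LOSS (MILD) BILATERAL" would be split in the middle.

```csharp
using System.Text.RegularExpressions;

public static class EntrySplitter
{
    // Splits after a ")" wherever the next non-whitespace character is not "(".
    // The whitespace between entries is consumed by the separator itself,
    // so the resulting entries need no trimming.
    public static string[] SplitEntries(string input)
    {
        return Regex.Split(input, @"(?<=\))\s*(?=[^\s(])");
    }
}
```

For the sample data, SplitEntries(input) should return three entries, each ending in its code, for example "RIGHT WRIST GANGLION CYST (ICD-727.41)".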

Upvotes: 1
