WeShall

Reputation: 409

Regular Expression | REGEX for ICD9 codes

I am using Python to extract ICD9 codes, with the regular expression below:

icdRegex = re.compile(r'V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d')

It captures patterns such as 137.98 or V35.62.

Everything works fine, except that the expression also captures patient weights as ICD9 codes.

What I have observed is that a weight is almost always followed by a unit, for example 110.67 kg, kgs, lb or lbs.

How do I separate ICD9 codes from weights?

Upvotes: 3

Views: 410

Answers (2)

WeShall

Reputation: 409

Here is HamZa's expression for everyone:

icdRegex = re.compile(r"\b(?:V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d)\b(?!\s*(?:kg|lb)s?\b)")

Thanks HamZa & Chapelo for helping out. Appreciate it.
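For anyone who wants to try it, here is a minimal sketch of how the expression behaves. The sample note is made up for illustration, and the pattern is written as a raw string so that Python keeps \b as a regex word boundary instead of turning it into a backspace character.

```python
import re

# HamZa's pattern as a raw string, so \b stays a regex word boundary
icd_regex = re.compile(
    r"\b(?:V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d)\b(?!\s*(?:kg|lb)s?\b)"
)

# Hypothetical clinical note, invented for this example
note = "Dx 137.98 and V35.62, E812.0 noted. Weight 110.67 kg, previously 242.5 lbs."

# findall returns whole matches because the pattern has no capturing groups
codes = icd_regex.findall(note)
print(codes)  # ['137.98', 'V35.62', 'E812.0']
```

The two weights are skipped because each is followed by optional whitespace and a kg/lb unit, which the negative lookahead rejects.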

Upvotes: 1

chapelo

Reputation: 2562

Add a negative lookahead assertion like the following:

(V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d)\b(?!\s?(?:lb|kg)s?)
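The \b in front of the lookahead matters. As a rough sketch (the sample text and variable names are invented for illustration): without the word boundary, \d{1,2} can backtrack to a single digit, the lookahead then sees "7 kg" instead of " kg", and a truncated weight is reported as a code.

```python
import re

# Hypothetical sample text, invented for this example
text = "weight 110.67 kg, code 137.98"

# Lookahead only: \d{1,2} backtracks to one digit and "110.6" slips through
without_boundary = re.compile(
    r"(V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d)(?!\s?(?:lb|kg)s?)"
)
# Pattern from this answer: \b forbids stopping in the middle of "67"
with_boundary = re.compile(
    r"(V\d{2}\.\d{1,2}|\d{3}\.\d{1,2}|E\d{3}\.\d)\b(?!\s?(?:lb|kg)s?)"
)

loose = without_boundary.findall(text)
strict = with_boundary.findall(text)
print(loose)   # ['110.6', '137.98']
print(strict)  # ['137.98']
```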

Upvotes: 1
