sazr

Reputation: 25928

Grab everything after a character if it's present

I have a regular expression that grabs the suburb from a string that usually contains a suburb and industry in the format:

INDUSTRY - SUBURB

Sometimes the string may not contain the INDUSTRY - part and just have the suburb. In this case my regular expression fails to grab anything.

Is there a way to make the regex robust enough to grab everything after the hyphen if it's present, and otherwise just grab everything?

The following regex doesn't work: (- |^)(.*)(,|$)

The result is: Advertising - Roseville Chase
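For reference, a minimal sketch of why the whole string comes back, assuming Python's re module (the question does not name a language, so that is an assumption). The search starts at position 0, where the ^ alternative succeeds immediately, and the greedy (.*) then consumes everything up to the end of the string:

```python
import re

# The pattern from the question, unchanged.
pattern = re.compile(r"(- |^)(.*)(,|$)")

m = pattern.search("Advertising - Roseville Chase")
print(repr(m.group(1)))  # '' because ^ matched at position 0
print(m.group(2))        # Advertising - Roseville Chase
```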

Upvotes: 1

Views: 421

Answers (4)

KCzar

Reputation: 1044

Instead of using (.*), use ([^-]*):

(- |^)([^-]*)(,|$)

In action:

import re

re.search(r"(- |^)([^-]*)(,|$)", "Advertising - Roseville Chase").group(2)
Out[97]: 'Roseville Chase'

re.search(r"(- |^)([^-]*)(,|$)", "Roseville Chase").group(2)
Out[98]: 'Roseville Chase'

More explanation was requested:

[^-] means "any character except for -". By using [^-], you are making it impossible for the regex to match the entire string if there is a hyphen present. It will have to match everything after the hyphen.
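One consequence worth seeing in code: because [^-] refuses every hyphen, a suburb that itself contains a hyphen cannot match at all. The sketch below wraps the pattern in a helper; the function name and the hyphenated input are made up for illustration.

```python
import re

# Same pattern as above; extract_suburb is a hypothetical helper name.
SUBURB_RE = re.compile(r"(- |^)([^-]*)(,|$)")

def extract_suburb(text):
    match = SUBURB_RE.search(text)
    # search() returns None when the pattern cannot match anywhere
    return match.group(2) if match else None

print(extract_suburb("Advertising - Roseville Chase"))  # Roseville Chase
print(extract_suburb("Roseville Chase"))                # Roseville Chase
print(extract_suburb("Retail - Stratford-upon-Avon"))   # None
```

If hyphenated suburbs are possible in the data, the None case needs handling by the caller.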

Upvotes: 2

Jon Clements

Reputation: 142126

Well... it's much easier to do this without a regex. I have to sit and grok the other answers, and that's not what Python's about. I agree with Robert.

I'd just go for:

def suburb_or_all(text):
    # partition returns (head, separator, tail); separator is '' when ' - ' is absent
    industry, hyphen_present, suburb = text.partition(' - ')
    return suburb if hyphen_present else text

Completely readable, self-documenting and remarkably efficient.
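To make the mechanics concrete, here is what str.partition returns in each case. The variable names are only for illustration:

```python
# str.partition always returns a 3-tuple: (head, separator, tail).
with_hyphen = "Advertising - Roseville Chase".partition(" - ")
without_hyphen = "Roseville Chase".partition(" - ")

print(with_hyphen)     # ('Advertising', ' - ', 'Roseville Chase')
print(without_hyphen)  # ('Roseville Chase', '', '')
```

The empty separator in the second tuple is what makes the truthiness check on the middle element work.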

Upvotes: 1

GHETTO.CHiLD

Reputation: 3416

You could do this: (?<=-\s)(.*), which would return everything after the hyphen. You can try it out here.
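A quick check of this pattern, assuming Python's re module: the lookbehind requires a hyphen followed by whitespace, so a string without a hyphen appears to produce no match at all rather than falling back to the whole string.

```python
import re

pattern = re.compile(r"(?<=-\s)(.*)")

with_hyphen = pattern.search("Advertising - Roseville Chase")
without_hyphen = pattern.search("Roseville Chase")

print(with_hyphen.group(1))  # Roseville Chase
print(without_hyphen)        # None
```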

Upvotes: -1

matt

Reputation: 1915

Have two groups: one for the industry plus hyphen, and one for the suburb. Make the industry group optional with a question mark.

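import re

pattern = re.compile(r"([^-]*-)?(.*)")
# group(2) keeps the space that follows the hyphen, so strip it
pattern.match("Advertising - Roseville Chase").group(2).strip()  # 'Roseville Chase'
pattern.match("Amityville").group(2).strip()  # 'Amityville'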
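If the industry is wanted as well, both groups can be read at once. This is only a sketch: split_industry is a hypothetical helper, and it relies on Python reporting an optional group that did not take part in the match as None.

```python
import re

pattern = re.compile(r"([^-]*-)?(.*)")  # same pattern as above

def split_industry(text):  # hypothetical helper name
    industry, suburb = pattern.match(text).groups()
    # An optional group that did not participate is reported as None.
    if industry is not None:
        industry = industry.rstrip("-").strip()
    return industry, suburb.strip()

print(split_industry("Advertising - Roseville Chase"))  # ('Advertising', 'Roseville Chase')
print(split_industry("Amityville"))                     # (None, 'Amityville')
```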

Upvotes: 3
