Flea

Reputation: 11284

How to return the first five digits using Regular Expressions

How do I return the first 5 digits of a string of characters in Regular Expressions?

For example, if I have the following text as input:

15203 Main Street Apartment 3 63110

How can I return just "15203"?

I am using C#.

Upvotes: 4

Views: 8511

Answers (7)

Paul Nathan

Reputation: 40309

A different approach -

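# Copy the string so the original is left untouched
my $temp = $str;
# Remove all non-digits (the /g flag removes every one, not just the first)
$temp =~ s/\D//g;
# Match exactly the first five digits; the parentheses capture them into $1
$temp =~ /(\d{5})/;
# Grab the match. ASSUMES that there will be a match.
my $first_digits = $1;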

Upvotes: 1

Adrian Regan

Reputation: 2250

$result =~ s/^(\d{5}).*/$1/;

This matches exactly five digits (\d{5}) at the start of the string, followed by any number of any characters (.*), and replaces the whole match with $1, which is whatever was captured inside the (), that is, the first five digits. Note that it only works when the string begins with five consecutive digits.

If you want the first 5 characters of any kind:

$result =~ s/^(.{5}).*/$1/;

Use whatever programming language you are using to evaluate this.

For example, in C#:

Regex.Replace(text, @"^(.{5}).*", "$1");

Upvotes: -1

hobbs

Reputation: 239890

This isn't really the kind of problem that's ideally solved by a single-regex approach -- the regex language just isn't especially meant for it. Assuming you're writing code in a real language (and not some ill-conceived embedded use of regex), you could perhaps do one of the following (examples in Perl):

# Capture all the digits into an array
my @digits = $str =~ /(\d)/g;
# Then take the first five and put them back into a string
my $first_five_digits = join "", @digits[0..4];

or

# Copy the string, removing all non-digits
(my $digits = $str) =~ tr/0-9//cd;
# And cut off all but the first five
my $first_five_digits = substr $digits, 0, 5;

If for some reason you really are stuck doing a single match, and you have access to the capture buffers and a way to put them back together, then wdebeaum's suggestion works just fine, but I have a hard time imagining a situation where you can do all that, but don't have access to other language facilities :)

Upvotes: 6

kniemczak

Reputation: 359

Not sure this is best solved by regular expressions since they are used for string matching and usually not for string manipulation (in my experience).

However, you could make a call to: strInput = Regex.Replace(strInput, @"\D+", ""); to remove all non-digit characters and then just return the first 5 characters.

If you want a single regular expression that does all this for you, I am not sure one exists without using the Regex class in a similar way to the above.
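
The strip-then-truncate idea above can be sketched as follows. This is a minimal illustration in Python rather than C#, and the function name first_five_digits is made up for the example:

```python
import re

def first_five_digits(text: str) -> str:
    """Return the first five digits found anywhere in text (fewer if text has fewer)."""
    # Remove every run of non-digit characters, leaving only the digits.
    digits_only = re.sub(r"\D+", "", text)
    # Slicing never raises, so short inputs simply yield fewer than five digits.
    return digits_only[:5]
```

Unlike a fixed-length substring call, the slice does not fail when fewer than five digits remain, so the caller should check the length of the result if exactly five are required.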

Upvotes: 1

Michael Paulukonis

Reputation: 9100

I don't think a regular expression is the best tool for what you want.

Regular expressions are for matching patterns; the pattern you are looking for is "a(ny) digit".

Your logic external to the pattern is "five matches".

Thus, you either want to loop over the first five digit matches, or capture five digits and merge them together.

But look at that Perl example -- that's not one pattern -- it's one pattern repeated five times.

Can you do this via a regular expression? Just like parsing XML -- you probably could, but it's not the right tool.
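
The "loop over the first five digit matches" option might look something like this. It is a rough sketch in Python, with first_n_digits as a hypothetical helper name; the pattern is just "a digit", and the "five" lives in the surrounding code:

```python
import re
from itertools import islice

def first_n_digits(text: str, n: int = 5) -> str:
    """Join the first n single-digit matches found in text."""
    # finditer yields matches lazily; islice stops after the first n of them.
    matches = islice(re.finditer(r"\d", text), n)
    return "".join(m.group() for m in matches)
```

Because the matches are consumed lazily, scanning stops as soon as the fifth digit has been found.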

Upvotes: 2

wdebeaum

Reputation: 4211

You could capture each digit separately and put them together afterwards, e.g. in Perl:

$str =~ /(\d)\D*(\d)\D*(\d)\D*(\d)\D*(\d)/;
$digits = $1 . $2 . $3 . $4 . $5;
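
For comparison, here is roughly the same single-pattern idea in Python. This is only a sketch: the names FIVE_DIGITS and first_five_via_groups are invented for the example, and it returns None when the input holds fewer than five digits, a case the Perl snippet above does not check for:

```python
import re

# Five single-digit capture groups, each separated by any run of non-digits.
FIVE_DIGITS = re.compile(r"(\d)\D*(\d)\D*(\d)\D*(\d)\D*(\d)")

def first_five_via_groups(text):
    """Return the first five digits of text as a string, or None if there are fewer."""
    match = FIVE_DIGITS.search(text)
    if match is None:
        return None
    # groups() returns the five captured digits in order; join them back together.
    return "".join(match.groups())
```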

Upvotes: 2

AllenG

Reputation: 8190

It would depend on your flavor of regex and programming language (C#, Perl, etc.), but in C# you'd do something like:

string rX = @"\D+";
// Regex.Replace returns a new string (strings are immutable), so keep the result
string digits = Regex.Replace(input, rX, "");
// Substring throws if fewer than five digits remain
return digits.Substring(0, 5);

Note: I'm not sure about that regex (others here may have a better one), but basically, since a regex on its own only matches patterns and doesn't "replace" anything, you'd match any non-digit characters, replace each match with your language's version of the empty string (string.Empty or "" in C#), and then grab the first 5 characters of the resulting string.

Upvotes: 5
