strudelkopf

Reputation: 681

PHP regex replacement doesn't match

I'm using this regex to get the house number from a street address.

[a-zA-ZßäöüÄÖÜ .]*(?=[0-9])

Usually the street looks something like "Ohmstraße 2a". My pattern matches at regexpal.com, but I guess preg_replace() doesn't use the same regex engine.

$num = preg_replace("/[a-zA-ZßäöüÄÖÜ .]*(?=[0-9])/", "", $num);

Update: It seems that my pattern matches, but I've got some encoding problems with special characters like ä, ö and ü.

Update #2: It turned out to be an encoding problem with mysqli.

Upvotes: 0

Views: 106

Answers (3)

TiMESPLiNTER

Reputation: 5889

First of all, if you want to get the house number, you should match it directly rather than replace everything around it. So use preg_match() instead of preg_replace().

I modified your regex a little bit to match better:

$street = 'Öhmsträße 2a';

if (preg_match('/\s+(\d+[a-z]?)$/i', trim($street), $matches) === 1) {
    var_dump($matches);
} else {
    echo 'no house number';
}
  1. \s+ matches one or more whitespace characters (blanks, tabs, etc.)
  2. (...) defines a capture group which can be accessed in $matches
  3. \d+ matches one or more digits (2, 23, 235, ...)
  4. [a-z] matches one character from a to z (upper or lower case, because of the i modifier)
  5. ? means it's optional (not every house number has a letter in it)
  6. $ means end of string, so it makes sure the house number is at the end of the string

Make sure you strip any spaces after the end of the house number with trim().
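
As a rough sketch, the pattern above could be wrapped in a small helper for reuse. The function name extractHouseNumber is made up for illustration, and the sketch assumes the house number is the last thing in the string:

```php
<?php
// Hypothetical helper: returns the trailing house number, or null if none is found.
function extractHouseNumber(string $street): ?string
{
    // trim() removes surrounding whitespace so that $ anchors right after the number.
    if (preg_match('/\s+(\d+[a-z]?)$/i', trim($street), $matches) === 1) {
        return $matches[1]; // first capture group: digits plus an optional letter
    }

    return null; // no match, or preg_match() reported an error
}

var_dump(extractHouseNumber('Öhmsträße 2a'));
var_dump(extractHouseNumber('Hauptstraße'));
```

Comparing against 1 rather than checking for "not 0" keeps a regex error (where preg_match() returns false) from being treated as a match.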

Upvotes: 3

vogomatix

Reputation: 5041

I feel this may be a character set or UTF-8 issue.

It would be a good idea to find out which version of PHP you're running, too. If I recall correctly, Unicode character properties (\p{...}) in PCRE patterns became available around 5.1.x.
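
One way to check for such a character set problem, sketched under the assumption that the data is supposed to be UTF-8: preg_match() with the u modifier returns false when the subject is not well-formed UTF-8. The helper name isValidUtf8 and the sample strings are made up for illustration:

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// With the u modifier, preg_match() returns false (not 0) on invalid UTF-8 input.
function isValidUtf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$utf8   = "Ohmstra\xC3\x9Fe 2a"; // "ß" encoded as UTF-8 (two bytes)
$latin1 = "Ohmstra\xDFe 2a";     // "ß" encoded as ISO-8859-1 (one byte)

var_dump(isValidUtf8($utf8));
var_dump(isValidUtf8($latin1));
```

If a value read from the database fails this check, the connection character set is one place to look; with mysqli it can be set through mysqli::set_charset().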

Upvotes: 0

Lajos Veres

Reputation: 13725

The u modifier can sometimes help with "extra" characters: it makes the preg_* functions treat both the pattern and the subject as UTF-8.
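
A minimal sketch of the pattern from the question with the u modifier added. It assumes that both this source file and the input string are UTF-8 encoded; if the input arrives in another encoding, the modifier alone will not fix it:

```php
<?php
// Assumes this source file and the input string are both UTF-8 encoded.
$street = 'Öhmsträße 2a';

// Same character class as in the question, with the u modifier added so that
// "ß", "ä", etc. are treated as single characters rather than as separate bytes.
$num = preg_replace('/[a-zA-ZßäöüÄÖÜ .]*(?=[0-9])/u', '', $street);

var_dump($num);
```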

Upvotes: 1
