EvilChookie

Reputation: 573

preg_replace - leaving in unwanted characters

I've got a string:

$string = "Hello World!";

I want to turn it into a URL friendly tag, and I've developed a function to do it:

function stripJunk($string){
    $string = str_replace(" ", "-", $string);
    $string = preg_replace("/[^a-zA-Z]\s/", "", $string);
    $string = strtolower($string);
    return $string;
}

However, when I run the $string above through it, I get the following:

$string = "hello-world!";

It seems that there are characters slipping through my preg_replace, even though from what I understand, they shouldn't be.

It should read like this:

$string = "hello-world";

What's going on here? (This should be easy peasy lemon squeezy!)

Edit 1: I wasn't aware that regular expressions were beginner's stuff, but whatever. Additionally, removing the \s from my pattern does not produce the desired result.

The desired result is:

  1. All spaces are converted to dashes.
  2. All remaining characters that are not A-Z or 0-9 are removed.
  3. The string is then converted to lower case.

Edit 2+: Cleaned up my code just a little.

Upvotes: 0

Views: 7083

Answers (4)

Tom Haigh

Reputation: 57815

The \s at the end of your pattern means that you will only replace non-alphabetical characters that are immediately followed by a whitespace character. In "Hello World!" the "!" is the last character, so nothing follows it and it is left in place. You probably want the \s within the square brackets so that whitespace is also preserved and can later be replaced with a dash.

You will need to add 0-9 inside the square brackets if you want to also allow numbers.

For example:

<?php

$string = "Hello World!";

function stripJunk($string){
    $string = preg_replace("/[^a-zA-Z0-9\s]/", "", $string);
    $string = str_replace(" ", "-", $string);
    $string = strtolower($string);
    return $string;
}

echo stripJunk($string);
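
To see the difference concretely, here is a small sketch that runs the original pattern and the corrected one on the same input. The variable names ($input, $dashed, $original, $cleaned, $fixed) are only for illustration and are not part of the question's code.

```php
<?php

$input = "Hello World!";

// Original order and pattern: spaces become dashes first, then the pattern
// looks for a non-letter that is followed by a whitespace character.
$dashed   = str_replace(" ", "-", $input);                // "Hello-World!"
$original = preg_replace("/[^a-zA-Z]\s/", "", $dashed);
// No whitespace is left in $dashed, so nothing matches and nothing is removed.

// Corrected pattern: \s sits inside the character class, and the cleanup
// runs before the spaces are turned into dashes.
$cleaned = preg_replace("/[^a-zA-Z0-9\s]/", "", $input);  // "Hello World"
$fixed   = strtolower(str_replace(" ", "-", $cleaned));

echo strtolower($original), "\n"; // hello-world!
echo $fixed, "\n";                // hello-world
```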

Upvotes: 3

Thijs

Reputation: 618

You could use some regular expressions in a row to remove the junk:

<?php

function strip_junk ($string) {

  // first, trim surrounding whitespace, lowercase, and replace every character that is not a-z, 0-9 or a dash with a dash
  $string = preg_replace("/[^a-z0-9-]/u", "-", strtolower(trim($string)));

  // second, collapse runs of dashes into a single dash
  $string = preg_replace("/-+/u", "-", $string);

  // finally, remove leading and trailing dashes
  $string = preg_replace("/^-*|-*$/u", "", $string);

  return $string;

}

?>

This should do the trick, happy PHP'ing!
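
As a rough usage sketch, the same three steps can be tried on a couple of inputs. The function is repeated here under the hypothetical name strip_junk_demo so the snippet runs on its own, and the second input string is made up for illustration.

```php
<?php

// Same three steps as strip_junk() above, repeated so this snippet is self-contained.
function strip_junk_demo($string) {
    // lowercase, trim, and turn everything outside a-z, 0-9 and "-" into a dash
    $string = preg_replace("/[^a-z0-9-]/u", "-", strtolower(trim($string)));
    // collapse runs of dashes
    $string = preg_replace("/-+/u", "-", $string);
    // drop leading and trailing dashes
    return preg_replace("/^-*|-*$/u", "", $string);
}

echo strip_junk_demo("Hello World!"), "\n";             // hello-world
echo strip_junk_demo("  PHP 5.3: what's new?  "), "\n"; // php-5-3-what-s-new
```

Note that this approach turns punctuation into dashes rather than deleting it, so "what's" becomes "what-s". That differs from the behaviour described in the question, where unwanted characters are removed.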

Upvotes: 1

SilentGhost

Reputation: 319601

The following works just fine for me:

function stripJunk($string){
    $string = str_replace(" ", "-", trim($string));
    $string = preg_replace("/[^a-zA-Z0-9-]/", "", $string);
    $string = strtolower($string);
    return $string;
}
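
A quick check of that function on two inputs, as a sketch (the second input is made up for illustration). Because spaces are converted one for one before the cleanup, consecutive spaces end up as consecutive dashes.

```php
<?php

// Same logic as the function above, repeated so this snippet runs on its own.
function stripJunk($string){
    $string = str_replace(" ", "-", trim($string));
    $string = preg_replace("/[^a-zA-Z0-9-]/", "", $string);
    return strtolower($string);
}

echo stripJunk("Hello World!"), "\n";     // hello-world
echo stripJunk("Hello,  World 2!"), "\n"; // hello--world-2
```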

Upvotes: 5

user142019

Reputation:

What about this?

preg_replace("/[.\n\r][^a-zA-Z]/", "", $string);

if that does not work:

preg_replace("/[.\n\r^a-zA-Z]/", "", $string);

Does that work?
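
For anyone wanting to test them, a minimal sketch that runs both patterns on the question's input. The variable names $first and $second are only for illustration.

```php
<?php

$string = "Hello World!";

// A '.', line feed or carriage return followed by one non-letter.
// Nothing in the input fits that, so the string comes back unchanged.
$first = preg_replace("/[.\n\r][^a-zA-Z]/", "", $string);

// The ^ is not the first character in the class, so it is a literal caret
// and the class matches letters: every letter is removed.
$second = preg_replace("/[.\n\r^a-zA-Z]/", "", $string);

var_dump($first);  // string(12) "Hello World!"
var_dump($second); // string(2) " !"
```

Neither pattern produces "hello-world" for this input. A caret only negates a character class when it is the first character after the opening bracket.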

Upvotes: 0
