Reputation: 31
If I have a string like this:
Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg
how would I convert that into the following using regex:
newsflash-the-big-brown-dogs-brother-tj-ate-the-small-blue-egg
In other words, punctuation is discarded and spaces are replaced with hyphens.
Upvotes: 3
Views: 3741
Reputation: 10136
It sounds like you want to create a "URL slug": a URL-friendly version of an article's title, for example. That means you'll want to remove all non-URL-friendly characters, not just a few. You might do it this way (in order):
Remove all non-letter, non-number, non-space characters by replacing the regex [^A-Za-z0-9 ] with the empty string "".
Replace all spaces with a dash by replacing the regex \s+ with the string "-".
Lower-case the string by:
Java s = s.toLowerCase();
JavaScript s = s.toLowerCase();
C# s = s.ToLower();
Perl $s = lc($s);
Python s = s.lower()
PHP $s = strtolower($s);
Ruby s = s.downcase
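The three steps above can be sketched in JavaScript as follows. This is a minimal illustration, not a complete slug library; the function name slugify is an assumption introduced here, and the input is assumed to be ASCII text.

```javascript
// Hypothetical helper illustrating the three steps, applied in order.
function slugify(title) {
  return title
    .replace(/[^A-Za-z0-9 ]/g, "") // 1. drop everything except letters, digits, spaces
    .replace(/\s+/g, "-")          // 2. turn each run of spaces into one dash
    .toLowerCase();                // 3. lower-case the result
}

var s = "Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg";
console.log(slugify(s));
// newsflash-the-big-brown-dogs-brother-tj-ate-the-small-blue-egg
```

Note that leading or trailing spaces in the input would become leading or trailing dashes, so you may want to trim the string first.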
Upvotes: 5
Reputation: 4403
Replace /\W+/ with '-'; that will replace each run of non-word characters with a dash.
Then, collapse dashes by replacing /-+/ with '-'.
Then, lowercase the string - pure regex solutions cannot do that. You didn't say which language you are using, so I cannot give you an example, but your language might have String.toLowerCase() or a tr/// call (tr/A-Z/a-z/, for example, in Perl).
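As a rough JavaScript sketch of this approach (the function name slugifyNonWord is made up for illustration): because punctuation is replaced with a dash rather than removed, an apostrophe or a period inside a word also becomes a dash, so the result differs from the output requested in the question.

```javascript
// Hypothetical helper: replace runs of non-word characters with a dash.
function slugifyNonWord(title) {
  return title
    .replace(/\W+/g, "-") // each run of non-word characters becomes one dash
    .replace(/-+/g, "-")  // collapse any repeated dashes
    .toLowerCase();       // lower-casing is done outside the regex
}

var s = "Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg";
console.log(slugifyNonWord(s));
// newsflash-the-big-brown-dog-s-brother-t-j-ate-the-small-blue-egg
```

If "dogs" and "tj" are wanted instead of "dog-s" and "t-j", the punctuation has to be removed before the spaces are converted.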
Upvotes: 0
Reputation: 887817
Replace the regex [\s-]+ with "-", then replace [^\w-] with "".
Then, call toLowerCase() or your language's equivalent.
In JavaScript:
var s = "Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg";
alert(s.replace(/[\s-]+/g, '-').replace(/[^\w-]/g, '').toLowerCase());
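One edge case worth noting: when punctuation stands alone between spaces (as in "Rock & Roll"), converting spaces first and stripping punctuation second leaves two adjacent dashes. A possible variant, sketched here under the assumption that you also want to collapse those and trim dashes at the ends (the name slugifyCollapsed is hypothetical):

```javascript
// Hypothetical variant of the two replacements described above,
// with two extra clean-up passes.
function slugifyCollapsed(title) {
  return title
    .replace(/[\s-]+/g, "-") // runs of whitespace or dashes become one dash
    .replace(/[^\w-]/g, "")  // remove everything that is not a word character or dash
    .replace(/-+/g, "-")     // collapse dashes left behind by removed punctuation
    .replace(/^-|-$/g, "")   // trim a leading or trailing dash
    .toLowerCase();
}

console.log(slugifyCollapsed("Rock & Roll"));       // rock-roll
console.log(slugifyCollapsed("  Hello, World!  ")); // hello-world
```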
Upvotes: 1