Borderless.Nomad

Reputation: 761

PHP Regex for checking space or certain characters after string

I need a regex that checks for a space, line break, etc. after a string.

The conditions are:

  1. Allow the special characters ., _, -, + inside the string, e.g. @hello.world, @hello_world, @helloworld, etc.
  2. Discard any trailing characters, including the special characters, that are not followed by an alphanumeric string, e.g. @helloworld.<space>, @helloworld-<space>, @helloworld.?, etc. must all be parsed as @helloworld

My existing regex is /@([A-Za-z0-9+_.-]+)/, which works perfectly for Condition #1, but there still seems to be a problem with Condition #2.

I am using the above regex in preg_replace().
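To make the Condition #2 problem concrete, here is a minimal sketch that runs the existing pattern over a sample string. The sample text and variable names are assumptions made up for illustration; only the pattern comes from the question.

```php
<?php
// Pattern from the question: the character class also accepts ., _, - and +
// at the very end of the match, so trailing punctuation is captured.
$pattern = '/@([A-Za-z0-9+_.-]+)/';

// Hypothetical sample input, not from the original post.
$text = 'Hi @helloworld. Bye @hello_world- now';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches: '@helloworld.' and '@hello_world-'
print_r($matches[0]);
```

Both matches keep the trailing . or -, which is exactly what Condition #2 is meant to rule out.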

Solution:

$str = preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $str);

This works perfectly.

Tested with

http://gskinner.com/RegExr/

Upvotes: 2

Views: 434

Answers (3)

Jon

Reputation: 437434

Here's an idea:

  1. Use strrev to reverse the string
  2. Use strcspn to find the longest prefix of the reversed string that does not contain any alphanumeric characters
  3. Cut the prefix off with substr
  4. Reverse the string again; this is your final result
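The four steps above can be sketched roughly as follows. The function name trimTrailingNonAlnum and the explicit ASCII alphanumeric list are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: strips everything after the last ASCII alphanumeric character.
function trimTrailingNonAlnum(string $s): string
{
    // Characters that count as "alphanumeric" here (ASCII only, an assumption).
    $alnum = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

    // Step 1: reverse the string.
    $reversed = strrev($s);

    // Step 2: length of the leading run that contains no alphanumeric character.
    $skip = strcspn($reversed, $alnum);

    // Steps 3 and 4: cut that prefix off, then reverse back.
    return strrev(substr($reversed, $skip));
}

echo trimTrailingNonAlnum('@helloworld.?'), PHP_EOL; // @helloworld
echo trimTrailingNonAlnum('@hello.world'), PHP_EOL;  // @hello.world
```

Note that this only trims the end of a single string; it does not locate @mentions inside longer text.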

See it in action.

I'm not taking into account any requirement that restricts the legal characters in the string to some subset, but you can use your regular expression for that (or even strspn, which might be faster).

Upvotes: 1

Kobi

Reputation: 138037

You can use word boundaries to easily find the position between an alphanumeric character and a non-alphanumeric character:

$str = preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $str);

Working example: http://ideone.com/0ShCm
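As a quick check of how the word boundary behaves, here is a small sketch that applies the same pattern to a made-up sentence (the input text and variable names are assumptions for illustration).

```php
<?php
// Hypothetical sample input covering the cases from the question.
$str = 'Ping @helloworld. and @hello.world, or @foo-?';

// The greedy class first swallows trailing punctuation, then backtracks
// until \b holds, i.e. until the match ends on a word character.
$out = preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $str);

echo $out, PHP_EOL; // Ping [[@helloworld]]. and [[@hello.world]], or [[@foo]]-?
```

One caveat: \w also matches the underscore, so a trailing _ (as in @hello_) would still be kept by this pattern.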

Upvotes: 1

Brodie

Reputation: 8747

The reason is that the regex reads the string as a whole. If you want to parse out everything after the alphanumeric section, you might have to do something like end(explode()) and check whether that last piece is valid, removing it if it isn't. But then you'd have to check the end for every possible explode point, i.e. ., -, ~, etc.

Then again, another trap you might run into is that, in the case of an item or anything with an alphanumeric value, it might just parse everything after the last alphanumeric character.

Sorry that this isn't much help, but I figured thinking aloud does help.

Upvotes: 0
