Umar

Reputation: 113

Regex to capture only certain characters

I'm currently dealing with a bit of a problem. This is my string: "all-days". I need some help creating a regex that captures the first character, the dash, and the first character after the dash. I'm a bit of a newbie to regex, so forgive me.

Here is what I've got so far: (^.)

Upvotes: 0

Views: 75

Answers (4)

ctwheels

Reputation: 22817

Code


\b(\w|-\b)

For more precision, the following can be used (note that it uses Unicode property classes, so it doesn't work in every language, but it does in PHP). This will only match letters, not numbers and underscores. It uses a positive lookbehind and a positive lookahead, which are easier to follow if you break the pattern apart one piece at a time.

(\b\p{L}|(?<=\p{L})-(?=\p{L}))
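
The difference between the two patterns above can be checked with a short PHP sketch. The subject string 'all-2days' is a made-up input (not from the question), chosen so that a digit follows the hyphen; the `u` modifier is added on the assumption that the subject is UTF-8.

```php
<?php
// Hypothetical input: a digit follows the hyphen on purpose.
$subject = 'all-2days';

// \w accepts letters, digits and underscore, so "2" is matched.
preg_match_all('/\b(\w|-\b)/', $subject, $loose);

// \p{L} accepts letters only, so the hyphen (not followed by a letter)
// and the "2" are both skipped. The "u" modifier treats pattern and
// subject as UTF-8.
preg_match_all('/(\b\p{L}|(?<=\p{L})-(?=\p{L}))/u', $subject, $strict);

print_r($loose[0]);  // a, -, 2
print_r($strict[0]); // a

// On the original string both patterns agree.
preg_match_all('/\b(\w|-\b)/', 'all-days', $looseOrig);
preg_match_all('/(\b\p{L}|(?<=\p{L})-(?=\p{L}))/u', 'all-days', $strictOrig);
print_r($looseOrig[0]);  // a, -, d
print_r($strictOrig[0]); // a, -, d
```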

Explanation

  • \b Assert position at a word boundary
  • (\w|-\b) Capture the following into capture group 1
    • \w Match any word character
    • | Or
    • - Match the - character literally
    • \b Assert position at a word boundary

\b:

  • Asserts that the current position is one of the following (\b is zero-width, so it consumes no characters):
    • ^\w The start of the string, followed by a word character
    • \w$ The end of the string, preceded by a word character
    • \W\w Between a non-word character and a word character
    • \w\W Between a word character and a non-word character
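
One way to see where these boundaries fall is to split the string at every \b. This is only an illustrative sketch using the question's string; PREG_SPLIT_NO_EMPTY drops the empty pieces produced at the start and end of the string.

```php
<?php
// Split at every word boundary. \b consumes no characters,
// so every character of the subject survives in some piece.
$pieces = preg_split('/\b/', 'all-days', -1, PREG_SPLIT_NO_EMPTY);

print_r($pieces); // all, -, days
```

The boundaries sit before "a", between "l" and "-", between "-" and "d", and after "s", which is why \b followed by a single character picks out "a", "-" and "d".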

\w:

  • Means a word character (usually defined by any character in the set a-zA-Z0-9_, however, some languages also accept Unicode characters that represent any letter, number, or underscore \p{L}\p{N}_).
  • For more precision (depending on the use-case), you can specify [a-zA-Z] (for ASCII letters), \p{L} for Unicode letters, or [a-z] with the i flag for ASCII characters with the case-insensitive flag enabled in regex.
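
The character classes listed above can be compared side by side. The sample strings below are invented for illustration, and the "\u{e9}" escape assumes PHP 7 or later.

```php
<?php
// Hypothetical sample mixing letters, an underscore, a digit and a hyphen.
$subject = 'Ab_1-c';

preg_match_all('/\w/', $subject, $word);         // letters, digits, underscore
preg_match_all('/[a-zA-Z]/', $subject, $ascii);  // ASCII letters only
preg_match_all('/[a-z]/i', $subject, $ci);       // same, via the i flag

// \p{L} with the u modifier also matches non-ASCII letters.
// "\u{e9}" is the letter e with an acute accent.
preg_match_all('/\p{L}/u', "\u{e9}t\u{e9}1", $uni);

print_r($word[0]);  // A, b, _, 1, c
print_r($ascii[0]); // A, b, c
print_r($ci[0]);    // A, b, c
print_r($uni[0]);   // three letters, the digit is skipped
```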

Upvotes: 0

Casimir et Hippolyte

Reputation: 89547

I assume your string only contains word characters and hyphens, and doesn't have consecutive hyphens.

To remove everything except the first character, the hyphens, and the first character after each hyphen, remove every run of word characters that doesn't start at a word boundary:

$result = preg_replace('~\B\w+~', '', 'all-days');

If you only want to match these characters, just catch each character after a word boundary:

if ( preg_match_all('~\b.~', 'all-days', $matches) )
    print_r($matches[0]);
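
As a rough check, the two approaches can be run against the same input. The strings 'all-days-long' and 'a--b' are made-up inputs; the second one deliberately breaks the no-consecutive-hyphens assumption stated at the top of this answer.

```php
<?php
// Hypothetical input with two hyphens.
$subject = 'all-days-long';

// Approach 1: delete every run of word characters not at a word boundary.
$removed = preg_replace('~\B\w+~', '', $subject);

// Approach 2: collect each character that follows a word boundary.
preg_match_all('~\b.~', $subject, $matches);
$joined = implode('', $matches[0]);

echo $removed, "\n"; // a-d-l
echo $joined, "\n";  // a-d-l

// With consecutive hyphens the two approaches diverge, because there is
// no word boundary between the two hyphens.
$removed2 = preg_replace('~\B\w+~', '', 'a--b');
preg_match_all('~\b.~', 'a--b', $matches2);
$joined2 = implode('', $matches2[0]);

echo $removed2, "\n"; // a--b
echo $joined2, "\n";  // a-b
```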

Upvotes: 0

Niklesh Raut

Reputation: 34914

It's not regex, but if you just want a solution by another route, it can be achieved with explode, array_walk and implode:

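$string = 'all-days-with-my-style';
$arr = explode("-", $string);
// array_walk modifies $arr in place (note the reference) and returns a bool,
// so its return value is not needed.
array_walk($arr, function (&$a) {
    $a = $a[0]; // keep only the first character of each part
});
echo implode("-", $arr);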

Live demo : https://eval.in/882846

Output is : a-d-w-m-s
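
The same idea could also be wrapped in a small function with array_map. This is only a sketch: the function name initials and its $sep parameter are hypothetical, arrow functions assume PHP 7.4 or later, and $part[0] takes the first byte, so it assumes ASCII input.

```php
<?php
// Hypothetical helper: keep the first character of each separated part.
function initials(string $string, string $sep = '-'): string
{
    $parts = explode($sep, $string);

    // Guard against empty parts (e.g. consecutive separators),
    // where $part[0] would not exist.
    $first = array_map(
        fn(string $part): string => $part === '' ? '' : $part[0],
        $parts
    );

    return implode($sep, $first);
}

echo initials('all-days-with-my-style'), "\n"; // a-d-w-m-s
echo initials('all--days'), "\n";              // a--d
```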

Upvotes: 1

RomanPerekhrest

Reputation: 92854

capture the first character, the dash and also the first character after the dash

With the preg_match function:

$s = "all-days";
preg_match('/^(.)[^-]*(-)(.)/', $s, $m);
unset($m[0]);

print_r($m);

The output:

Array
(
    [1] => a
    [2] => -
    [3] => d
)
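
A possible variation uses named groups, so the three pieces can be read by name instead of by index. The group names first, dash and after are chosen here for illustration; the sketch also shows that preg_match returns 0 when the string has no hyphen.

```php
<?php
// Same pattern as above, with named capture groups (names are illustrative).
$pattern = '/^(?<first>.)[^-]*(?<dash>-)(?<after>.)/';

$found = preg_match($pattern, 'all-days', $m);
echo $m['first'], $m['dash'], $m['after'], "\n"; // a-d

// No hyphen in the subject: preg_match returns 0.
$missing = preg_match($pattern, 'alldays', $none);
var_dump($missing); // int(0)
```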

Upvotes: 1
